Smashing Magazine: designing with uncertainty and probabilistic AI systems

Pratik Joglekar’s article in Smashing Magazine examines a structural tension in modern product design: AI systems are probabilistic at their core, but most interfaces present their outputs as definitive answers. This mismatch creates real risk, and understanding it is now part of what it means to design responsibly with AI.

The core problem

Joglekar opens with the Air Canada chatbot case, where a customer was told by an AI assistant that a refund policy existed—a policy that did not in fact exist. The company was held liable, not the AI vendor. This kind of failure does not come from a broken system; it comes from a probabilistic system wrapped in a deterministic interface. The chatbot spoke with certainty where none existed, and the gap between statistical likelihood and actual certainty is where things go wrong.

The broader pattern recurs across product categories. Diagnostic tools, hiring screens, content moderation systems, customer service bots—all share the same structural issue when they present model confidence as factual certainty.

Confidence is not certainty

One of the article’s sharpest points is that high-confidence AI outputs deserve more scrutiny, not less. A model may report 95% confidence in an answer, but that score describes its behavior across a distribution of similar inputs, not the probability that this specific answer, in this specific context, is correct. Designers must account for this by examining the training data behind predictions and by acknowledging that performance on past data does not guarantee performance on future edge cases.

This means evaluating what an AI does across a wide range of inputs, including low-frequency situations that may matter most to users in critical circumstances. Joglekar frames this as a responsibility for the design process, not just the engineering team.
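One way to make this concrete is to compare stated confidence with observed accuracy, segment by segment. The sketch below is illustrative only and is not taken from the article: the function name accuracy_by_segment, the segment labels, and the evaluation records are all hypothetical, and a real evaluation would need far more data per segment.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """Compare mean stated confidence with observed accuracy per segment.

    Each record is a (segment, confidence, was_correct) tuple.
    """
    totals = defaultdict(lambda: {"n": 0, "conf": 0.0, "correct": 0})
    for segment, confidence, was_correct in records:
        bucket = totals[segment]
        bucket["n"] += 1
        bucket["conf"] += confidence
        bucket["correct"] += int(was_correct)
    return {
        segment: {
            "n": b["n"],
            "mean_confidence": b["conf"] / b["n"],
            "accuracy": b["correct"] / b["n"],
        }
        for segment, b in totals.items()
    }


# Hypothetical evaluation log: every answer carries the same 0.95 score,
# but the rare segment is right only half the time.
records = [
    ("common", 0.95, True),
    ("common", 0.95, True),
    ("common", 0.95, True),
    ("common", 0.95, True),
    ("rare", 0.95, False),
    ("rare", 0.95, True),
]

report = accuracy_by_segment(records)
for segment, stats in report.items():
    gap = stats["mean_confidence"] - stats["accuracy"]
    print(f"{segment}: n={stats['n']} accuracy={stats['accuracy']:.2f} gap={gap:+.2f}")
```

In this made-up log the score is identical everywhere, yet the gap between confidence and accuracy only becomes visible once the rare cases are looked at on their own. That is the kind of pattern an aggregate confidence number can hide.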

Experimentation as learning, not validation

Joglekar reframes product experiments in an AI context. Rather than testing features to confirm they work, teams should test to reduce uncertainty—treating each experiment as a way to understand what the model actually does under varying conditions. This shifts the goal from “does it work?” to “what does it do, and when?” An iterative feedback loop that continuously updates assumptions is more valuable than one-off validation cycles, particularly as the model’s training data evolves.

Human oversight as a design feature

The article ends with its most practical point: keeping humans in the review loop is not merely a safety measure; it is a product design decision. Override capability, transparent explanations, and clear accountability for AI decisions are interface features. Designers who build review mechanisms into AI-assisted workflows are also generating feedback data that can improve model behavior over time.
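A minimal sketch of how such a review mechanism might look in code follows. Nothing here comes from the article: the Decision and ReviewQueue names, the 0.9 threshold, and the high_stakes flag are assumptions made for illustration. The routing rule sends every high-stakes decision to a person regardless of model confidence, in keeping with the point that a high score does not remove the need for scrutiny.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    output: str
    confidence: float
    high_stakes: bool  # hypothetical flag set by the product team


@dataclass
class ReviewQueue:
    threshold: float = 0.9  # assumed cutoff, to be tuned per product
    overrides: list = field(default_factory=list)

    def needs_review(self, decision: Decision) -> bool:
        # High-stakes decisions always go to a human, however confident the model is.
        return decision.high_stakes or decision.confidence < self.threshold

    def record_override(self, decision: Decision, corrected_output: str, reviewer: str) -> dict:
        # Keep who changed what, so accountability is explicit and the
        # corrections can later be used as feedback data.
        entry = {
            "model_output": decision.output,
            "model_confidence": decision.confidence,
            "corrected_output": corrected_output,
            "reviewer": reviewer,
        }
        self.overrides.append(entry)
        return entry


queue = ReviewQueue()
refund = Decision("Refund available retroactively", 0.97, high_stakes=True)
greeting = Decision("Hello, how can I help?", 0.97, high_stakes=False)

print(queue.needs_review(refund))    # True: high stakes overrides the high score
print(queue.needs_review(greeting))  # False: low stakes and above the threshold
queue.record_override(refund, "No retroactive refund under current policy", "agent_17")
```

The override log is the part that doubles as feedback data: each entry pairs what the model said with what a person decided instead.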

Who this is for

Product designers and UX practitioners working on AI-integrated digital products, especially those responsible for interfaces where model outputs drive decisions that affect users—in healthcare, finance, customer service, or any domain where a confidently wrong answer carries real consequences.