Medium: What Punch taught me about AI design workflows

Pranit Jena built a simple 2D maze game called Punch not to ship it, but to use it as a controlled test environment for AI tools. Writing in February 2026, he documents how he evaluated Lovable, Figma Make, Perplexity, Cursor, Claude, and ChatGPT against the same brief and the same project constraints, then drew conclusions about where each tool belongs in a design workflow.

The central finding is a division between what the author calls builders and thinkers. Cursor and Claude are builders: tools oriented toward generating and modifying design artifacts or code. ChatGPT and Gemini function better as thinkers: tools for analyzing requirements, questioning assumptions, and making strategic decisions before production begins. Using a builder where you need a thinker produces confident but misdirected output. Using a thinker where you need a builder produces advice without anything made.

To keep the comparison honest, Jena tested each tool using only the free tier and a single opening prompt per session. This simulates how most practitioners actually start working with an AI tool, without the tailored prompting techniques that experienced users develop over time. The constraint surfaces something real: tools that handle ambiguity well in free-tier single-prompt conditions are tools whose underlying models understand context, not only syntax.

A more specific finding concerns context windows and session drift. When a tool generates output that contradicts a constraint the user set ten prompts earlier, Jena frames that not as a creativity failure but as a memory failure. The tool lost track of the project’s requirements. His practical response is to restate the core constraints at intervals during a long session, particularly when the conversation changes direction.
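The restating habit can also be automated when prompts are sent through a script rather than typed by hand. The following is a minimal sketch under stated assumptions, not something taken from Jena's article: the `Session` class, the `restate_every` interval, and the sample constraints are all hypothetical, and no real tool's API is called. It only shows the idea of prepending the project constraints on the first turn, at a fixed interval, and whenever the conversation changes direction.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical helper that re-attaches project constraints to prompts."""

    constraints: list[str]
    restate_every: int = 5  # assumed interval; the article does not give a number
    turn: int = 0

    def build_prompt(self, message: str, direction_changed: bool = False) -> str:
        """Return the message, prefixed with the constraints when a restatement is due."""
        self.turn += 1
        # Restate on the first turn and on every Nth turn after that.
        due = self.turn == 1 or self.turn % self.restate_every == 0
        if due or direction_changed:
            reminder = "\n".join(f"- {c}" for c in self.constraints)
            return f"Project constraints (still in force):\n{reminder}\n\n{message}"
        return message


# Example constraints; the second one is invented for illustration.
session = Session(
    constraints=["2D maze game", "keyboard controls only"],
    restate_every=3,
)
first = session.build_prompt("Draft the start screen.")
second = session.build_prompt("Make the walls darker.")
third = session.build_prompt("Add a timer.")
pivot = session.build_prompt("Switch to level design.", direction_changed=True)
```

In this sketch the second prompt goes out unchanged, while the first, third, and direction-changing prompts carry the constraint list. The interval is a judgment call: a shorter one costs more tokens per prompt, a longer one leaves more room for drift.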

The article also raises a concern the author calls cognitive debt. When AI handles structural decisions in a design process, the designer stops exercising judgment about those structures. Jena references neuroscience research suggesting that reliance on AI for reasoning tasks reduces coordinated cognitive effort in related domains. The practical takeaway for designers is not to avoid AI, but to remain actively engaged with the decisions the tool is making, checking outputs the way a collaborator reviews work rather than accepting them as authoritative results.

A final note on tool combinations: Jena recommends pairing a builder subscription with a thinker subscription rather than choosing one tool to cover everything. Claude Code paired with Cursor covers the implementation side; a reasoning-capable model covers the planning side. The division of labor reduces the risk of generating polished work in the wrong direction.

This article is useful for individual designers who want a structured way to select between competing AI tools rather than defaulting to whichever they encountered first.