QuantumBlack: A two-layer architecture for agentic software development

QuantumBlack, McKinsey & Company’s AI division, published this article in February 2026 to document what worked, and what did not, when its teams tried to scale AI agents across software development workflows. The central finding is that agents left to self-orchestrate fail predictably at scale: they skip phases, create circular dependencies, and tend to analyze situations rather than act on them.

The two-layer model

The solution the team describes separates two distinct concerns. The first layer is a deterministic orchestration engine — a rule-based system, not an AI — that enforces the sequence of development phases (requirements → architecture → implementation), manages task dependencies, and tracks artifact state through frontmatter. Agents execute assigned tasks in this model; they do not decide what comes next or where artifacts should live.
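
To make the idea of a rule-based orchestrator concrete, here is a minimal Python sketch of how phase ordering and task dependencies might be enforced without any model in the loop. Only the phase sequence comes from the article; the Task class, the plan function, and the task names are hypothetical illustrations, not the system QuantumBlack describes.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter, CycleError

# Phase order as given in the article; every other name here is hypothetical.
PHASES = ["requirements", "architecture", "implementation"]


@dataclass
class Task:
    name: str
    phase: str
    depends_on: list[str] = field(default_factory=list)


def plan(tasks: list[Task]) -> list[str]:
    """Return task names in an order that respects phases and dependencies.

    Raises ValueError for unknown or backward-pointing dependencies and
    graphlib.CycleError when the dependencies are circular.
    """
    by_name = {t.name: t for t in tasks}
    graph: dict[str, set[str]] = {}
    for t in tasks:
        rank = PHASES.index(t.phase)
        for dep in t.depends_on:
            if dep not in by_name:
                raise ValueError(f"{t.name} depends on unknown task {dep}")
            if PHASES.index(by_name[dep].phase) > rank:
                raise ValueError(f"{t.name} depends on later-phase task {dep}")
        # Besides its explicit dependencies, a task waits for every task
        # that belongs to an earlier phase.
        earlier = {o.name for o in tasks if PHASES.index(o.phase) < rank}
        graph[t.name] = set(t.depends_on) | earlier
    # static_order() raises CycleError if the graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())
```

Because the ordering is computed from declared data rather than decided by an agent, skipped phases and circular dependencies surface as errors before any agent runs.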

The second layer is bounded agent execution. Rather than a single general-purpose agent, the system uses specialized agents for specific roles: requirements analysis, architecture design, coding, and knowledge queries. Each agent operates within explicit guidelines encoded as reusable “skills” — modular instruction sets that function as microservices for AI work. The filesystem itself becomes a machine-readable specification: folder structure and naming conventions define relationships without relying on agent interpretation.
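
As an illustration of a filesystem acting as a specification, the sketch below derives an artifact's feature and phase purely from its path, and reads artifact state from a simple frontmatter header. The layout features/<feature>/<phase>/<artifact>.md, the frontmatter keys, and both function names are assumptions made for this example; the article does not specify a concrete convention.

```python
from pathlib import PurePosixPath

# Hypothetical phase folder names, mirroring the phases the article lists.
PHASE_DIRS = ("requirements", "architecture", "implementation")


def describe(path: str) -> dict[str, str]:
    """Derive feature, phase, and artifact name from an artifact path."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4 or parts[0] != "features" or parts[2] not in PHASE_DIRS:
        raise ValueError(f"path does not follow the convention: {path}")
    _, feature, phase, filename = parts
    return {
        "feature": feature,
        "phase": phase,
        "artifact": PurePosixPath(filename).stem,
    }


def parse_frontmatter(text: str) -> dict[str, str]:
    """Read flat 'key: value' pairs between two '---' delimiter lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the file as having no frontmatter
```

For example, describe("features/checkout/architecture/api-design.md") identifies the artifact as the architecture-phase document "api-design" for the "checkout" feature without any agent having to interpret the file's contents.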

Evaluation and human review

Output validation occurs in two stages. First, deterministic checks (linting, structural validation) run automatically. Then a separate critic agent validates judgment calls. Agents iterate up to five times before the workflow escalates to human review.
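
The two-stage loop can be sketched roughly as follows. The five-iteration limit comes from the article; the function name, the callable signatures, and the choice to count a failed deterministic check as one iteration are assumptions made for illustration.

```python
from typing import Callable, Optional

MAX_ITERATIONS = 5  # limit stated in the article


def run_task(
    generate: Callable[[list[str]], str],
    checks: list[Callable[[str], list[str]]],
    critic: Callable[[str], list[str]],
) -> tuple[str, Optional[str]]:
    """Generate output, validate it in two stages, and retry with feedback.

    generate, checks, and critic are hypothetical stand-ins for the worker
    agent, the deterministic validators, and the critic agent. Each validator
    returns a list of issues; an empty list means the output passed.
    """
    feedback: list[str] = []
    for _ in range(MAX_ITERATIONS):
        output = generate(feedback)
        # Stage 1: deterministic checks such as linting or structure rules.
        feedback = [issue for check in checks for issue in check(output)]
        if feedback:
            continue  # skip the critic until the mechanical checks pass
        # Stage 2: the critic reviews judgment calls.
        feedback = critic(output)
        if not feedback:
            return ("accepted", output)
    return ("escalate_to_human", None)
```

Running the deterministic stage first means the critic is only invoked on output that already passes the mechanical checks.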

The practical result is that a full feature cycle (requirements, architecture, implementation, and tests) can be generated on a single branch and reviewed by a human as a complete pull request. Teams that adopt this approach can run multiple feature experiments per day. The authors position the approach as a viable form of waterfall done at agent speed.

What it requires organizationally

The deeper point the article makes is organizational: this pattern requires that decisions be documented rather than implicit in individual expertise or Slack conversations. Institutional knowledge that lives in people’s heads cannot be encoded as agent skills. For product managers working with engineering teams building AI agent systems, the framework offers a concrete vocabulary for discussing where human judgment is necessary and where it is not — a distinction that matters more as autonomous execution increases.