Medium: The product manager stack in 2026 — agents, infrastructure, and evaluation

Keren Koshman’s March 2026 article in Medium’s Bootcamp publication addresses a question product managers increasingly face as AI tooling expands: not which individual tools to try, but how to build a coherent stack that works together. The piece is structured around five agent roles and three infrastructure layers, with a central argument that evaluation — not tooling selection — is where most teams fall short.

What it covers

The article proposes five agent roles that PMs should either have operational or understand well enough to specify for engineering:

A research and discovery agent that monitors competitor activity, social signals, and pricing changes on an ongoing basis without requiring manual tracking
An analytics and insights agent connected to data warehouses that answers business questions in natural language, reducing the dependency on BI teams for ad-hoc queries
A spec and documentation agent that generates PRD drafts from minimal input, replacing blank-page writing time with editing time
A rapid prototyping agent using tools like Cursor or Claude Code to build proof-of-concepts quickly enough that prototyping becomes part of discovery rather than a separate phase
An internal communication agent that drafts status updates, meeting summaries, and stakeholder communications

The infrastructure argument

Koshman places these agents within a three-layer infrastructure framework: a context layer (RAG pipelines and data accessibility), an orchestration layer (managing agent flows, retries, and guardrails), and an evaluation layer. The evaluation layer receives the most emphasis. “Every agent is only as good as its evaluation,” she writes, and without systematic measurement, AI feature quality degrades over time. She argues evaluation should be integrated into CI/CD pipelines the same way automated tests are — not treated as a separate operational concern.
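As a rough illustration of what "evaluation in CI/CD" could look like in practice, the sketch below scores an agent against a small fixed set of cases and compares the result to a stored baseline, much as a test suite would. It is not taken from the article: the names EvalCase, run_eval, passes_gate, and stub_agent, the keyword-coverage scoring rule, and the 0.02 tolerance are all assumptions chosen for the example, and a real setup would substitute its own agent call and quality metric.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One evaluation example (hypothetical schema for this sketch)."""
    prompt: str
    required_terms: tuple[str, ...]  # terms a good answer is assumed to mention


def score_case(agent: Callable[[str], str], case: EvalCase) -> float:
    """Return the fraction of required terms present in the agent's output."""
    output = agent(case.prompt).lower()
    if not case.required_terms:
        return 1.0
    hits = sum(term.lower() in output for term in case.required_terms)
    return hits / len(case.required_terms)


def run_eval(agent: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Return the mean score across all cases."""
    if not cases:
        raise ValueError("evaluation set is empty")
    return sum(score_case(agent, c) for c in cases) / len(cases)


def passes_gate(score: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Pass only if the score has not dropped more than `tolerance` below baseline."""
    return score >= baseline - tolerance


def stub_agent(prompt: str) -> str:
    """Stand-in for a real agent call; it simply echoes the prompt."""
    return f"Here is a response to: {prompt}"


# Illustrative cases only; a real set would come from reviewed examples.
CASES = [
    EvalCase("Summarize competitor pricing changes", ("competitor", "pricing")),
    EvalCase("Draft a weekly status update", ("status", "risks")),
]

score = run_eval(stub_agent, CASES)
print(f"eval score: {score:.2f}, passes gate: {passes_gate(score, baseline=0.80)}")
```

In a pipeline, a job along these lines would typically fail the build when passes_gate returns False, so that a regression in agent quality blocks a merge in the same way a failing unit test does.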

A secondary point is about repository structure: unified code repositories across teams create shared context that makes coding agents substantially more useful, while fragmented repositories force agents to work with partial information.

Who it is useful for

This article is most relevant for product managers who are moving past individual tool experiments and thinking about how to build systems that hold up over time. It applies particularly to PMs working on AI features themselves, where infrastructure choices for internal tooling and for the product often overlap. The evaluation layer is the most actionable part: systematic measurement gives teams a concrete way to tell whether their AI tools are actually improving or quietly degrading.