The phrase 'AI agents' gets used to describe anything from a chatbot with a system prompt to a fully autonomous workflow that books your flights. Most of what gets called an agent is just a script with extra steps. A real orchestrator + sub-agent architecture is a specific shape, and once you see it, you stop confusing it with everything else.
The shape, in one diagram-free paragraph
A single language model handles the planning. We call that the orchestrator. The orchestrator does not do the work itself. It decides which sub-agents to dispatch, what context to give each one, and how to combine their outputs. Each sub-agent has a narrow tool registry, an isolated context window, and a single job. The orchestrator collects, verifies, and synthesizes. That is the whole pattern.
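The shape above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `SubAgent`, `Orchestrator`, `plan`, and `handle` are hypothetical names, and the model calls are replaced by plain Python functions so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type for illustration: a tool takes text and returns text.
Tool = Callable[[str], str]


@dataclass
class SubAgent:
    """One job, one narrow tool registry, one isolated context."""
    name: str
    tools: dict[str, Tool]

    def run(self, context: str) -> str:
        # Stand-in for a model call: the agent sees only `context`,
        # never the orchestrator's state or another agent's output.
        tool = next(iter(self.tools.values()))
        return tool(context)


@dataclass
class Orchestrator:
    agents: dict[str, SubAgent]

    def plan(self, task: dict[str, str]) -> list[tuple[str, str]]:
        # Stand-in for the planning model: pair each sub-agent with
        # only the slice of the task it needs.
        return [(name, task[name]) for name in self.agents if name in task]

    def handle(self, task: dict[str, str]) -> str:
        # Dispatch, collect, then synthesize. The orchestrator itself
        # never calls a tool.
        results = {}
        for name, context in self.plan(task):
            results[name] = self.agents[name].run(context)
        return "; ".join(f"{name}: {out}" for name, out in results.items())


orchestrator = Orchestrator(agents={
    "count_lines": SubAgent("count_lines", {"wc": lambda text: str(len(text.splitlines()))}),
    "shout": SubAgent("shout", {"upper": lambda text: text.upper()}),
})

summary = orchestrator.handle({"count_lines": "a\nb\nc", "shout": "done"})
print(summary)  # count_lines: 3; shout: DONE
```

In a real system `plan` and `run` would each be a model call, and the synthesis step would do more than join strings, but the division of labor stays the same.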
Why not just one big model call?
Context windows have a sweet spot. A model running a single huge prompt with twelve tool calls and a thirty-thousand-token context tends to drift, hallucinate, and lose track of the user's actual goal halfway through. Fanning out to sub-agents with clean contexts addresses three problems at once: each agent runs faster on a smaller prompt, errors stay localized to the sub-task that produced them, and independent sub-tasks run in parallel, so wall-clock time tends toward that of the slowest sub-task rather than the sum of all of them.
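Two of those benefits, parallel wall-clock time and localized errors, are easy to show with `asyncio`. The sketch below assumes three hypothetical sub-tasks (`lint`, `flaky`, `tests`) and uses `asyncio.sleep` in place of a model call, so the timings are illustrative only.

```python
import asyncio
import time


async def sub_agent(name: str, seconds: float) -> str:
    # Stand-in for a model call on a small, clean context.
    await asyncio.sleep(seconds)
    if name == "flaky":
        raise RuntimeError("flaky failed")
    return f"{name} done"


async def fan_out(jobs: dict[str, float]) -> list:
    # return_exceptions=True keeps one failed sub-task from
    # discarding the results of the others.
    return await asyncio.gather(
        *(sub_agent(name, secs) for name, secs in jobs.items()),
        return_exceptions=True,
    )


jobs = {"lint": 0.2, "flaky": 0.2, "tests": 0.2}

start = time.perf_counter()
results = asyncio.run(fan_out(jobs))
elapsed = time.perf_counter() - start

ok = [r for r in results if not isinstance(r, Exception)]
failed = [r for r in results if isinstance(r, Exception)]

print(ok)             # ['lint done', 'tests done']
print(len(failed))    # 1
print(elapsed < sum(jobs.values()))  # True: closer to 0.2s than 0.6s
```

The failed sub-task shows up as one exception in the result list, and the other two results are still usable.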
If your agent needs to read three thousand lines of code to answer a yes-or-no question, it is going to get the answer wrong.
Four things most teams get wrong
- 01. Treating the orchestrator like a worker. The orchestrator should plan and merge, not execute. Letting it run tools directly defeats the point of having sub-agents.
- 02. Giving every sub-agent the same tool registry. The whole reason to use sub-agents is least-privilege scoping. A sub-agent that summarizes log files should not have the ability to delete files.
- 03. Skipping verification. The pattern that survives production is: sub-agent produces a finding, a separate verifier sub-agent tries to refute it, only confirmed findings get synthesized. Without that step, you ship plausible-sounding garbage.
- 04. Not bounding cost. Every sub-agent dispatch should have a hard token budget. Without it, a single runaway loop can keep spending until someone notices the bill.
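Least-privilege scoping and a hard token budget can both be enforced at the point where a sub-agent calls a tool. The following is a sketch under stated assumptions: `ScopedAgent`, `ToolNotAllowed`, and `BudgetExceeded` are hypothetical names, and tokens are approximated by whitespace-separated words rather than a real tokenizer.

```python
from dataclasses import dataclass
from typing import Callable


class ToolNotAllowed(Exception):
    pass


class BudgetExceeded(Exception):
    pass


@dataclass
class ScopedAgent:
    name: str
    allowed_tools: dict[str, Callable[[str], str]]  # per-agent registry
    token_budget: int                               # hard cap per dispatch
    tokens_used: int = 0

    def charge(self, text: str) -> None:
        # Crude word count as a stand-in for real token accounting.
        self.tokens_used += len(text.split())
        if self.tokens_used > self.token_budget:
            raise BudgetExceeded(f"{self.name} exceeded {self.token_budget} tokens")

    def call_tool(self, tool_name: str, arg: str) -> str:
        # Refuse anything outside this agent's registry before spending tokens.
        if tool_name not in self.allowed_tools:
            raise ToolNotAllowed(f"{self.name} may not call {tool_name}")
        self.charge(arg)
        result = self.allowed_tools[tool_name](arg)
        self.charge(result)
        return result


log_summarizer = ScopedAgent(
    name="log_summarizer",
    allowed_tools={"first_line": lambda text: text.splitlines()[0]},
    token_budget=10,
)

first = log_summarizer.call_tool("first_line", "error: disk full\nretrying")

try:
    log_summarizer.call_tool("delete_file", "/var/log/app.log")
    blocked = False
except ToolNotAllowed:
    blocked = True   # the summarizer has no delete capability at all

try:
    log_summarizer.call_tool("first_line", "error: disk full\nretrying")
    over_budget = False
except BudgetExceeded:
    over_budget = True   # the second call pushes usage past the cap

print(first, blocked, over_budget)  # error: disk full True True
```

Raising an exception instead of silently truncating means the orchestrator sees the failure and can decide whether to retry, re-plan, or give up.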
When the pattern is worth the complexity
Orchestrator + sub-agent is overkill for most tasks. If the work fits in a single context window with predictable steps, use a single model call. The pattern earns its complexity when one of these is true: the task fans out into many independent items (review every file in this PR, score every ad creative in the batch), the task has stages that depend on the previous stage's output (search, then verify, then synthesize), or the cost of a wrong answer is high enough to justify adversarial verification.
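The staged case (search, then verify, then synthesize) might look like the sketch below. Everything here is illustrative: `Finding`, `find`, `refuted`, and `synthesize` are hypothetical names, the finder is deliberately sloppy so the verifier has something to catch, and a keyword check stands in for a real verifier model.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    claim: str
    evidence: str


def find(documents: dict[str, str], keyword: str) -> list[Finding]:
    # Stage 1. Sloppy on purpose: it also matches on file name,
    # so it produces plausible-but-wrong findings.
    return [
        Finding(claim=f"{name} contains '{keyword}'", evidence=text)
        for name, text in documents.items()
        if keyword in name or keyword in text
    ]


def refuted(finding: Finding, keyword: str) -> bool:
    # Stage 2. The verifier tries to knock the claim down: it is
    # refuted if the evidence does not actually contain the keyword.
    return keyword not in finding.evidence


def synthesize(findings: list[Finding]) -> str:
    # Stage 3. Only confirmed findings reach the final report.
    return "\n".join(f"- {f.claim}" for f in findings)


documents = {
    "timeout_notes.txt": "nothing relevant here",
    "server.log": "request failed: timeout after 30s",
}

candidates = find(documents, "timeout")
confirmed = [f for f in candidates if not refuted(f, "timeout")]
report = synthesize(confirmed)

print(len(candidates))  # 2
print(report)           # - server.log contains 'timeout'
```

Each stage consumes only the previous stage's output, which is what makes the stages separable into sub-agents with their own contexts.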
What we run internally
Our own internal stack is exactly this shape. An orchestrator routes incoming tasks, sub-agents handle classified work in parallel, a verifier layer catches the plausible-but-wrong outputs, and a synthesis pass produces the final structured payload back to the operator. It compresses what used to be days of senior work into hours, without losing the judgment layer that humans still own. That is the trade we built it for.