Agent orchestration patterns
Single agent, sequential pipeline, parallel fleet, judge-arbitrated. Trade-offs from running each in production.
I have shipped four orchestration patterns in production over the past year. Each one solves a different problem. None of them is the right answer for every job.
Single agent: one model call, one response. No orchestration. Use when the task fits in one prompt and the model is smart enough. 90% of "we need an agent" problems are this with extra steps.
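In code, this pattern is just the baseline every other pattern gets compared against. A minimal sketch, assuming a hypothetical `call_model(model, prompt)` helper standing in for whatever provider SDK you actually use:

```python
def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call; swap in your provider's SDK."""
    return f"[{model}] response"

def single_agent(task: str) -> str:
    # One model call, one response. No orchestration, no merge step.
    return call_model("model", task)
```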
Sequential pipeline: model A produces, model B reviews, model C polishes. State flows linearly. Use when each stage has a different objective and you want each model to specialize. Trade-off: latency stacks. Three stages of 3s each is 9s total. Users feel that.
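The shape of the pipeline, as a sketch. `call_model` and the stage names are hypothetical placeholders, not a real API; the point is that state flows linearly and each stage gets a different objective:

```python
def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call; swap in your provider's SDK."""
    return f"[{model}] {prompt[:40]}"

def sequential_pipeline(task: str) -> str:
    # Each stage specializes: produce, review, polish.
    draft = call_model("model-a", f"Produce: {task}")
    review = call_model("model-b", f"Review this draft:\n{draft}")
    # Latency is the SUM of the three calls; cost is too.
    final = call_model("model-c", f"Polish using the review:\n{draft}\n{review}")
    return final
```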
Parallel fleet: N models tackle the same problem from different angles, results merge. Use when diversity of perspective beats single-model depth. agent-fleet ships this for code review (analyst + critic + research roles all see the same diff, write three reports, and a synthesis pass merges them). Cost stacks but latency stays single-stage. Trade-off: synthesis quality matters more than individual quality.
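A sketch of the fan-out/merge shape, not agent-fleet's actual implementation. The role names and `call_model` helper are assumptions; the structural point is that the role calls run concurrently, so wall-clock latency stays roughly single-stage while cost scales with the number of roles:

```python
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call; swap in your provider's SDK."""
    return f"[{model}] report"

def parallel_fleet(diff: str) -> str:
    roles = ["analyst", "critic", "research"]
    # Every role sees the same diff; calls run in parallel.
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        reports = list(pool.map(
            lambda role: call_model(role, f"Review this diff as {role}:\n{diff}"),
            roles,
        ))
    # Synthesis pass merges the per-role reports into one review.
    # This is the step that decides overall quality.
    return call_model("synthesizer", "Merge these reports:\n" + "\n---\n".join(reports))
```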
Judge-arbitrated: candidates produce, a judge ranks. Use when "best of N" is the right framing. darwin-agents ships this for prompt evolution; the judge is a different model from the producers. Trade-off: judge bias becomes a hidden axis. If your judge is GPT-5 and your producers are Claude, you get GPT-5's preferences.
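The produce-then-rank shape, as a sketch (again with a hypothetical `call_model`; this is not darwin-agents' API). Note the judge is deliberately a different model than the producers:

```python
def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call; swap in your provider's SDK."""
    return f"[{model}] answer to: {prompt}"

def judge_arbitrated(task: str, producers: list[str], judge: str) -> str:
    # Fan out: each producer generates a candidate independently.
    candidates = [call_model(p, task) for p in producers]
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    verdict = call_model(judge, f"Pick the best answer by number:\n{numbered}")
    # Toy verdict parse: first digit wins. A real implementation needs
    # robust parsing (and breaks if model names contain digits).
    digits = [ch for ch in verdict if ch.isdigit()]
    best = int(digits[0]) - 1 if digits else 0
    return candidates[best]
```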
The mistake I keep seeing in agent-startup demos: parallel fleet on a problem that wanted single agent. Three models giving slightly different answers to "summarize this article" is not better than one model. It is worse, because now you need to merge them and the merge introduces noise.
Cost-of-orchestration heuristic: each additional model adds 2x to the bill, 1.5x to the latency, and a new failure mode. If the simpler version works, use the simpler version. Add complexity when you can name the specific gain.
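Taken literally, the heuristic compounds per model. A toy calculation under the stated multipliers, not a measured cost model; the base numbers are arbitrary:

```python
def orchestration_cost(n_models: int,
                       base_cost: float = 1.0,
                       base_latency: float = 3.0) -> tuple[float, float]:
    """Back-of-envelope: each additional model multiplies cost by ~2x
    and latency by ~1.5x (sequential case)."""
    cost = base_cost * (2 ** (n_models - 1))
    latency = base_latency * (1.5 ** (n_models - 1))
    return cost, latency
```

So a three-stage pipeline at 1 unit / 3s baseline lands around 4 units and ~7s before you have named what the extra stages buy you.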
The pattern I trust most for new problems: start single agent, measure where it fails, add the minimum orchestration that addresses the failure mode. Most projects stay at single agent or sequential pipeline. Parallel fleet is for code review, multi-perspective research, debate-shaped problems. Judge-arbitrated is for prompt-quality optimization and content grading.