Multi-Agent Architecture Patterns
Learn the six foundational multi-agent architecture patterns: Single Agent, Supervisor, Router, Handoff, Blackboard, and Pipeline. Each pattern includes TypeScript code skeletons, real-world use cases, and honest tradeoffs so you can pick the right structure for your system.
Most people building with AI agents make the same mistake: they treat architecture as an afterthought. They wire up one agent, it works, they add another, and six weeks later they have a mess that nobody can reason about or debug. The patterns on this page exist to prevent that.
We run 15 agents in production at The AI University. We have broken things in almost every way possible. What follows is what we actually use, why we use it, and the tradeoffs you need to understand before you commit to a structure.
There are six patterns worth knowing. Every multi-agent system is either one of these or a composition of two or more. Learn them well enough to recognize which one you are building before you write the first line of code.
1. Single Agent
The simplest pattern is one agent with one job. No orchestration. No delegation. One system prompt, one set of tools, one responsibility.
When to use it
Use a single agent when the task is well-bounded and fits inside a single context window. Content summarization, code review for a single file, answering a customer support ticket — these are single-agent tasks. If you can write down the full scope of the job on a sticky note, you probably do not need multiple agents.
Single agents are also the right starting point before you know whether you need anything more complex. Build it simple first. Add agents only when you hit a concrete limitation: context overflow, tool conflicts, or latency from doing too much sequentially.
Tradeoffs
Pros:
- Zero orchestration overhead
- Easy to debug — one log stream, one prompt to inspect
- Cheap to run and fast to iterate on
- Deterministic routing (there is only one route)
Cons:
- Context window is a hard ceiling on task complexity
- All tools loaded at once — messy system prompts at scale
- Cannot parallelize work
- Specialization is impossible when one agent does everything
Code skeleton
```typescript
// single-agent.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function runAgent(userInput: string): Promise<string> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 4096,
    system: `You are a support agent for AI University.
Your job is to answer questions about our course catalog.
Use the search_courses tool when you need to look up specific courses.`,
    messages: [{ role: "user", content: userInput }],
    tools: [searchCoursesTool],
  });
  return extractText(response);
}
```
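The skeletons on this page lean on an `extractText` helper that is never shown. Here is a minimal sketch, assuming the standard Anthropic response shape (an array of content blocks, where blocks of type `"text"` carry the reply); the `MessageLike` type is ours, introduced so the sketch stands alone:

```typescript
// Minimal extractText sketch — assumes the response carries an array of
// content blocks and that only blocks with type "text" contain reply text.
type ContentBlock = { type: string; text?: string };

interface MessageLike {
  content: ContentBlock[];
}

function extractText(response: MessageLike): string {
  return response.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("\n");
}
```

Tool-use blocks are skipped; if the model returns multiple text blocks, they are joined with newlines.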
The single agent is the foundation. Every other pattern is built on top of it.
2. Supervisor Pattern
The supervisor pattern has one orchestrator agent that receives the original request, decides how to handle it, and delegates work to specialist agents. The specialists report back. The supervisor synthesizes the results and returns a final answer.
```
        User Request
             |
             v
     [Supervisor Agent]
        /    |    \
       v     v     v
     [A]   [B]   [C]
    Specialist Agents
        \    |    /
             v
     [Supervisor Agent]
             |
             v
      Final Response
```
When to use it
Use the supervisor pattern when your task requires multiple areas of expertise that would conflict in a single system prompt, or when you need to coordinate parallel work across agents that have no direct dependency on each other.
This is the pattern we use at The AI University for our full 15-agent growth system. A master orchestrator receives signals — a new lead, a content gap, an engagement drop — and routes that work to the appropriate specialist. The lead scoring agent, the content strategy agent, the email sequencing agent — they each have a focused job. The supervisor knows when to call them and how to stitch their outputs together. No specialist needs to know the others exist.
The supervisor pattern scales well because you can add new specialists without touching existing ones. The orchestrator is the only thing that needs to know the full map.
Tradeoffs
Pros:
- Clean separation of expertise
- Specialists can run in parallel
- Easy to add new capabilities without refactoring
- Single point of coordination — easier to monitor
Cons:
- Supervisor can become a bottleneck
- Supervisor failures take down the whole system
- Latency from extra round-trips through the orchestrator
- Supervisor needs enough context to delegate intelligently — its prompt gets complex
Code skeleton
```typescript
// supervisor-pattern.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

interface AgentResult {
  agentName: string;
  output: string;
}

// Specialist agents — each has a narrow focus
async function runLeadScoringAgent(lead: Lead): Promise<AgentResult> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 1024,
    system: `You are a lead scoring specialist.
Score leads based on engagement, firmographics, and behavioral signals.
Return a JSON object with score (0-100) and reasoning.`,
    messages: [{ role: "user", content: JSON.stringify(lead) }],
    tools: [getCRMDataTool, getEngagementHistoryTool],
  });
  return { agentName: "lead-scoring", output: extractText(response) };
}

async function runContentAgent(context: ContentContext): Promise<AgentResult> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `You are a content strategy specialist.
Identify content gaps and recommend topics based on SEO data and user behavior.`,
    messages: [{ role: "user", content: JSON.stringify(context) }],
    tools: [searchKeywordsTool, analyzeCompetitorsTool],
  });
  return { agentName: "content-strategy", output: extractText(response) };
}

// Supervisor — knows the full picture, delegates to specialists
async function runSupervisor(request: OrchestrationRequest): Promise<string> {
  // Step 1: Supervisor decides what work needs to happen
  const planResponse = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 1024,
    system: `You are the AI University growth orchestrator.
You coordinate specialist agents to handle growth tasks.
Given the input, decide which agents to invoke and in what order.
Return a JSON plan: { agents: string[], parallel: boolean }`,
    messages: [{ role: "user", content: JSON.stringify(request) }],
  });
  const plan = JSON.parse(extractText(planResponse));

  // Step 2: Execute specialist agents (parallel or sequential per plan)
  let results: AgentResult[];
  if (plan.parallel) {
    results = await Promise.all(
      plan.agents.map((agent: string) => dispatchAgent(agent, request))
    );
  } else {
    results = [];
    for (const agent of plan.agents) {
      const result = await dispatchAgent(agent, request);
      results.push(result);
    }
  }

  // Step 3: Supervisor synthesizes final response
  const synthesisResponse = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `You are the AI University growth orchestrator.
Synthesize the results from your specialist agents into a final action plan.`,
    messages: [
      {
        role: "user",
        content: `Original request: ${JSON.stringify(request)}\n\nSpecialist results: ${JSON.stringify(results)}`,
      },
    ],
  });
  return extractText(synthesisResponse);
}

function dispatchAgent(
  agentName: string,
  request: OrchestrationRequest
): Promise<AgentResult> {
  switch (agentName) {
    case "lead-scoring":
      return runLeadScoringAgent(request.lead);
    case "content-strategy":
      return runContentAgent(request.contentContext);
    default:
      throw new Error(`Unknown agent: ${agentName}`);
  }
}
```
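One way to keep the "add specialists without refactoring" promise is to swap the `switch` in `dispatchAgent` for a registry. A sketch under that assumption — the `AgentFn` shape and the helper names are ours, not part of the skeleton above, and `AgentResult` is repeated so the sketch stands alone:

```typescript
// Registry sketch: each specialist registers itself once; the dispatcher
// never changes when new agents are added.
interface AgentResult {
  agentName: string;
  output: string;
}

type AgentFn = (request: unknown) => Promise<AgentResult>;

const agentRegistry = new Map<string, AgentFn>();

function registerAgent(name: string, fn: AgentFn): void {
  agentRegistry.set(name, fn);
}

async function dispatchAgent(name: string, request: unknown): Promise<AgentResult> {
  const fn = agentRegistry.get(name);
  if (!fn) throw new Error(`Unknown agent: ${name}`);
  return fn(request);
}
```

New specialists then become a one-line `registerAgent("churn-prediction", runChurnAgent)` call next to their definition.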
3. Router Pattern
The router pattern is a lightweight dispatcher. An incoming request is classified, and control is handed off to the agent best suited to handle it. Unlike the supervisor pattern, the router does not synthesize results. It gets out of the way.
```
    Incoming Request
           |
           v
     [Router Agent]
       /   |   \
      v    v    v
    [A]   [B]   [C]
  Billing  Tech   Sales
   Agent  Support  Agent
```
When to use it
Use the router when you have clearly distinct request types that need different handling, and where you do not need to combine the outputs. Customer support is the canonical example: a billing question goes to one agent, a technical question goes to another, a refund request goes to a third. The router's only job is classification.
The router is also useful as the front door to a more complex system. In our growth system, the first thing that happens when data arrives is a routing decision: is this a lead event, a content signal, or an operational alert? Each path from there is a separate workflow.
Tradeoffs
Pros:
- Extremely fast — minimal LLM overhead on the routing step
- Specialists are completely isolated from each other
- Easy to add new routes without modifying existing agents
- Routing logic is inspectable and testable independently
Cons:
- Hard to handle requests that span multiple categories
- Routing errors are consequential — misclassification means the wrong agent handles it
- No synthesis step — if you need combined output, this is not the right pattern
- Router can become a classification nightmare as categories proliferate
Code skeleton
```typescript
// router-pattern.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

type RouteTarget = "billing" | "technical-support" | "sales" | "general";

interface RoutingDecision {
  target: RouteTarget;
  confidence: number;
  reasoning: string;
}

async function routeRequest(userMessage: string): Promise<string> {
  // Step 1: Classify the request
  const routingResponse = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 256,
    system: `You are a request router. Classify incoming messages into one of these categories:
- billing: payment, subscription, invoice, refund questions
- technical-support: bugs, errors, integration, API questions
- sales: pricing, demos, enterprise, partnership questions
- general: everything else
Return JSON: { target: string, confidence: number, reasoning: string }`,
    messages: [{ role: "user", content: userMessage }],
  });
  const decision: RoutingDecision = JSON.parse(extractText(routingResponse));

  // Step 2: Dispatch to the correct specialist
  return dispatchToAgent(decision.target, userMessage);
}

async function dispatchToAgent(
  target: RouteTarget,
  message: string
): Promise<string> {
  const agentConfigs: Record<RouteTarget, { system: string; tools: any[] }> = {
    billing: {
      system: `You are a billing support specialist. You have access to subscription and payment records.`,
      tools: [getSubscriptionTool, processRefundTool],
    },
    "technical-support": {
      system: `You are a technical support engineer. You debug issues with our platform and APIs.`,
      tools: [searchDocsTool, getErrorLogsTool],
    },
    sales: {
      system: `You are a sales development representative. You qualify leads and book demos.`,
      tools: [getCRMTool, bookDemoTool],
    },
    general: {
      system: `You are a helpful AI University assistant.`,
      tools: [],
    },
  };

  const config = agentConfigs[target];
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: config.system,
    messages: [{ role: "user", content: message }],
    tools: config.tools,
  });
  return extractText(response);
}
```
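Because misclassification is the router's main failure mode, it is worth adding a confidence floor before dispatching. A sketch — the `0.7` threshold is an illustrative value, not from the skeleton above, and the types are repeated so the sketch stands alone:

```typescript
// Fall back to the catch-all route when the classifier is unsure,
// rather than risk sending the request to the wrong specialist.
type RouteTarget = "billing" | "technical-support" | "sales" | "general";

interface RoutingDecision {
  target: RouteTarget;
  confidence: number;
  reasoning: string;
}

function applyConfidenceFloor(
  decision: RoutingDecision,
  floor = 0.7
): RouteTarget {
  return decision.confidence >= floor ? decision.target : "general";
}
```

In `routeRequest`, the dispatch line becomes `dispatchToAgent(applyConfidenceFloor(decision), userMessage)`; low-confidence traffic lands on the general agent, which can ask a clarifying question instead of guessing.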
4. Handoff Pattern
In the handoff pattern, agents pass work to each other in sequence. Agent A does its job and hands off to Agent B with context about what it did. Agent B hands off to Agent C. Each agent in the chain knows what came before it and what it needs to produce for the next step.
```
Input
  |
  v
[Agent A] --context--> [Agent B] --context--> [Agent C]
                                                  |
                                                  v
                                            Final Output
```
When to use it
Use the handoff pattern for workflows where each step genuinely depends on the completion of the previous one, and where different steps require different expertise. Research then write then review is a classic handoff chain. In growth systems, a handoff might look like: enrich lead data, then score the lead, then generate a personalized outreach sequence.
The key distinction from a pipeline (covered next) is that in handoffs, agents are aware of each other. They receive explicit context from the prior agent and can reason about it. A pipeline is more mechanical — each node just transforms its input and passes it forward without awareness of the chain.
Tradeoffs
Pros:
- Natural fit for sequential, dependent workflows
- Each agent can build on and critique prior work
- Context accumulates through the chain
- Failures are localizable — you know exactly which step broke
Cons:
- Sequential by nature — no parallelism
- Errors early in the chain corrupt everything downstream
- Context windows fill up as the handoff object grows
- Harder to retry individual steps without re-running the whole chain
Code skeleton
```typescript
// handoff-pattern.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

interface HandoffContext {
  originalInput: string;
  steps: Array<{
    agentName: string;
    output: string;
    metadata: Record<string, unknown>;
  }>;
}

async function runResearchAgent(
  context: HandoffContext
): Promise<HandoffContext> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `You are a research specialist. Gather and synthesize information on the given topic.
Return structured research findings that a writer can use directly.`,
    messages: [
      {
        role: "user",
        content: `Topic: ${context.originalInput}`,
      },
    ],
    tools: [webSearchTool, academicSearchTool],
  });
  return {
    ...context,
    steps: [
      ...context.steps,
      {
        agentName: "research",
        output: extractText(response),
        metadata: { toolsUsed: getToolsUsed(response) },
      },
    ],
  };
}

async function runWriterAgent(
  context: HandoffContext
): Promise<HandoffContext> {
  const researchStep = context.steps.find((s) => s.agentName === "research");
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 4096,
    system: `You are a content writer. Take research findings and write a clear, engaging article.
Do not add facts not present in the research. Maintain accuracy.`,
    messages: [
      {
        role: "user",
        content: `Original topic: ${context.originalInput}

Research findings:
${researchStep?.output}

Write the article now.`,
      },
    ],
  });
  return {
    ...context,
    steps: [
      ...context.steps,
      {
        agentName: "writer",
        output: extractText(response),
        metadata: { wordCount: countWords(extractText(response)) },
      },
    ],
  };
}

async function runEditorAgent(
  context: HandoffContext
): Promise<HandoffContext> {
  const writerStep = context.steps.find((s) => s.agentName === "writer");
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 4096,
    system: `You are a senior editor. Review the draft for clarity, accuracy, and structure.
Make direct edits — return the final polished version, not a list of suggestions.`,
    messages: [
      {
        role: "user",
        content: `Draft article:\n${writerStep?.output}`,
      },
    ],
  });
  return {
    ...context,
    steps: [
      ...context.steps,
      {
        agentName: "editor",
        output: extractText(response),
        metadata: { revisionsCount: 1 },
      },
    ],
  };
}

// Orchestrate the handoff chain
async function runContentWorkflow(topic: string): Promise<string> {
  let context: HandoffContext = { originalInput: topic, steps: [] };
  context = await runResearchAgent(context);
  context = await runWriterAgent(context);
  context = await runEditorAgent(context);
  const finalStep = context.steps[context.steps.length - 1];
  return finalStep.output;
}
```
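The "harder to retry individual steps" tradeoff can be softened by checkpointing the context after each completed step. A sketch under that assumption — the `save` callback, the runner name, and the simplified step shape are ours, and in production the checkpoint would live in durable storage rather than memory:

```typescript
interface HandoffStep {
  agentName: string;
  output: string;
}

interface HandoffCtx {
  originalInput: string;
  steps: HandoffStep[];
}

type StepFn = (ctx: HandoffCtx) => Promise<HandoffCtx>;

// Runs the chain, persisting after every completed step. Pass the last saved
// context back in as `checkpoint` to resume after the last good step instead
// of re-running the whole chain.
async function runWithCheckpoints(
  input: string,
  chain: Array<{ name: string; fn: StepFn }>,
  save: (ctx: HandoffCtx) => void,
  checkpoint?: HandoffCtx
): Promise<HandoffCtx> {
  let ctx: HandoffCtx = checkpoint ?? { originalInput: input, steps: [] };
  const done = new Set(ctx.steps.map((s) => s.agentName));
  for (const { name, fn } of chain) {
    if (done.has(name)) continue; // already completed in a previous run
    ctx = await fn(ctx);
    save(ctx);
  }
  return ctx;
}
```

If the writer step throws, the research checkpoint survives, and the retry skips straight past it.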
5. Blackboard Pattern
The blackboard pattern gives multiple agents access to a shared memory store — the blackboard. Agents read from it, do their work, and write results back. No agent directly calls another. They coordinate through the shared state.
```
      [Blackboard / Shared State]
        /          |          \
       v           v           v
  [Agent A]   [Agent B]   [Agent C]
   reads &     reads &     reads &
   writes      writes      writes
```
When to use it
Use the blackboard when you have agents that need to react to each other's outputs without tight coupling. It is well-suited for monitoring and alerting systems, collaborative analysis where agents each contribute a perspective, or any scenario where the order of agent execution is emergent rather than predetermined.
In an analytics system, you might have a trend detection agent, an anomaly detection agent, and a reporting agent all writing observations to a shared store. A fourth agent watches the blackboard and triggers alerts when multiple agents flag the same data point simultaneously. No agent needs to call another — they all just read and write.
Tradeoffs
Pros:
- Loose coupling — agents are independent and replaceable
- Easy to add new agents without changing existing ones
- Supports emergent coordination — the whole can be smarter than the parts
- Works well for event-driven and asynchronous systems
Cons:
- Harder to reason about causality — who wrote what and when
- Race conditions if multiple agents write conflicting data simultaneously
- Debugging is harder — no single execution path to trace
- Requires a robust shared store with proper locking or versioning
Code skeleton
```typescript
// blackboard-pattern.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// The blackboard — in production this would be a database or Redis
interface BlackboardEntry {
  agentName: string;
  timestamp: number;
  key: string;
  value: unknown;
}

class Blackboard {
  private store: BlackboardEntry[] = [];

  write(agentName: string, key: string, value: unknown): void {
    this.store.push({
      agentName,
      timestamp: Date.now(),
      key,
      value,
    });
  }

  read(key: string): BlackboardEntry | undefined {
    return this.store
      .filter((e) => e.key === key)
      .sort((a, b) => b.timestamp - a.timestamp)[0];
  }

  readAll(): BlackboardEntry[] {
    return [...this.store];
  }

  readByAgent(agentName: string): BlackboardEntry[] {
    return this.store.filter((e) => e.agentName === agentName);
  }
}

const blackboard = new Blackboard();

async function runTrendAgent(data: AnalyticsData): Promise<void> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 1024,
    system: `You are a trend detection agent. Identify trends in the data.
Return JSON: { trends: Array<{ metric: string, direction: string, magnitude: number }> }`,
    messages: [{ role: "user", content: JSON.stringify(data) }],
  });
  const trends = JSON.parse(extractText(response));
  blackboard.write("trend-agent", "trends", trends);
}

async function runAnomalyAgent(data: AnalyticsData): Promise<void> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 1024,
    system: `You are an anomaly detection agent. Find statistical outliers.
Return JSON: { anomalies: Array<{ metric: string, value: number, zScore: number }> }`,
    messages: [{ role: "user", content: JSON.stringify(data) }],
  });
  const anomalies = JSON.parse(extractText(response));
  blackboard.write("anomaly-agent", "anomalies", anomalies);
}

async function runReportingAgent(): Promise<string> {
  // Read everything on the blackboard
  const trends = blackboard.read("trends")?.value;
  const anomalies = blackboard.read("anomalies")?.value;

  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `You are a reporting agent. Synthesize findings into a concise executive summary.`,
    messages: [
      {
        role: "user",
        content: `Trends: ${JSON.stringify(trends)}\nAnomalies: ${JSON.stringify(anomalies)}`,
      },
    ],
  });
  const report = extractText(response);
  blackboard.write("reporting-agent", "report", report);
  return report;
}

// Run agents (can be parallel — they coordinate via blackboard)
async function runAnalysisSystem(data: AnalyticsData): Promise<string> {
  await Promise.all([runTrendAgent(data), runAnomalyAgent(data)]);
  return runReportingAgent();
}
```
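The race-condition tradeoff is usually handled with versioning on the store. A compare-and-set sketch — the class and its API are illustrative, and in production this maps onto a database row version or an equivalent optimistic-locking primitive in your store:

```typescript
// Each key carries a version; a write only succeeds if the version the writer
// read is still current, so concurrent agents cannot silently clobber each other.
class VersionedBlackboard {
  private store = new Map<string, { value: unknown; version: number }>();

  read(key: string): { value: unknown; version: number } | undefined {
    return this.store.get(key);
  }

  compareAndSet(key: string, value: unknown, expectedVersion: number): boolean {
    const current = this.store.get(key);
    const version = current?.version ?? 0;
    if (version !== expectedVersion) return false; // someone else wrote first
    this.store.set(key, { value, version: version + 1 });
    return true;
  }
}
```

An agent that loses the race re-reads the key and decides whether its write is still relevant, rather than overwriting newer data.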
6. Pipeline Pattern
The pipeline pattern chains agents mechanically. Each agent receives the output of the previous one, transforms it, and passes the result forward. Unlike the handoff pattern, pipeline agents do not receive accumulated context or reason about the prior steps — they only see their immediate input.
```
    Raw Input
        |
        v
[Agent A: Extract]
        |
        v
[Agent B: Enrich]
        |
        v
[Agent C: Transform]
        |
        v
[Agent D: Format]
        |
        v
  Final Output
```
When to use it
Use pipelines for data processing workflows where each step has a clear, narrow transformation to perform. ETL-style workloads are the canonical use case: extract data from a source, clean it, enrich it, and format it for a destination. The pipeline pattern is also appropriate when you want predictable, testable transformations — each stage can be tested in isolation with fixed inputs and outputs.
In content workflows, a pipeline might: extract raw transcript text, clean and segment it, generate timestamps and summaries for each segment, then format the whole thing as structured JSON for a CMS.
Tradeoffs
Pros:
- Each stage is independently testable
- Easy to parallelize across multiple inputs
- Simple to add, remove, or reorder stages
- Predictable data flow — easy to monitor and log
Cons:
- No feedback between stages — later stages cannot inform earlier ones
- Errors at any stage break the whole pipeline
- Not suitable for tasks requiring iteration or back-and-forth reasoning
- Each agent operates without awareness of the broader goal
Code skeleton
```typescript
// pipeline-pattern.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

type PipelineStage<TIn, TOut> = (input: TIn) => Promise<TOut>;

// Generic pipeline runner
async function runPipeline<T>(
  input: T,
  stages: Array<PipelineStage<unknown, unknown>>
): Promise<unknown> {
  let current: unknown = input;
  for (const stage of stages) {
    current = await stage(current);
  }
  return current;
}

// Define individual pipeline stages
const extractStage: PipelineStage<RawTranscript, ExtractedText> = async (
  raw
) => {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 4096,
    system: `Extract clean text from the raw transcript. Remove filler words, fix obvious transcription errors.
Return JSON: { text: string, duration: number, speaker: string }`,
    messages: [{ role: "user", content: JSON.stringify(raw) }],
  });
  return JSON.parse(extractText(response));
};

const enrichStage: PipelineStage<ExtractedText, EnrichedContent> = async (
  extracted
) => {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `Add metadata to the extracted content: topic classification, key entities, sentiment.
Return JSON with original fields plus: { topic: string, entities: string[], sentiment: string }`,
    messages: [{ role: "user", content: JSON.stringify(extracted) }],
  });
  return JSON.parse(extractText(response));
};

const formatStage: PipelineStage<EnrichedContent, CMSPayload> = async (
  enriched
) => {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 2048,
    system: `Format the enriched content as a CMS-ready payload.
Return JSON matching the CMS schema: { title, body, tags, meta }`,
    messages: [{ role: "user", content: JSON.stringify(enriched) }],
  });
  return JSON.parse(extractText(response));
};

// Run the content processing pipeline
async function processTranscript(raw: RawTranscript): Promise<CMSPayload> {
  const result = await runPipeline(raw, [
    extractStage as PipelineStage<unknown, unknown>,
    enrichStage as PipelineStage<unknown, unknown>,
    formatStage as PipelineStage<unknown, unknown>,
  ]);
  return result as CMSPayload;
}
```
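The `as PipelineStage<unknown, unknown>` casts above trade away the type safety the pattern is supposed to give you. One alternative is pairwise composition, where each stage's output type must match the next stage's input type. A sketch — the `compose` helper is ours, not part of the skeleton:

```typescript
type Stage<TIn, TOut> = (input: TIn) => Promise<TOut>;

// Compose two stages so the output type of the first must match the input
// type of the second — the compiler rejects mismatched chains.
function compose<A, B, C>(f: Stage<A, B>, g: Stage<B, C>): Stage<A, C> {
  return async (input) => g(await f(input));
}
```

Chaining `compose(compose(extractStage, enrichStage), formatStage)` then yields a fully typed transcript-to-payload stage with no casts, at the cost of nesting as the pipeline grows.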
Choosing the Right Pattern
Use this table as a first pass. The right pattern usually becomes obvious once you have clearly stated what the inputs and outputs are and how the steps depend on one another.
| Pattern | Best For | Avoid When | Parallelizable | Context Sharing |
|---|---|---|---|---|
| Single Agent | Simple, bounded tasks | Task exceeds context window | No | N/A |
| Supervisor | Multi-domain tasks requiring synthesis | You do not need synthesis | Yes (per supervisor) | Through supervisor |
| Router | Clearly distinct request types | Requests span multiple categories | Yes (per route) | None between agents |
| Handoff | Sequential, dependent steps | Parallelism is required | No | Explicit, accumulated |
| Blackboard | Event-driven, loosely coupled systems | Strong ordering is required | Yes | Shared store |
| Pipeline | Data transformation chains | Stages need to iterate on each other | Yes (across inputs) | None between stages |
Decision questions
Work through these in order:
1. Can one agent do the whole job? If yes, use a single agent. Do not over-engineer.
2. Do the steps depend on each other sequentially? If yes and the steps require different expertise, use the handoff pattern. If the steps are mechanical transformations, use a pipeline.
3. Are there clearly distinct request types with no cross-over? If yes, use a router.
4. Do multiple agents need to contribute to a single answer? If yes, use the supervisor pattern.
5. Do agents need to react to each other without being directly coupled? If yes, use the blackboard pattern.
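The questions above are mechanical enough to encode directly. A sketch — the boolean fields are our illustrative names for the answers, not an API:

```typescript
type Pattern =
  | "single-agent"
  | "handoff"
  | "pipeline"
  | "router"
  | "supervisor"
  | "blackboard";

interface TaskShape {
  fitsOneAgent: boolean;           // question 1
  sequentialDependencies: boolean; // question 2
  mechanicalTransforms: boolean;   // question 2 follow-up
  distinctRequestTypes: boolean;   // question 3
  needsSynthesis: boolean;         // question 4
}

// Walks the decision questions in order; the blackboard covers the
// remaining loosely coupled case.
function choosePattern(t: TaskShape): Pattern {
  if (t.fitsOneAgent) return "single-agent";
  if (t.sequentialDependencies) {
    return t.mechanicalTransforms ? "pipeline" : "handoff";
  }
  if (t.distinctRequestTypes) return "router";
  if (t.needsSynthesis) return "supervisor";
  return "blackboard";
}
```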
Most production systems use two or three of these patterns in combination. Our 15-agent growth system at The AI University uses a supervisor at the top level, router decisions inside the supervisor's delegation logic, and pipeline processing for data ingestion. Each pattern handles the layer it is best suited for.
Composing Patterns
Patterns are not mutually exclusive. A supervisor can delegate to a pipeline. A router can send traffic to a handoff chain. A blackboard can feed inputs to a supervisor. The rule is simple: apply each pattern at the layer where it fits, and make sure the seam between layers is clean and inspectable.
The failure mode to avoid is mixing patterns within a single agent boundary. An agent should have one job and one coordination role. If your agent is routing AND synthesizing AND processing in a pipeline, you have collapsed multiple layers into one, and debugging it will be painful.
What We Run at The AI University
Our production system is a supervisor pattern with 15 specialists. The supervisor — we call it the growth orchestrator — receives signals from our data layer and decides which specialists to invoke. Here is the current roster, organized by domain:
- Lead and revenue agents: lead-scoring, lead-enrichment, outreach-sequencing, churn-prediction
- Content and SEO agents: content-gap-analysis, keyword-clustering, content-brief-generation, seo-audit
- Engagement and retention agents: email-send-timing, engagement-scoring, re-engagement-trigger
- Operational agents: data-quality-monitor, anomaly-alert, reporting-synthesis, campaign-performance
The orchestrator runs on a schedule and event-driven triggers. When a new lead comes in, it invokes lead-enrichment and lead-scoring in parallel, waits for both, then invokes outreach-sequencing with their combined output. That three-agent sequence is a handoff pattern embedded inside the supervisor pattern.
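The shape of that lead flow can be sketched directly — the function and parameter names here are ours, with stubs standing in for the real agents:

```typescript
// Two specialists fan out in parallel; the third consumes their combined output.
async function handleNewLead(
  lead: string,
  enrich: (lead: string) => Promise<string>,
  score: (lead: string) => Promise<number>,
  sequence: (enriched: string, score: number) => Promise<string>
): Promise<string> {
  const [enriched, leadScore] = await Promise.all([enrich(lead), score(lead)]);
  return sequence(enriched, leadScore);
}
```

Injecting the agents as parameters keeps the coordination logic testable with stubs, separate from the LLM calls themselves.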
The whole system runs via the Claude CLI with `claude -p`, which means no per-token API costs. With 15 agents running on a Max subscription, the economics are fundamentally different from calling the API directly. Architecture decisions — especially around how often agents run and how much they parallelize — have real cost implications. Design accordingly.
Further Reading
The patterns on this page are starting points. Real systems require you to think about error handling, retry logic, observability, and how you store and pass context between agents. Each of those topics deserves its own treatment, and we cover them in separate guides.
Start with the pattern that fits your immediate problem. Build the simplest version that works. You will know when you need to evolve it.