multi-agent-system-design

Learning Objectives

By the end of this lesson, you will be able to:

  • Define a multi-agent system in the Claude context
  • Explain why multi-agent architectures are used instead of a single agent
  • Describe the three communication patterns between agents
  • Identify design considerations for reliability and coordination

Multi-Agent System Design

What a Multi-Agent System Is

A multi-agent system consists of multiple Claude instances — or Claude operating alongside other models — each assigned a defined role, communicating with one another to accomplish a goal that no single agent could reliably handle alone.

Each agent in the system has its own context window, its own system prompt, and its own set of available tools. Agents are not aware of one another’s internal reasoning. They communicate through outputs: one agent produces a structured result, and that result becomes the input another agent observes. The system as a whole behaves intelligently because each agent’s contribution is scoped, purposeful, and coordinated.

This architecture is distinct from a single-agent loop. In a single-agent system, one Claude instance handles all phases of a task — observation, reasoning, tool use, and response — within a single context window. In a multi-agent system, those responsibilities are distributed. The design question is not whether Claude can handle a task alone, but whether distributing the task across specialised agents produces a more reliable, scalable, or capable result.


Why Multi-Agent Architectures Are Used

Single agents work well for bounded, linear tasks. Multi-agent architectures address the limitations that emerge when tasks grow complex, long-running, or structurally parallel.

Parallelisation is the most direct performance justification. When a task contains independent subtasks — processing multiple documents, querying multiple data sources, evaluating multiple candidate outputs — running those subtasks simultaneously across separate agents reduces total completion time. A single agent would process them sequentially, one at a time, within a shared context window.

Specialisation improves quality. An agent configured specifically for legal document extraction — with a targeted system prompt, a curated tool set, and constrained output format — performs that task more reliably than a general-purpose agent asked to handle extraction alongside ten other responsibilities. Specialisation allows you to tune each agent’s behaviour precisely for its role.

Task decomposition makes complex workflows manageable. A task that requires research, synthesis, formatting, and validation involves distinct cognitive operations. Decomposing that task into discrete agent responsibilities clarifies ownership, simplifies debugging, and makes each component independently testable.

Context window limits are a hard architectural constraint. A single Claude instance has a finite context window. Long-running workflows that accumulate large volumes of tool results, intermediate outputs, and conversation history will eventually exhaust that window. Multi-agent systems sidestep this constraint by distributing context across agents, each of which maintains only the information relevant to its role.


Three Communication Patterns Between Agents

The way agents communicate defines the structure of the system. There are three primary patterns.

Sequential communication means the output of one agent becomes the input of the next. Agent A completes its task and passes its result to Agent B, which processes that result and passes its own output to Agent C. This pattern is appropriate when each step depends on the completion of the previous step — when there is a strict order of operations. The risk is that errors propagate linearly. If Agent A produces a flawed output, Agent B reasons from that flawed input, and the error compounds downstream.

Parallel communication means multiple agents run simultaneously, each working on an independent subtask. Once all agents have completed their work, their results are merged by an aggregating process or a dedicated agent. This pattern is appropriate when subtasks are genuinely independent and when speed is a priority. The design challenge is result-merging: outputs from parallel agents may be inconsistent, contradictory, or formatted differently. The merging step must be designed to handle these cases explicitly rather than assuming outputs will be compatible.

Hierarchical communication introduces an orchestrator that directs worker agents. The orchestrator receives the original task, decomposes it into subtasks, delegates each subtask to an appropriate worker, collects the results, and synthesises a final output. Workers do not communicate with one another directly — all coordination flows through the orchestrator. This pattern is the most structured of the three and is covered in depth in Lesson 1.4 in the context of hub-and-spoke architecture.


Agent Roles in a Multi-Agent System

Regardless of communication pattern, three functional roles recur across multi-agent architectures.

The orchestrator manages the workflow. It holds the task-level objective, determines which agents to invoke and in what order or configuration, and assembles the final output. The orchestrator does not typically perform deep domain work — its responsibility is coordination.

The worker or specialist agent executes a defined subtask. It receives scoped input, applies its specialised configuration, uses its designated tools, and returns a structured output. A worker agent’s system prompt is narrow by design — it knows what it is responsible for and nothing more.

The validator agent evaluates the output of other agents before that output is passed downstream or returned to the user. It checks for quality, consistency, format compliance, or factual accuracy depending on what the system requires. Validation is not always a separate agent — in simpler systems it may be a step within the orchestrator — but in high-stakes pipelines, an independent validator adds a meaningful layer of error detection.


Design Considerations for Reliability and Coordination

Multi-agent systems introduce coordination complexity that single-agent systems do not have. Several design concerns demand deliberate attention.

How agents pass context determines whether each agent has what it needs to reason correctly. Agents do not share memory. Every piece of information an agent requires must be present in the input it receives. If the orchestrator delegates a subtask without including relevant background, the worker agent will reason from an incomplete picture. Context packaging — deciding what to include in each agent’s input — is a critical design responsibility.

How errors propagate depends on the communication pattern. In sequential systems, an upstream error silently corrupts all downstream reasoning unless explicit validation occurs between steps. In parallel systems, a single failed agent may produce a gap in the merged result. Architects must decide how each agent signals failure, how the orchestrator detects and handles failure signals, and whether failed subtasks trigger retries, fallbacks, or graceful degradation.

Preventing cascade failures requires isolation. If one agent’s failure causes the orchestrator to stall, the entire pipeline halts. Robust designs treat each agent as a bounded component with a defined success and failure interface. The orchestrator handles failure states without depending on agents to recover themselves.


Practical Example: A Document Processing Pipeline

Consider a pipeline designed to process incoming contract documents.

A routing agent receives each document and classifies it by type — non-disclosure agreement, service contract, employment agreement — then routes it to the appropriate specialist.

An extraction agent, configured specifically for the identified document type, reads the document and extracts structured fields: parties, effective dates, termination clauses, liability caps.

A quality-check agent receives the extracted fields and validates them against a set of rules — checking for missing mandatory fields, date logic errors, or values that fall outside acceptable ranges — before passing the validated output to the destination system.

This pipeline uses sequential communication. Each agent’s output is the next agent’s input. Errors surface at the quality-check stage rather than reaching the destination system. Each agent is independently configurable, testable, and replaceable.


A Note on Hub-and-Spoke Architecture

The hierarchical communication pattern — where an orchestrator delegates to and collects from worker agents — is formalised in the hub-and-spoke model. Hub-and-spoke introduces specific conventions for how the orchestrator manages delegation, how workers report results, and how the system handles partial failures. Lesson 1.4 covers hub-and-spoke architecture in full.


Key Takeaways

  • A multi-agent system distributes a complex task across multiple Claude instances, each with a defined role, a scoped system prompt, and its own context window.
  • Multi-agent architectures are justified by parallelisation, specialisation, task decomposition, and the need to work around context window limits.
  • The three communication patterns — sequential, parallel, and hierarchical — each suit different task structures and carry different failure modes.
  • Orchestrators coordinate; workers execute; validators check. These three roles appear consistently across well-designed multi-agent systems.
  • Context packaging, error propagation, and cascade failure prevention are the three design concerns that most directly determine whether a multi-agent system is reliable in production.

What Is Tested

Exam questions on multi-agent system design ask candidates to select the appropriate communication pattern for a described workflow — for example, determining whether a given set of subtasks warrants sequential or parallel execution — and to justify that selection in terms of task dependencies and failure risk. Candidates are also asked to identify agent roles from a system diagram, distinguishing orchestrators from workers and validators. A third category of question targets the design of parallel systems specifically, asking candidates to explain why parallelisation requires explicit result-merging design and what happens when merged outputs are inconsistent or incomplete.