
What this lesson is about

Multi-agent orchestration is the ability to set one Claude working as a manager — breaking a large task into pieces, delegating each piece to a separate Claude worker, and then assembling the results — so that complex work happens in parallel rather than one step at a time. This lesson explains how that system works in plain English, when it is genuinely useful, and what you need to know to design tasks that delegate well.

Core concept: the manager and the team

Imagine you need to get three things done before a client meeting tomorrow: research the client’s competitors, summarise the client’s recent press coverage, and prepare a pricing comparison. None of these depends on the others. If you do all three yourself, in sequence, it takes the full day. But if you manage a small team, with one person on each job and all three working at the same time, the work is done in a fraction of the time.

Multi-agent orchestration works on exactly this principle. One Claude acts as the orchestrator (the manager): it receives your overall goal, breaks it into distinct tasks, and sends each task to a separate Claude subagent (a worker). Each subagent focuses on its one job, completes it, and returns the result to the orchestrator, which reviews and assembles everything into the final output.

You give the instruction once, to the manager. The manager handles the rest.

Why parallel agents are faster

When one Claude handles a large, multi-step task by itself, it works sequentially — step one, then step two, then step three, each waiting for the previous one to finish. This is fine for smaller tasks, but for complex projects it means the total time is the sum of every step.

Think of it like running three errands on a Saturday morning. If you do them yourself — post office, then pharmacy, then grocery store — it might take two hours. If you send three family members out at the same time, each handling one errand, all three are done in forty minutes.

Parallel subagents compress time the same way. While one Claude researches the first competitor, a second researches another and a third researches the next, with none of them waiting on the others. For tasks that can be split into independent pieces, the total wall-clock time shrinks towards the length of the longest piece rather than the sum of all of them.
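The arithmetic behind the errands comparison can be sketched in a few lines of Python. This is an illustration only: the `errand` function and its durations are made up, and the short sleeps stand in for real work. Run one after another, the total is roughly the sum of the three durations; run at the same time, it is roughly the longest one.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical durations in seconds, standing in for three independent tasks.
DURATIONS = {"post office": 0.10, "pharmacy": 0.15, "grocery store": 0.20}

def errand(name: str) -> str:
    """Pretend to do one errand by sleeping for its duration."""
    time.sleep(DURATIONS[name])
    return f"{name} done"

def run_sequentially() -> float:
    """One person does every errand in a row; returns elapsed seconds."""
    start = time.perf_counter()
    for name in DURATIONS:
        errand(name)
    return time.perf_counter() - start

def run_in_parallel() -> float:
    """Three people each take one errand; returns elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(DURATIONS)) as pool:
        list(pool.map(errand, DURATIONS))
    return time.perf_counter() - start

sequential = run_sequentially()  # roughly the sum of the durations
parallel = run_in_parallel()     # roughly the longest single duration
print(f"sequential: {sequential:.2f}s, parallel: {parallel:.2f}s")
```

The saving only appears because the three errands are independent. If the pharmacy trip needed something from the post office first, the two would have to run in sequence again.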

How the Task tool works

The orchestrator’s role

The Task tool is what allows the orchestrator Claude to spawn subagents. When you give the orchestrator an instruction, it does not do all the work itself. Instead, it:

1. Analyses your goal and identifies the distinct subtasks
2. Uses the Task tool to launch a separate subagent for each subtask
3. Passes each subagent everything it needs (context, instructions, constraints) in one complete briefing
4. Waits for each subagent to return its result
5. Reviews, combines, and refines the results into the final output

The orchestrator is a coordinator, not a worker. Its job is thinking clearly about structure and quality, not doing the execution itself.
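The orchestrator's loop described above can be sketched as ordinary Python. This is a minimal illustration under stated assumptions, not the Task tool itself: `run_subagent`, `write_briefings`, and `orchestrate` are hypothetical names, and `run_subagent` is a stub that returns a placeholder string where a real subagent would do the work.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(briefing: str) -> str:
    """Hypothetical stand-in for launching a subagent; NOT the real Task tool."""
    first_line = briefing.splitlines()[0]
    return f"[result for: {first_line}]"

def write_briefings(goal: str, subtasks: list[str]) -> list[str]:
    """Write one self-contained briefing per subtask, each repeating the goal."""
    return [f"Subtask: {subtask}\nOverall goal: {goal}" for subtask in subtasks]

def orchestrate(goal: str, subtasks: list[str]) -> str:
    briefings = write_briefings(goal, subtasks)
    # Launch one worker per briefing and wait until every result is back.
    with ThreadPoolExecutor(max_workers=len(briefings)) as pool:
        results = list(pool.map(run_subagent, briefings))
    # Combine the results; a real orchestrator would also review and refine them.
    return "\n".join(results)

report = orchestrate(
    "Describe our three new products",
    ["mug", "serving bowl", "side plates"],
)
print(report)
```

Note that the orchestrator function contains no "doing" logic of its own. It only splits, briefs, waits, and assembles, which mirrors the coordinator role.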

What a subagent receives

Each subagent starts as a completely blank slate. It has no memory of the parent session, no knowledge of what other subagents are doing, and no view of any files or context the orchestrator has already seen unless the briefing includes them or says where to find them. This is the most important rule in multi-agent work:

Subagents know only what you tell them. If it is not in the briefing, it does not exist for them.

This means a well-written subagent instruction must be entirely self-contained — it should include the background, the goal, the constraints, the format of the output expected, and any relevant content the subagent will need to work with.

Here is an example of a well-briefed subagent instruction, the kind the orchestrator would send:

You are working on a project for Thandi's Ceramics, a handmade ceramic
homewares business based in Cape Town. The brand tone is warm and artisan —
never corporate. All copy uses South African English.

Your task: Write three product descriptions for the items listed below.
Each description should be 40–60 words, written in the second person
("You'll love..."), and end with a gentle call to action.

Products to describe:
1. Speckled terracotta mug, 350ml, R285
2. Sage green serving bowl, 28cm diameter, R520
3. Set of four side plates in off-white, R680

Return only the three descriptions, clearly labelled by product name.
Do not include any preamble or sign-off.

Notice what this briefing includes: who the client is, what the tone is, what language rules apply, the exact task, the format requirements, the word count, and the raw material to work with. A subagent receiving this briefing has everything it needs and nothing is left to assumption.
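One way to make the "nothing left to assumption" rule checkable is to treat a briefing as a small structure with required sections. The sketch below is an assumption for illustration: the `Briefing` class and its field names are invented here, chosen to match the five ingredients listed earlier (background, task, constraints, output format, raw material).

```python
from dataclasses import dataclass, fields

@dataclass
class Briefing:
    background: str     # who the client is, tone, language rules
    task: str           # the exact job to do
    constraints: str    # word counts, style rules
    output_format: str  # what to return and how to label it
    material: str       # the raw content to work with

    def missing(self) -> list[str]:
        """Names of any sections that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Join the sections into one block of text, refusing if any is blank."""
        gaps = self.missing()
        if gaps:
            raise ValueError(f"Briefing is not self-contained, missing: {gaps}")
        return "\n\n".join(getattr(self, f.name).strip() for f in fields(self))

briefing = Briefing(
    background="Thandi's Ceramics, Cape Town. Warm, artisan tone. South African English.",
    task="Write three product descriptions for the items listed below.",
    constraints="40-60 words each, second person, gentle call to action.",
    output_format="Only the three descriptions, labelled by product name.",
    material="1. Speckled terracotta mug, 350ml, R285",
)
print(briefing.render())
```

A blank section raises an error before anything is sent, which is cheaper than discovering the gap in a subagent's output.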

Real workflow example: research, write, proofread

Here is how a three-agent pipeline might handle the production of a competitive analysis report.

YOUR INSTRUCTION TO THE ORCHESTRATOR
─────────────────────────────────────
"Produce a two-page competitive analysis for Thandi's Ceramics,
comparing us to three local competitors. Research each competitor,
write the report, and proofread it before returning the final version."


ORCHESTRATOR
─────────────────────────────────────
Breaks the task into three sequential subagent jobs:


SUBAGENT 1 — Researcher
─────────────────────────────────────
Task: Gather key facts about Competitors A, B, and C —
      pricing, product range, tone, online presence.
Input: Competitor names and website URLs.
Output: A structured set of notes, one section per competitor.


        ↓ Research notes passed to next subagent


SUBAGENT 2 — Writer
─────────────────────────────────────
Task: Write the two-page report using the research notes provided.
Input: Research notes from Subagent 1 + brand brief.
Output: A complete draft report in the required format.


        ↓ Draft passed to next subagent


SUBAGENT 3 — Proofreader
─────────────────────────────────────
Task: Proofread the draft for errors, inconsistencies,
      and tone — flag or correct any issues found.
Input: Draft report from Subagent 2 + brand tone guidelines.
Output: Final, corrected report ready to send.


        ↓ Final report returned to orchestrator


ORCHESTRATOR
─────────────────────────────────────
Reviews the final output, confirms it meets the original brief,
and returns it to you.

Each subagent does one job well. No single Claude is trying to hold research notes, a half-written report, and proofreading criteria in mind simultaneously, a situation that, in a very long single session, can lead to errors and inconsistencies. Note that the stages here run in sequence, so the benefit in this example is focus rather than speed: separation of concerns tends to produce cleaner output.
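The handover structure of this pipeline can be sketched as a chain of functions, each taking the previous stage's output as its only input. The three stage functions below are hypothetical stubs that return placeholder strings; in a real run each would be a subagent receiving a full briefing.

```python
from typing import Callable

Stage = Callable[[str], str]

def researcher(competitors: str) -> str:
    """Stub: a real researcher subagent would return structured notes."""
    return f"NOTES on {competitors}"

def writer(notes: str) -> str:
    """Stub: a real writer subagent would return a draft report."""
    return f"DRAFT based on ({notes})"

def proofreader(draft: str) -> str:
    """Stub: a real proofreader subagent would return the corrected report."""
    return f"FINAL checked ({draft})"

def run_pipeline(first_input: str, stages: list[Stage]) -> str:
    """Feed each stage's output into the next stage, in order."""
    current = first_input
    for stage in stages:
        current = stage(current)
    return current

final = run_pipeline("Competitors A, B, C", [researcher, writer, proofreader])
print(final)
# prints: FINAL checked (DRAFT based on (NOTES on Competitors A, B, C))
```

Each stage sees only what the previous stage handed over, which is why the quality of every handover matters so much.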

The cost implications

Every Claude session — whether it is an orchestrator or a subagent — consumes tokens (the unit used to measure text processed, roughly three-quarters of a word). In a single-agent session, you pay for one conversation. In a multi-agent session, you pay for each agent’s conversation separately.

Here is a simple example:

Approach	Tokens used (approximate)	Relative cost
Single agent handles research, writing, and proofreading	12 000 tokens	1×
Orchestrator + 3 subagents handling the same task	18 000–22 000 tokens	about 1.5–1.8×

The higher cost comes from two sources: the orchestrator’s own work (analysing the task, writing briefings, reviewing results), and the fact that each subagent must receive full context in its briefing — context that would only appear once in a single-agent session.

For small or straightforward tasks, this overhead is not worth it. For large, complex, or time-sensitive tasks, the speed gain and quality improvement can justify the additional cost many times over.
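The relative cost in the table is simply the multi-agent token count divided by the single-agent count. The short sketch below reproduces that arithmetic using the approximate figures from the table; the second function illustrates the repeated-context overhead, and its example input of 1 500 tokens is a made-up number for illustration only.

```python
SINGLE_AGENT_TOKENS = 12_000  # approximate figure from the table above

def relative_cost(multi_agent_tokens: int, baseline: int = SINGLE_AGENT_TOKENS) -> float:
    """How many times more tokens the multi-agent run uses than the baseline."""
    return multi_agent_tokens / baseline

def repeated_context_overhead(context_tokens: int, subagents: int) -> int:
    """Extra tokens spent because shared context is sent to every subagent
    instead of appearing once, as it would in a single session."""
    return context_tokens * (subagents - 1)

low = relative_cost(18_000)
high = relative_cost(22_000)
print(f"{low:.2f}x to {high:.2f}x")  # prints: 1.50x to 1.83x

# Hypothetical: a 1 500-token brand brief sent to 3 subagents.
print(repeated_context_overhead(1_500, 3))  # prints: 3000
```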

When multi-agent is worth it — and when it is overkill

Situation	Best approach	Why
Writing a single short document	Single agent	Overhead of orchestration exceeds any benefit
Researching and writing a long report	Multi-agent	Research and writing can be separated cleanly
Editing a document you have already written	Single agent	One focused task — no parallelism needed
Processing a large batch of similar items (e.g. 50 product descriptions)	Multi-agent	Batches split naturally; parallel processing saves significant time
A task with steps that depend strictly on each other	Single agent	Sequential dependency removes the parallelism benefit
A task where steps can run simultaneously	Multi-agent	Parallel execution is the core advantage
You need the work done quickly and cost is secondary	Multi-agent	Speed is where multi-agent excels
You want the lowest possible token cost	Single agent	Fewer agents means fewer tokens
Quality control matters and you want a dedicated review step	Multi-agent	A separate proofreader subagent adds a genuine second pass

The clearest signal that multi-agent is appropriate: can the task be broken into distinct pieces where each piece does not need to know what the others are doing while it works? If yes, it is a strong candidate for delegation.
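Batch work is the simplest case of independent pieces. As a rough sketch, the 50 product descriptions from the table could be split into equal batches, one per subagent. The function name and the batch size of 10 are illustrative choices, not a recommendation.

```python
def split_into_batches(items: list[str], batch_size: int) -> list[list[str]]:
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

products = [f"Product {n}" for n in range(1, 51)]
batches = split_into_batches(products, batch_size=10)
print(len(batches), [len(b) for b in batches])  # prints: 5 [10, 10, 10, 10, 10]
```

Each batch can then go into its own briefing, because describing products 1 to 10 requires no knowledge of how products 11 to 20 are being described, provided every briefing carries the same tone and format rules.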

How to design a task suitable for delegation

Not every task splits well into subagent work. A task that is suitable for delegation has these characteristics:

It has a clear, bounded output

A good subagent task produces something specific and measurable — a set of product descriptions, a competitor research summary, a proofread draft. Vague tasks (“help with the website”) do not delegate well because there is no clear point at which the subagent knows it is done.

It can be fully briefed without a back-and-forth conversation

Because subagents cannot ask follow-up questions mid-task, everything they need must be in the initial briefing. If a task would normally require several rounds of clarification before the work could begin, it is not ready to delegate — clarify it with the orchestrator first, then delegate.

It does not depend on the real-time output of another task still in progress

If Subagent 2 needs to wait for Subagent 1 to finish before it can begin, they cannot run in parallel — they must run in sequence. This is fine, but it removes the speed advantage of parallelism. True parallel subagents work on independent pieces simultaneously.
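Whether tasks can run in parallel comes down to their dependencies. The sketch below, using a hypothetical `parallel_waves` helper, groups tasks into "waves": everything in a wave can run at the same time, and each wave must finish before the next begins. The task names are examples only.

```python
def parallel_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave can run at the same time."""
    remaining = {task: set(deps) for task, deps in depends_on.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A task is ready once everything it depends on has finished.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("Circular or unknown dependency")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves

pipeline = {"research": set(), "write": {"research"}, "proofread": {"write"}}
batch = {"mug copy": set(), "bowl copy": set(), "plates copy": set()}

print(parallel_waves(pipeline))  # prints: [['research'], ['write'], ['proofread']]
print(parallel_waves(batch))     # prints: [['bowl copy', 'mug copy', 'plates copy']]
```

Three waves of one task each means a purely sequential pipeline with no speed gain. One wave of three tasks means full parallelism.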

The handover points are clean

The output of one subagent becomes the input of the next. If that handover is clear — research notes go to the writer, draft goes to the proofreader — the pipeline runs smoothly. If the output of one stage is ambiguous or incomplete, the next stage inherits those problems.

A useful test: could you write a complete briefing document for this subagent right now, with no further thought, that would allow someone who knows nothing about the project to do the job correctly? If yes, it is ready to delegate. If not, the task needs more definition before it goes to an agent.

Practical Exercise

In this exercise you will design a three-subagent pipeline for a real task and write the briefing for each subagent — without necessarily running it yet. The goal is to build the skill of thinking clearly about task decomposition and delegation.

a. Choose a real piece of work you need to do that involves at least three distinct steps. It might be researching a supplier, writing a proposal, or producing a set of product descriptions. Describe the overall task to Claude in a new session:

I want to design a multi-agent pipeline for the following task:

[describe your task in 2–3 sentences]

Please break this into 2–4 subagent tasks, identify what each subagent needs as input and what it should return as output, and tell me whether any of them can run in parallel or whether they must run in sequence.

Review Claude’s proposed breakdown. Does each piece have a clear, bounded output? Are the handover points between stages clean? Adjust the breakdown if anything feels vague.

b. Ask Claude to write a complete subagent briefing for the first stage — the kind of self-contained instruction a subagent would receive:

Please write the full subagent briefing for Stage 1 of this pipeline.
It should be completely self-contained — include all context, the exact
task, the output format required, and any constraints. Assume the subagent
has no knowledge of this project beyond what you write in the briefing.

Read the briefing critically. Is there anything a subagent would need that is not included? Would a person with no prior knowledge of your project be able to complete the task from this briefing alone?

c. Estimate the cost and time trade-off:

Compare the multi-agent approach we've designed against a single agent
doing the same work from start to finish. Estimate roughly how many more
tokens the multi-agent approach uses, and describe what we gain in return
for that additional cost in this specific case.

Use the response to decide whether the multi-agent approach is justified for this particular task, or whether a single focused session would serve you just as well.

Common problems and how to fix them

A subagent produced output that ignored the brief

This almost always means the briefing was incomplete or ambiguous in one specific area. Read the subagent’s output and identify exactly where it went wrong — what assumption did it make that you did not intend? Then rewrite the briefing to address that gap explicitly. A briefing that anticipates and rules out the wrong interpretation is a stronger briefing.

The subagents are producing inconsistent results with each other

This happens when subagents are not given the same core context — for example, one is told the brand tone and another is not, or one uses a different product name format than the others. Every subagent in a pipeline should receive the same baseline brief (brand guidelines, terminology, format rules) even if their individual tasks are different. Keep this baseline brief as a block of text you paste into every subagent instruction.
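The "paste the same block into every instruction" habit can be made mechanical. In this minimal sketch, `BASELINE_BRIEF` and `build_instruction` are illustrative names, and the baseline text is drawn from the Thandi's Ceramics example used earlier in this lesson.

```python
BASELINE_BRIEF = (
    "Client: Thandi's Ceramics, a handmade ceramic homewares business in Cape Town.\n"
    "Tone: warm and artisan, never corporate.\n"
    "Language: South African English."
)

def build_instruction(task: str, baseline: str = BASELINE_BRIEF) -> str:
    """Prefix a task with the shared baseline so every subagent sees the same rules."""
    return f"{baseline}\n\nYour task: {task.strip()}"

tasks = ["Write three product descriptions.", "Proofread the draft report."]
instructions = [build_instruction(task) for task in tasks]

for instruction in instructions:
    print(instruction)
    print("-" * 40)
```

Because the baseline lives in one place, a change to the brand rules reaches every subagent at once instead of drifting between copies.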

The orchestrator is doing the work itself instead of delegating

If Claude is completing the task in one session rather than spawning subagents, it may not have been given clear instruction to use the Task tool and delegate. Be explicit: “Please use the Task tool to delegate this to subagents rather than completing it yourself.” In complex pipelines, describing the intended structure upfront — orchestrator, then Subagent 1, 2, 3 — helps Claude understand the architecture you want.

The cost is much higher than expected

Multi-agent sessions consume tokens quickly, especially if the orchestrator writes long briefings or if subagents are given very large amounts of context to work with. Trim briefings to include only what is genuinely necessary — every word in a subagent briefing is a token. If cost is a concern, consider whether the task genuinely benefits from multiple agents or whether a single well-structured session would serve equally well.

One subagent’s output is good but the next subagent ignores parts of it

When one subagent’s output is passed to the next, the orchestrator should highlight the specific parts the next subagent needs to focus on. Passing an entire research document to a writer without guidance can result in important points being missed. Ask the orchestrator to extract and summarise the key points from each stage before passing them forward, rather than passing raw output wholesale.

What you have learned in this lesson

Multi-agent orchestration means one Claude acts as a manager (orchestrator), delegating specific tasks to separate Claude workers (subagents), rather than doing all the work itself in a single session
Parallel subagents can complete work faster than a single agent working sequentially — the same principle as sending three people on three errands at the same time instead of one person doing all three in a row
The Task tool is what allows an orchestrator to spawn subagents; each subagent starts with a completely blank slate and must receive all necessary context in a single, self-contained briefing
A well-briefed subagent instruction includes the project background, the exact task, the required output format, the word or length constraints, and any raw material the subagent will need — nothing can be assumed
Multi-agent pipelines cost more tokens than single-agent sessions because each agent processes text independently — the trade-off is speed, quality separation, and the ability to handle large or complex tasks
Multi-agent is worth the overhead for large, complex, parallel, or time-sensitive tasks; it is overkill for short, simple, or strictly sequential tasks
A task is ready to delegate when it has a clear bounded output, can be fully briefed without back-and-forth, and its handover points to the next stage are clean and unambiguous
