
Run multiple agents in parallel with GitHub Copilot CLI's `/fleet` command


GitHub Copilot CLI now has a /fleet command. Give it one prompt and an internal orchestrator breaks the work into pieces, then dispatches independent chunks to multiple subagents in parallel. It is a good fit for tasks that split naturally, such as multi-file refactors or bulk documentation generation.

Copilot CLI reached GA only recently, in February 2026, and it already ships built-in agents like Explore, Task, Code Review, and Plan, plus the & key for cloud delegation. /fleet is the next step: a command built specifically for parallel execution.

The Orchestrator’s Five Steps

When you run /fleet, the main agent acts as an orchestrator and performs the following steps:

graph TD
    A[Receive prompt] --> B[Split into tasks]
    B --> C[Analyze dependencies]
    C --> D[Dispatch independent tasks<br/>in parallel]
    D --> E[Poll for completion]
    E --> F{Any downstream tasks?}
    F -->|Yes| G[Dispatch the next wave]
    G --> E
    F -->|No| H[Verify and integrate results]

  1. Split the prompt into discrete work items
  2. Decide which items can run in parallel
  3. Dispatch independent items as background subagents
  4. Poll for completion and release dependent tasks in later waves
  5. Verify all outputs and integrate the final result

The key idea is the “wave.” Tasks with dependencies wait for upstream work to finish before the next wave starts. This is not flat parallelism; it is DAG-style scheduling.

How to Use It

The basic syntax is simple: just write what you want after /fleet.

/fleet Refactor the auth module, update tests, and fix the related docs in docs/auth/

For non-interactive shells, use:

copilot -p "/fleet <YOUR TASK>" --no-ask-user

Prompting Determines Parallelism

/fleet only works well when the prompt is written in a way that can be split. If the instruction is vague, the orchestrator cannot break it apart and will fall back to a more sequential flow.

An example of a vague prompt:

/fleet Rewrite all of the documentation

A more explicit prompt is much easier to split:

/fleet Create API docs for these four files:
- docs/authentication.md: explain the OAuth2 flow
- docs/endpoints.md: list every endpoint
- docs/errors.md: provide the error codes and fixes
- docs/index.md: link to the three files above and summarize them (generate this only after the first three are done)

In the second case, the orchestrator can generate authentication.md, endpoints.md, and errors.md in the first wave, then create index.md in the second wave after the dependencies are ready. Declaring dependencies explicitly is what lets the scheduler do the right thing.
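For scripted runs, the same explicit prompt can be handed to the non-interactive form shown earlier. A sketch using a heredoc to keep the multi-line task list readable (the file targets are the example above; the actual `copilot` call is commented out since it needs an authenticated CLI):

```shell
# Build the multi-line /fleet prompt in a variable, then pass it to
# copilot's non-interactive mode.
FLEET_PROMPT=$(cat <<'EOF'
/fleet Create API docs for these four files:
- docs/authentication.md: explain the OAuth2 flow
- docs/endpoints.md: list every endpoint
- docs/errors.md: provide the error codes and fixes
- docs/index.md: link to the three files above and summarize them (generate this only after the first three are done)
EOF
)
echo "$FLEET_PROMPT"
# Actual invocation (requires Copilot CLI):
#   copilot -p "$FLEET_PROMPT" --no-ask-user
```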

The official blog says prompts should include the following:

| Element | Effect |
| --- | --- |
| File/module boundaries | Makes each subagent’s scope clear |
| Constraints, such as “do not touch X” | Prevents unintended changes |
| Validation criteria, such as linting, type checks, and tests | Lets subagents enforce quality on their own |
| Dependency declarations | Lets the orchestrator schedule waves correctly |

The File-Conflict Problem

/fleet has one serious limitation: there is no file locking between subagents. If multiple subagents write to the same file, the last one wins, silently. No error. No warning.

There are two ways to deal with that.

One is to assign clearly separate files to each subagent and say so in the prompt. Another is to have each subagent write to a temporary path and let the orchestrator merge everything at the end. The second option is possible, but the prompt design becomes much more complicated.
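The second strategy can be sketched in plain shell. The staging layout here is hypothetical; in a real run, each staging path would be assigned to a subagent in the /fleet prompt, with the orchestrator instructed to do the final merge:

```shell
# Hypothetical staging layout: each subagent writes only under its own
# staging dir, and the merge step copies everything into the docs tree.
STAGE=$(mktemp -d)   # per-subagent staging area
OUT=$(mktemp -d)     # stand-in for the real docs/ directory
mkdir -p "$STAGE/agent-auth" "$STAGE/agent-endpoints"

# What two subagents would each produce in isolation:
echo "OAuth2 flow"      > "$STAGE/agent-auth/authentication.md"
echo "Endpoint listing" > "$STAGE/agent-endpoints/endpoints.md"

# Merge step: every file has exactly one owner, so copies cannot collide.
cp "$STAGE"/agent-*/*.md "$OUT"/
ls "$OUT"
```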

In Claude Code’s multi-agent review, the parallel agents leave their output as separate comments, so file conflicts do not arise structurally. /fleet actually edits files, so this problem becomes very real.

Context Sharing Limits

There is another important limitation: subagents do not inherit the orchestrator’s chat history. Anything you discussed with Copilot CLI before /fleet - for example, “this project uses React 19” or “tests are written with Vitest” - will not be visible to the subagents.

That means the /fleet prompt must be self-contained. Put all required context in the prompt itself, or point the agents at the files they should read.
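A self-contained prompt simply restates that context inline. React 19 and Vitest are the examples from above; the paths are illustrative, and the real `copilot` call is commented out:

```shell
# Repeat project context inside the prompt itself, since subagents
# never see the earlier chat history. Paths here are illustrative.
CONTEXT="This project uses React 19; tests are written with Vitest."
TASK="Refactor src/auth/ and update the matching tests in tests/auth/."
echo "/fleet $CONTEXT $TASK"
# Actual invocation (requires Copilot CLI):
#   copilot -p "/fleet $CONTEXT $TASK" --no-ask-user
```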

Working With Custom Agents

You can call custom agents defined under .github/agents/ from /fleet by using @agent-name syntax.

/fleet @technical-writer.md to update docs under docs/,
and @test-writer.md to add tests under tests/

By default, subagents use a low-cost model. Custom agent profiles can specify a model directly, which matters for cost. Each subagent consumes premium requests, so parallelism can burn through your quota quickly.

At GA time, Copilot Free had 50 premium requests per month, Pro had 300, and Pro+ had 1,500, with overage billed at $0.04 per request. Model multipliers also apply, and Claude Opus-class models cost 3x. If you run five /fleet subagents on Opus, one run can easily burn more than 15 requests.
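The arithmetic from the paragraph above, as a quick sanity check:

```shell
# 5 subagents on an Opus-class model (3x multiplier), per the figures above.
SUBAGENTS=5
MULTIPLIER=3
PER_RUN=$((SUBAGENTS * MULTIPLIER))
echo "premium requests per run: $PER_RUN"               # 15
echo "runs per month on Pro's 300: $((300 / PER_RUN))"  # 20
```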

Check Progress With /tasks

While /fleet is running, the /tasks command shows the list and status of background subagents. It is worth checking the orchestrator’s plan before execution to make sure the work has actually been split into the intended tracks.

What It Is Good For

/fleet shines on tasks with natural parallelism.

Good fits:

  • Refactors that span multiple files
  • Bulk documentation generation by component
  • Feature work where API, UI, and tests can be written independently
  • Multiple unrelated bug fixes

Bad fits:

  • Changes confined to one file
  • Strictly linear work where the next step depends on the previous step’s result

For the second category, a regular Copilot CLI prompt is usually simpler and faster.

Comparison With Other Agent Modes

Copilot CLI is not the only terminal-based AI coding tool with parallel agents. Compared with Claude Code’s Agent Teams and Cord (the Claws concept), the context-sharing and file-isolation designs are clearly different.

| Item | Copilot CLI /fleet | Claude Code Agent Teams | Cord (Spawn/Fork) |
| --- | --- | --- | --- |
| Context inheritance | None (prompt only) | Auto-loads CLAUDE.md | Spawn: only dependency outputs / Fork: all sibling results |
| Agent-to-agent communication | None | JSON inbox | SQLite dependency graph |
| File conflict prevention | None (silent overwrite) | OS process separation | MCP scope control |
| Orchestration | AI auto-splits and runs waves | User defines it explicitly | Agents design it at runtime |
| Persistence | None (session-only) | .claude/ filesystem | SQLite |
| Visibility into state | /tasks lists everything | Parent receives only final output | SQL queries are possible |
| Required plan | Copilot Free or higher (premium-request based) | Claude Pro or API | None (OSS + any API) |

The Difference in Context Sharing

The biggest difference in multi-agent execution is how much context reaches the subagents.

/fleet does not pass the chat history to subagents at all. Only the prompt text reaches them. Project conventions such as “use Vitest” or “prefer relative imports” must be repeated in the prompt or baked into custom agents under .github/agents/.

Claude Code’s Agent Teams solves part of that by auto-loading CLAUDE.md at startup, so project rules are shared implicitly. Since v2.1.19, it also supports disk-based JSON inboxes (~/.claude/<teamName>/inboxes/<agentName>.json) for agent-to-agent messaging. Agents can pull tasks from a shared list and send results directly to each other. Because it uses the filesystem as the IPC layer, messages do not disappear if an agent crashes.

graph LR
    E[CLAUDE.md] -.->|Auto load| F[Agent 1]
    E -.->|Auto load| G[Agent 2]
    F <-->|JSON inbox| G
    H[Shared task list] --- F
    H --- G
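The persistence point is easy to see on disk. In the sketch below, only the `~/.claude/<teamName>/inboxes/<agentName>.json` path shape comes from the article; the team and agent names and the JSON payload are purely illustrative (a temp directory stands in for `~/.claude`):

```shell
# Simulate the disk-based inbox layout described above. Because messages
# are plain files, they survive an agent crash.
TEAM_ROOT=$(mktemp -d)          # stand-in for ~/.claude/<teamName>
mkdir -p "$TEAM_ROOT/inboxes"
echo '{"from":"agent-1","body":"tests are green"}' \
  > "$TEAM_ROOT/inboxes/agent-2.json"
cat "$TEAM_ROOT/inboxes/agent-2.json"
```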

Cord’s Spawn/Fork model is even more granular.

| Mode | Behavior |
| --- | --- |
| Spawn | Child tasks receive only the output of their declared dependencies. Good for independent parallel work |
| Fork | Child tasks inherit the results of all sibling tasks. Good for aggregation and decision-making |

The MCP server enforces scope, so child tasks are structurally blocked from accessing resources outside their assigned boundary. All tasks, dependencies, and outputs live in SQLite, which means you can resume after a crash.

File Isolation Approaches

Alongside context sharing, the other big issue is file conflicts when multiple agents edit the same repo.

/fleet has no file-locking mechanism, so simultaneous writes to the same file can be overwritten silently. As noted above, the only real mitigation is to separate file ownership in the prompt.

Claude Code introduced VM separation in Cowork using Apple’s Virtualization.framework. Each agent runs inside its own VM and talks to the host over vsock (Virtio Socket), so file-level conflicts cannot happen by design. The tradeoff is cost: each VM consumes 10GB or more of disk.

Cord uses MCP scope control to limit which resources an agent can touch, which structurally blocks unintended writes.

| Tool | Isolation model | File conflict risk | Cost |
| --- | --- | --- | --- |
| /fleet | Process separation only | High (silent overwrite) | Low |
| Agent Teams | OS process separation | Medium (depends on design) | Low |
| Cowork VM | VM separation (vsock) | None | High (10GB+/VM) |
| Cord | MCP scope control | Low (structurally blocked) | Low |

Plan Requirements and Cost

Beyond the feature comparison, what you can actually use depends on your plan.

/fleet is available on all Copilot plans, including Free, but every subagent burns premium requests. Free’s 50 requests per month is nowhere near enough for serious parallel work, and even Pro’s 300 can disappear quickly depending on the model. Pro+ ($39/month, 1,500 requests) or overage billing ($0.04/request) is the realistic range.

Agent Teams works wherever Claude Code is available. You need Pro ($20/month) or API access; Free does not include it. Pro has usage limits, and parallel agents multiply token consumption, so in practice Max ($100-$200/month) or direct API usage is often the better fit.

Cord is open source, so there is no plan gate. The only variable cost is LLM API usage.

All three tools claim to support multi-agent parallelism, but none of them are really practical on a free or low-cost plan. Parallelism scales request and token usage linearly, so treat these features as mid-tier or higher.

The Design Tradeoff

/fleet decides how to split work and run waves from a single prompt. It is the easiest to use because the user does not need to think about agents explicitly. The price is that the user has to manage context loss and file conflicts.

Agent Teams uses CLAUDE.md plus inter-agent messaging to create more structured collaboration. It takes more setup, but it is less likely to miss project conventions. It is especially good for workflows where each agent’s output is separated as an independent comment, like parallel PR review.

Cord gives you the finest-grained control over context. Spawn and Fork let you tune how much context each task receives, and MCP-based permissions add a security layer. The downside is that it is still a small Python + SQLite proof of concept, not a production product yet.

Which one is right depends on the task and the plan. For a simple “generate these four docs in parallel” job, /fleet is enough. If your project has complex conventions and needs coordination, Agent Teams fits better. If you need strong security isolation, Cord’s scope model or Claude Code’s VM separation are better options. In any case, check the real cost of running multiple subagents before you rely on it.

Codex CLI, Gemini CLI, and Open-LLM Tooling

So far this comparison has covered Copilot CLI, Claude Code, and Cord, but there are other terminal-based AI coding tools too. Here is where Codex CLI, Gemini CLI, and open-LLM tooling stand on parallel multi-agent execution.

Codex CLI: Subagents Separated by git Worktree

OpenAI’s Codex CLI has a Subagents feature. If you tell it in natural language to “spawn agents for each point,” multiple subagents start in parallel. Unlike /fleet, which auto-splits from a prompt, Codex CLI expects the user to explicitly ask for parallelism.

Its biggest distinction is file isolation. Each Codex CLI subagent runs in a separate git worktree, so multiple agents cannot write to the same files in the same checkout. That is a lot safer than /fleet’s silent overwrite model.

graph TD
    A[User instruction] --> B[Main agent]
    B --> C[Sub 1<br/>worktree A]
    B --> D[Sub 2<br/>worktree B]
    B --> E[Sub 3<br/>worktree C]
    C --> F[Merge results]
    D --> F
    E --> F

For management, /agent lets you list, switch, and stop active agent threads. codex exec also supports non-interactive mode for CI/CD. Codex CLI can even expose itself as an MCP server so you can orchestrate multiple agents from the OpenAI Agents SDK.

That said, Codex CLI does not auto-generate task decomposition the way /fleet does. /fleet can infer a DAG and run waves automatically, while Codex CLI leaves the parallelization design to the user or to the SDK code. /fleet is the easy one; Codex CLI is the safer one.
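The worktree isolation Codex CLI relies on can be reproduced by hand to see why writes cannot collide (the repo, file, and branch names here are arbitrary):

```shell
# Each agent gets its own checkout via git worktree, so edits to the
# "same" file land in different directories instead of overwriting.
REPO=$(mktemp -d) && cd "$REPO"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
git worktree add -q -b agent-a ../agent-a
git worktree add -q -b agent-b ../agent-b
echo "change from agent A" > ../agent-a/notes.txt
echo "change from agent B" > ../agent-b/notes.txt
cat ../agent-a/notes.txt ../agent-b/notes.txt
```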

Gemini CLI: The Core Tool Does Not Parallelize Yet

Gemini CLI itself does not have a multi-agent parallel execution feature. It does have subagents, but tool calls are still sequential, so they cannot run in parallel yet. GitHub issues exist for parallel subagent execution (#17749, #14963), but file-conflict handling is still under development. There is also a feature request for a Claude Code Agent Teams-style capability (#19430).

The ecosystem around it is trying to fill the gap.

| Extension | Summary |
| --- | --- |
| Conductor (Google official) | A reference implementation for Context-Driven Development. It offers 2-agent parallelism and uses Write-Lock to avoid file conflicts, but full-fledged multi-agent parallelism is still at the proposal stage (Issue #66). |
| Maestro (community) | An orchestration setup where a TechLead agent coordinates 22 specialized subagents |

Gemini CLI’s biggest strength is that it gives free access to Gemini 2.5 Pro with a 1M-token context window. Cost-wise, it is much easier to start with than other tools. But until parallel agents land in the core product, it is hard to compare it directly with /fleet or Agent Teams.

Open-LLM Tooling: No Native Parallel Orchestration Yet

Qwen Code, the official Alibaba tool with 21.6k GitHub stars, is a Claude Code-like terminal agent with built-in SubAgents and Skills. But it does not have fleet-style parallel orchestration. Aider, with 42.7k stars, is similar: it has a two-stage architect/code workflow, but parallel agent execution is still just a feature request on GitHub.

At the moment, I could not find a CLI agent in the open-LLM world that natively includes something like /fleet. To get parallel execution, you have to rely on an external orchestrator.

| Tool | Stars | Summary |
| --- | --- | --- |
| Emdash | 3.6k | A desktop app that runs 23 CLI agents, including Claude Code, Qwen Code, and Codex, in isolated git worktrees |
| AgentsMesh | 1.2k | A platform for running and coordinating multiple CLI agents on remote workstations |
| mco | 263 | A lightweight orchestration layer that coordinates any CLI agent independently of the IDE |

All of these are wrappers around existing CLI agents rather than agents themselves. The tooling around parallelism has not caught up with the quality of the models.

Full Comparison

Putting Copilot CLI, Claude Code, Cord, Codex CLI, and Gemini CLI side by side gives a clearer picture.

| Item | Copilot CLI /fleet | Claude Code Agent Teams | Cord | Codex CLI Subagents | Gemini CLI |
| --- | --- | --- | --- | --- | --- |
| Automatic task splitting | Yes (DAG waves) | No (user-defined) | Agent designs it | No (explicit instruction) | Not yet |
| File isolation | None (silent overwrite) | OS processes | MCP scope | git worktree | Not yet |
| Context inheritance | Prompt only | Auto-loads CLAUDE.md | Dependency graph | Prompt only | — |
| Agent-to-agent communication | None | JSON inbox | SQLite | None | — |
| State visibility | /tasks | Parent gets final output | SQL queries | /agent | — |
| Cost | Premium-request based | Pro or API | OSS + API pricing | API pricing | Free (Gemini 2.5 Pro) |

From a file-conflict perspective, Codex CLI’s git worktree isolation is the cleanest implementation today. For automatic task splitting, /fleet is still the only fully automatic one. Gemini CLI is cheapest, but it does not yet have native parallelism.

The bottom line is that multi-agent parallel execution is still a mid-tier or premium feature across the board.