Inline vs Team — Decision Criteria

When a lead session opens with a task, the first decision is whether to do it inline as lead or spawn a team. Today the call is implicit; this doc makes it explicit. Source: roadmap-ideas.md §2, plus empirical pattern data from 2026-06-session-manual-interventions.md.

⚠️ NOT YET WIRED INTO ANY SKILL — TODO

No skill loads this file or instructs the lead to read it at session start. It's a forward-looking design doc that the upcoming domain entrypoint skills (roadmap-ideas.md §1) should reference. Until those skills land, the criteria here are guidance for humans reading this directly, not behavior the harness enforces on the lead at runtime.

When the integration work happens, every domain entrypoint skill (/integrate-storefront, future /pixel, /merchandising, etc.) should begin with "Read inline-vs-team-criteria.md and decide inline-vs-team before continuing."

TL;DR

Spawn a team if ANY of these are true:

Multiple independent workstreams (parallelizable tracks)
Production code change that needs a reviewer dyad (not a docs/config tweak)
Multiple concurrent worktrees needed (git isolation)
Expected work exceeds one session's context budget
Cross-PR or multi-session ownership needed

Otherwise do it inline.
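
The ANY-of rule above can be sketched as a small predicate. This is a minimal illustration, not something the harness implements; `TaskSignals`, its field names, and `decide` are hypothetical labels for the five criteria.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TaskSignals:
    """Hypothetical flags, one per TL;DR criterion."""
    parallel_tracks: bool = False          # multiple independent workstreams
    production_code: bool = False          # change needs a reviewer dyad
    concurrent_worktrees: bool = False     # git isolation required
    exceeds_context_budget: bool = False   # more than one session's context
    multi_session_ownership: bool = False  # cross-PR or multi-session ownership


def decide(signals: TaskSignals) -> str:
    """Return "team" if ANY criterion holds, otherwise "inline"."""
    any_true = any(getattr(signals, f.name) for f in fields(signals))
    return "team" if any_true else "inline"


# A settings push with one stream: no criterion fires.
print(decide(TaskSignals()))                      # inline
# A production code change: one criterion is enough.
print(decide(TaskSignals(production_code=True)))  # team
```

The predicate is deliberately an OR over the flags: a single true signal is sufficient, and the absence of all five is what selects inline.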

The signals, in detail

1. Parallelizable tracks

If the task naturally splits into N independent pieces that can run in parallel — e.g. four CSS sections (cards / filters / paging / heading) for a storefront integration, or several agent definitions getting tools: frontmatter updates — that's a team. The wall-clock saved by parallelism is typically larger than the per-spawn overhead.

If there's only one stream, the team's hub-and-spoke routing is pure overhead: every reviewer round adds a SendMessage hop you don't need if you were already going to spawn an Agent subagent inline.

Rule of thumb: ≥ 2 independent tracks → team. 1 track → inline.
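
The wall-clock trade-off behind this rule can be made concrete with a back-of-the-envelope model. The numbers below are illustrative, not measured, and the model assumes the lead pays the spawn overhead once per teammate, serially, before the tracks run in parallel; both function names are hypothetical.

```python
def serial_minutes(track_minutes):
    """One lead works the tracks back to back."""
    return sum(track_minutes)


def team_minutes(track_minutes, per_spawn_overhead):
    """Tracks run in parallel, so the slowest track bounds the wall clock.

    Assumption: the lead pays per_spawn_overhead once per teammate,
    one after another, before the parallel work counts.
    """
    return max(track_minutes) + per_spawn_overhead * len(track_minutes)


# Illustrative only: four sections at 30 minutes each, 5 minutes per spawn.
tracks = [30, 30, 30, 30]
print(serial_minutes(tracks))   # 120
print(team_minutes(tracks, 5))  # 50

# With a single track the overhead is a pure loss.
print(serial_minutes([30]))     # 30
print(team_minutes([30], 5))    # 35
```

Under these assumptions the team wins as soon as there are two comparable tracks, and loses by exactly the spawn overhead when there is only one.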

2. Reviewer-dyad need

The whole point of the implement-verify dyad is the reviewer-as-independent-observer property: a fresh-context agent reads the diff with no anchoring on the implementer's reasoning. That property is what the canonical inline-iterate loop encodes (Pattern B in iteration-patterns.md).

If the change is production code — anything that could ship a bug — you want the dyad. The lead CANNOT review their own work (see team-lead-protocol.md CANNOTs); the only way to honor that for a real code change is to spawn an implementer teammate + a reviewer subagent at minimum.

For pure docs reorgs, settings tweaks, or research artifacts, no dyad needed: do it inline.

Rule of thumb: Production code change → team. Docs/config/research → inline.

3. Git isolation

If you'll have multiple branches checked out concurrently (multiple implementers each working a different PR in parallel), you need separate worktrees, which means a team with a self-provisioned worktree per teammate. See followups.md #2: the harness's isolation: "worktree" setting doesn't actually isolate, which is why spawn prompts carry the self-provisioned directive.

If only one branch is in flight, your single lead worktree is enough.

Rule of thumb: Multiple concurrent branches → team. One branch → inline.
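
As a rough sketch of what self-provisioning one worktree per branch amounts to at the git level, the helper below builds the `git worktree add` commands without running them. The wording of the actual spawn-prompt directive is not shown in this doc; the function name, the sibling-directory layout (`<repo>-wt/<branch>`), and the example paths are assumptions for illustration.

```python
from pathlib import PurePosixPath


def worktree_commands(repo_root, branches):
    """Build one `git worktree add` command per branch (not executed here).

    Assumption: each worktree lives in a sibling directory named
    <repo>-wt/<branch>, and each branch is created fresh with -b.
    """
    root = PurePosixPath(repo_root)
    base = root.parent / f"{root.name}-wt"
    return [
        ["git", "-C", str(root), "worktree", "add", "-b", branch, str(base / branch)]
        for branch in branches
    ]


# Hypothetical repo path and branch names.
for cmd in worktree_commands("/repos/shop", ["pr3", "pr5"]):
    print(" ".join(cmd))
```

Each teammate then works only inside its own directory, so concurrent checkouts never collide in the lead's worktree.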

4. Context budget

Long serial work that exceeds one session's context budget should be a team even if the work is single-stream — the teammate's separate context window buys you length. The collation/synthesis cost falls on the lead at the end, but the lead context stays clean during the implementation work.

Empirical heuristic: if the task involves reading more than ~5 large files or running an iterate-loop with > 3 review rounds expected, lean team.

Rule of thumb: > ~5 large file reads or > 3 expected review rounds → team. Otherwise inline.
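
The heuristic reduces to two threshold checks. A minimal sketch follows, treating the approximate "~5" as a hard 5 for simplicity; the function and constant names are hypothetical.

```python
MAX_LARGE_FILE_READS = 5  # "~5 large files" in the heuristic, treated as exact here
MAX_REVIEW_ROUNDS = 3     # "> 3 review rounds" in the heuristic


def leans_team_on_context(large_file_reads: int, expected_review_rounds: int) -> bool:
    """True when either estimate exceeds its threshold."""
    return (
        large_file_reads > MAX_LARGE_FILE_READS
        or expected_review_rounds > MAX_REVIEW_ROUNDS
    )


print(leans_team_on_context(6, 1))  # True: too many large reads
print(leans_team_on_context(5, 3))  # False: both at, not over, the thresholds
print(leans_team_on_context(2, 4))  # True: too many review rounds
```

Both inputs are estimates made before the work starts, so a result near the boundary is a hint rather than a verdict.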

5. Cross-PR / multi-session ownership

If the same set of decisions will be referenced in follow-up PRs, the team's escalations log, spec file, and reviewer identity (Pattern D persistent reviewer) together form the right durable substrate. A lead-only inline run leaves no artifacts behind beyond the PR description.

Rule of thumb: Will the next session need to read this team's escalations or rerun the reviewer? → team. One-and-done? → inline.

The trivial-task carve-out

Conversational tasks, single-file edits, settings curls (with human authorization), single-PR docs updates, one-off research — always inline. Spawning a team to flip a feature flag is overhead for its own sake.

Worked examples (from June 2026 sessions)

Task	Decision	Reason
Push MSQC's 13-facet `filter_config` update	Inline (lead curl + Opus QA subagent)	Settings push; 1 stream; reviewer subagent without team_name was enough
Orchestration uplift series (PR3 + PR5 + PR6 + PR3.5)	Team (4 implementer teammates)	Production code; 3 parallel tracks; needed git isolation
Diagnose stuck bulk-sync job for MSQC	Inline (lead curls + Shopify GraphQL)	Read-only; single stream; one-shot
Re-shape `docs/dev/orchestration/` directory (THIS PR)	Inline (lead, with human-approved layout)	Pure docs reorg; no production code; one PR
MSQC integration (initial CSS work)	Team (4 css-implementer teammates, one per section)	4 parallel tracks for the canonical 4 sections
Verify post-deploy that `productTagNs<key>` populated	Inline (lead `curl` to search proxy)	Read-only verification; one query

The B→D mode-switch case

The decision isn't always made once. If you start inline and the work turns out to need a persistent reviewer across several PR cycles, promote the ad-hoc reviewer subagent to a Pattern D persistent teammate (iteration-patterns.md §B→D). Don't pre-classify; let the work tell you when it crosses a threshold.

When in doubt

Default to inline if the criteria are mixed. Team spawning has real overhead — worktree setup, escalations file creation, hub-and-spoke message routing — that's only worth paying when you have at least one strong criterion above. Inline + a one-shot reviewer subagent (no team_name) is almost always lighter and reaches the same review-quality bar for moderate work.
