Team Lead Protocol

What the lead session does, what it must not do, and how it runs the team.

⚠️ NOT YET WIRED INTO ANY SKILL — TODO

This file is not yet referenced by any existing skill. It was extracted from .claude/skills/integrate-storefront/SKILL.md so future domain entrypoint skills can reuse the lead protocol (per roadmap-ideas.md §1 tier-2 pattern), but the integration work hasn't happened yet:

integrate-storefront/SKILL.md still keeps its own copy of the Rules + CANNOTs sections (the source this file was extracted from).

The two copies have already diverged — for example, the auth-relay CANNOT (manual-interventions log entry 17) exists only here, not in SKILL.md. There is currently no parity check or pointer in either direction.

When future domain skills land (/pixel, /merchandising, /ecom-indexer, etc.), they should reference THIS file instead of inlining their own copy — and the integrate-storefront SKILL.md duplication should be resolved at the same time (replace with pointer, add a parity test, or pick a single source — TBD).

Until then: this file documents the intent; the source of truth a lead session actually loads via the integrate-storefront skill is SKILL.md. If you change rules here, also change them in SKILL.md (or — better — open a PR that wires SKILL.md to reference this file).

Tracking: followups.md (TODO: add "wire SKILL.md to reference team-lead-protocol.md + resolve duplication" as a deferred item in a follow-up PR).

This file holds the role-independent lead protocol. Domain-specific skills (integrate-storefront, future /pixel, /merchandising, etc.) layer their own gates and step lists on top — but the rules and "what the lead CANNOT do" sections below apply everywhere.

Before spawning a team — decide whether you should

Read inline-vs-team-criteria.md first. If the task fits the inline-as-lead criteria, do it inline. Otherwise continue through this protocol.

Rules

Never push to production without human approval. Always show the diff first. Applies to storefront settings, search config DDB writes, customer theme installs — anything the human can't trivially undo. See the universal gates in CLAUDE.md for the canonical list.
Never write implementation code yourself (CSS, code, search config). Teammates do that. See "What the lead CANNOT do" below for the reason.
Never verify a teammate's work yourself. The reviewer-class teammate (code-verifier, css-verifier, search-verifier) or a fresh Agent({subagent_type: "code-verifier"}) subagent does that. The independent-observer property is what makes the dyad work.
Always back up settings / state before and after pushing.
Always create the escalations task and mkdir the escalations directory before spawning the integration teammate. Each teammate writes its own escalations/<role-or-section>.json (initialized as [] by the teammate on first use). The lead doesn't pre-create the per-teammate JSON files — only the parent directory + the task. Convention from integrate-storefront/SKILL.md.
Always spawn teammates with team_name + name + isolation: "worktree", never as transient subagents. If you find yourself calling Agent(...) without team_name, that's a one-shot subagent — not what a long-running feature needs. Use one-shot subagents only for genuine one-shot research. Teammates share your cwd no matter what flags you pass (harness gap, see followups.md #2) — the self-provisioned-worktree directive in the spawn prompt is what actually protects git state.
Implement/verify loops run hub-and-spoke; workflows are lead-or-solo only. The lead spawns the teammate with the spec; the teammate implements and routes reviewer briefs through the lead. Note: the relevant agent definitions (code-implementer.md, code-verifier.md) list both Workflow and Agent in their tools: frontmatter — but the harness strips those tools when the agent is spawned as a teammate (i.e., with team_name), so teammates cannot in practice invoke them. This is a harness-level restriction, not a pending capability grant; see followups.md #7. The lead may invoke a workflow directly only for detached one-off work with no persistent-owner need — see CANNOTs below.
Clean up the team when all work is done. shutdown_request via SendMessage to each teammate, wait for shutdown_response, then tear down the team. Remove teammates' self-provisioned worktrees with git worktree remove <path> after their work is merged.
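The teammate-versus-subagent distinction in the spawn rule above can be sketched as a small check. This is an illustrative Python sketch, not harness code: the function name classify_spawn and the plain-dict parameter shape are assumptions; only the field names team_name, name, and isolation: "worktree" come from the rule itself.

```python
from typing import Mapping


def classify_spawn(params: Mapping[str, str]) -> str:
    """Classify a spawn request by the fields the spawn rule names.

    Hypothetical helper: the real harness does not expose this function.
    """
    if not params.get("team_name"):
        # No team_name means a one-shot subagent: acceptable for research only.
        return "one-shot subagent"
    if not params.get("name") or params.get("isolation") != "worktree":
        # A teammate spawn must carry all three fields.
        return "incomplete teammate spawn"
    return "teammate"
```

A lead could run such a check before every spawn and stop whenever a long-running feature comes back as anything other than "teammate".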

What the lead CANNOT do

Hard rules, not preferences. Mirrors the "What You CANNOT Do" sections in every agent definition under .claude/agents/.

Lead never does implementation work inline. Always spawn a teammate (persistent work) or a transient subagent (one-shot research). If you find yourself reading code to "just check one thing" and then editing, stop — that's a teammate's job. Why: the implementer-vs-verifier separation that drives the canonical inline-iterate loop depends on the lead being the conductor, not a player. If the lead writes code, no other role has the build context to verify it, and the team loses the reviewer-as-independent-observer property. (Pure docs reorg following a human-approved layout — like the PR that introduced this file — is the edge-case exception, not a precedent.)
Lead never substitutes a detached workflow for owned work. Teammates cannot invoke workflows (no Workflow tool), so the only session that can is you — and a Workflow({name: "feature-pr", ...}) run from the lead is detached from any persistent teammate: you lose post-deploy verification, PR review iteration, follow-up patches, and every other long-running concern. Invoke a workflow ONLY for genuinely detached one-off work where none of those follow-ups will ever be needed; for anything with a future, spawn a teammate and run hub-and-spoke. Why: the prohibition's point was never the tool call — it is that long-running work must have a persistent owner.
Lead never relays user authorization as if it were direct intent. When a teammate's classifier denies a capability change (tool widening, prod write, credential use) on "teammate-relayed authorization is not user intent" grounds, the lead does NOT solve this by repeating the user's quote in the teammate's mailbox. The user must type the authorization in the teammate's transcript (or land the change inline as lead, accepting the dogfooding hit). See 2026-06-session-manual-interventions.md entry 17.
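The workflow-versus-teammate rule above reduces to one question: does the work have a future? A minimal Python sketch of that decision follows. The WorkItem fields and the return labels are hypothetical names chosen for illustration; they mirror the follow-up concerns the rule lists and are not part of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    """Follow-up needs named in the rule above; field names are hypothetical."""
    post_deploy_verification: bool = False
    pr_review_iteration: bool = False
    follow_up_patches: bool = False


def choose_owner(item: WorkItem) -> str:
    """Return who should own the work: a persistent teammate or a detached workflow."""
    has_future = (
        item.post_deploy_verification
        or item.pr_review_iteration
        or item.follow_up_patches
    )
    # Anything with a future needs a persistent owner.
    return "teammate" if has_future else "detached-workflow"
```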

How the lead runs the team

Hub-and-spoke dyad protocol

Most teammate-to-teammate routing today goes through the lead — review briefs, spec clarifications, and re-review requests. This is a consequence of the current harness behavior (teammates' Agent / Workflow frontmatter tools are stripped at spawn time; see followups.md #7), not a final design call.

Where teammates CAN spawn directly (post-loop escapes):

code-verifier.md's "Escalating Fixes Outside the Loop" section documents the preferred post-loop fix path: once the verifier's implement-verify loop is closed, the verifier (running as a one-shot subagent, not a teammate) can spawn a transient code-implementer subagent for a single tactical fix. This is now the recommended pattern for tail-end cleanup.
qc-investigator.md explicitly grants fan-out via Agent without team_name for the same reason.

The dyad-purity claim (independent observer) holds because both spawn paths are post-loop and one-shot — they don't re-introduce the implementer ↔ verifier identity collision the loop was structured to avoid.

Where the lead still acts as the broker: in-loop iterations, cross-teammate spec clarifications, and any case where a persistent identity needs the next message (Pattern D). When in doubt, route through the lead: a broker hop costs less than identity entanglement.

Practical implication: when an implementer finishes a round and needs review, they SendMessage to the lead with their PR/diff handle. The lead spawns a fresh reviewer subagent (or messages a persistent reviewer teammate, Pattern D) and routes the verdict back. The implementer never sees the reviewer's transcript or vice versa; both see only the lead's relayed brief/verdict.
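The relay described above can be modeled in a few lines. This is a sketch under stated assumptions: the Lead class, its in-memory mailboxes, and the reviewer callable are hypothetical stand-ins for SendMessage and a spawned reviewer subagent; the point is only that the reviewer receives the lead's brief and the implementer receives the lead's relayed verdict, never each other's transcript.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Lead:
    """In-memory stand-in for the lead's broker role (hypothetical)."""
    mailboxes: dict[str, list[str]] = field(default_factory=dict)

    def relay(self, to: str, message: str) -> None:
        """Deliver a lead-authored message to one teammate's mailbox."""
        self.mailboxes.setdefault(to, []).append(message)

    def handle_review_request(
        self, implementer: str, diff_handle: str, reviewer: Callable[[str], str]
    ) -> str:
        """Broker one review round between an implementer and a reviewer."""
        brief = f"Review {diff_handle}"
        verdict = reviewer(brief)  # the reviewer sees only the lead's brief
        # The implementer sees only the relayed verdict.
        self.relay(implementer, f"Verdict on {diff_handle}: {verdict}")
        return verdict
```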

Pattern selection per round

See iteration-patterns.md for the full taxonomy. The default for a feature PR is Pattern B (inline-iterate): implementer implements → lead spawns a one-shot fresh-context reviewer subagent → iterate up to 3 rounds → /raise → /monitor-pr. Promote to Pattern D (persistent reviewer teammate) when the same reviewer identity will be needed across multiple PRs.
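The Pattern B loop above can be sketched as a bounded implement/review cycle. This is a minimal illustration, not the workflow code: inline_iterate and its callable parameters are hypothetical, and returning "not approved" after the third round is an assumption, since the text does not say what happens when the rounds run out.

```python
from typing import Callable

MAX_ROUNDS = 3  # "iterate up to 3 rounds" from the text above


def inline_iterate(
    implement: Callable[[str], str],
    review: Callable[[str], tuple[bool, str]],
    spec: str,
) -> tuple[bool, int]:
    """Run implement -> fresh review until approval or MAX_ROUNDS.

    Returns (approved, rounds_used). On approval the lead would continue
    with /raise and /monitor-pr; that part is outside this sketch.
    """
    feedback = spec
    for round_no in range(1, MAX_ROUNDS + 1):
        diff = implement(feedback)
        approved, findings = review(diff)  # a fresh-context reviewer each round
        if approved:
            return True, round_no
        feedback = findings  # the next round works from the reviewer's findings
    return False, MAX_ROUNDS
```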

Production gates

Show the human the diff before any production write. The list of "always gate" actions:

Merging PRs to main / master — humans merge.
Force pushes (--force, --force-with-lease) — never without explicit human ask. Never to main/master.
Production writes to customer / shared systems — storefront settings APIs, customer DDB writes, third-party app enables, search-config DDB writes, live theme installs, infra teardowns.
Destructive git — reset --hard, branch deletion, history rewrites, rm -rf on tracked dirs.
Third-party API writes outside dev cells.
Credential / secret access — only when the human has authorized that specific scope.

Auto Mode does NOT relax these. The list is canonical in CLAUDE.md's "Executing Actions with Care" section — this doc echoes it for the lead's quick reference; the canonical authority is CLAUDE.md.
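As a quick reference, the gate list above can be read as a lookup. The sketch below is illustrative only: the category labels and the function name lead_may_execute are hypothetical, and CLAUDE.md remains the canonical list. It encodes two things from the text: merges to main and force pushes to main are never the lead's to perform, and Auto Mode changes nothing.

```python
# Labels paraphrase the gate list above; the strings themselves are hypothetical.
HUMAN_ONLY = frozenset({"merge-to-main", "force-push-to-main"})
NEEDS_HUMAN_APPROVAL = frozenset({
    "force-push",
    "production-write",
    "destructive-git",
    "third-party-api-write",
    "credential-access",
})


def lead_may_execute(category: str, human_approved: bool, auto_mode: bool = False) -> bool:
    """Return True only if the lead may perform the action itself."""
    if category in HUMAN_ONLY:
        return False  # humans merge; force pushes never go to main/master
    if category in NEEDS_HUMAN_APPROVAL:
        # auto_mode is deliberately ignored: Auto Mode does NOT relax gates.
        return human_approved
    return True
```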

Escalation handling

Each teammate writes to ${SHARED_WORKSPACE}/escalations/<role>.json. Per the Rules section, the lead creates only the escalations directory and the escalations task before the spawn; the teammate initializes its own file as [] on first use. The lead checks these files between rounds. The escalation JSON shape and lifecycle are documented per skill; the universal rule is: if a teammate cannot proceed without lead/human input, they log an escalation rather than guessing or blocking silently.
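The file handling can be sketched as follows, following the convention stated in the Rules section (the lead creates only the directory; each teammate initializes its own file as [] on first use). The function names and the entry fields are hypothetical, since the real escalation JSON shape is defined per skill.

```python
import json
from pathlib import Path


def prepare_escalations_dir(shared_workspace: Path) -> Path:
    """Lead side: create only the parent directory, never the per-role files."""
    directory = shared_workspace / "escalations"
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def log_escalation(directory: Path, role: str, entry: dict) -> None:
    """Teammate side: start from [] on first use, then append the entry."""
    path = directory / f"{role}.json"
    entries = json.loads(path.read_text()) if path.exists() else []
    entries.append(entry)
    path.write_text(json.dumps(entries, indent=2))


def pending_escalations(directory: Path) -> dict[str, list]:
    """Lead side: read every teammate's file between rounds."""
    return {p.stem: json.loads(p.read_text()) for p in sorted(directory.glob("*.json"))}
```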

Pause / resume (token conservation)

When the human asks to pause (e.g. token budget), the lead:

Kills its own background watchers/polls.
Broadcasts to every live teammate: stop ALL background watches/polls, go fully dormant, no self-waking, no actions until explicitly resumed — and states who resumes them and roughly when.
Writes a PAUSE STATE section at the top of the session work log: what merged/changed up to the pause, what is mid-flight, and a per-member resume checklist (exact next action each teammate owes).
On resume: sweep what changed while dark (merges, new reviews), then dispatch each teammate its checklist line. Nothing is lost by pausing — merges/reviews that land during the pause simply queue as resume work.
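The PAUSE STATE section described above might be rendered like this. The layout, the function names, and the sample labels are assumptions for illustration; the text only requires the three pieces of content (what changed, what is mid-flight, a per-member resume checklist) and that the section sits at the top of the work log.

```python
def render_pause_state(
    changed: list[str], mid_flight: list[str], resume_checklist: dict[str, str]
) -> str:
    """Build the PAUSE STATE section as plain text (hypothetical layout)."""
    lines = ["PAUSE STATE", "Merged/changed up to the pause:"]
    lines += [f"  - {item}" for item in changed]
    lines.append("Mid-flight:")
    lines += [f"  - {item}" for item in mid_flight]
    lines.append("Resume checklist (exact next action per teammate):")
    lines += [f"  - {member}: {action}" for member, action in resume_checklist.items()]
    return "\n".join(lines)


def prepend_to_work_log(log_text: str, section: str) -> str:
    """Place the section at the top of the existing session work log."""
    return f"{section}\n\n{log_text}"
```

On resume, the lead can walk the checklist lines in order and dispatch each one to the named teammate.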

Lead-side PR merge watcher

For long review tails, run a background poll of the open PR set that exits on the first state change (so the harness re-invokes the lead exactly when something merges). Caveats learned in production:

Transient gh failures are not state changes. Skip empty/error responses entirely; v1 of this watcher false-fired when a token refresh made every PR read as ERR for one poll cycle.
Relaunch the watcher after each firing; treat watcher death as a signal to check gh auth status before trusting the next read.
With automerge + bot approvals, PRs merge while no human is present — the watcher is what keeps stacked-PR retargets (see stack-upkeep.md) flowing without polling teammates.
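The first caveat is the easiest to get wrong, so here is a minimal sketch of a poll that exits on the first real state change and skips empty or error reads. It is not the production watcher: watch_until_change, the snapshot shape (PR identifier to state string), and the injected fetch callable are assumptions; a real fetch would wrap a gh call. The "ERR" marker is taken from the failure described above.

```python
import time
from typing import Callable, Mapping, Optional

Snapshot = Mapping[str, str]  # PR identifier -> state, e.g. "OPEN" or "MERGED"


def is_valid(snapshot: Optional[Snapshot]) -> bool:
    """Empty reads and reads containing ERR are transient failures, not changes."""
    return bool(snapshot) and "ERR" not in snapshot.values()


def watch_until_change(
    fetch: Callable[[], Optional[Snapshot]],
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: Optional[int] = None,
) -> Optional[dict[str, str]]:
    """Poll until a valid snapshot differs from the first valid one.

    Returns the changed entries, or None if max_polls runs out first.
    """
    baseline: Optional[dict[str, str]] = None
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        snapshot = fetch()
        if is_valid(snapshot):
            current = dict(snapshot)
            if baseline is None:
                baseline = current  # the first good read becomes the baseline
            elif current != baseline:
                keys = set(baseline) | set(current)
                # PRs that dropped out of the open set are reported as "GONE".
                return {k: current.get(k, "GONE") for k in keys
                        if baseline.get(k) != current.get(k)}
        sleep(interval_s)
    return None
```

Because the function returns on the first change, the caller relaunches it after each firing, which matches the second caveat.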

Crossed-message discipline

Lead↔teammate messages routinely cross mid-turn (a teammate reports "awaiting verdict" after the verdict was already sent). Rules that keep this harmless:

Directives must be idempotent and re-sendable: include the decision AND its key facts every time (verdict + SHA + what to do next), so a re-send costs nothing and a stale report can be answered by repeating the directive verbatim.
Teammates should act on the newest instruction in their inbox, not the first unread one, when they conflict.
The lead should treat "already done — crossed in flight" replies as normal, verify the claimed end state cheaply (one gh/git read), and move on rather than re-litigating.
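The first two rules can be illustrated with a small data type. This is a sketch, not the harness message format: the Directive class, its seq field (a send-order counter), and the rendered string layout are hypothetical, and the SHA values in any usage are placeholders. It shows a directive that carries the decision and its key facts every time, and a teammate picking the newest instruction rather than the first unread one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Directive:
    seq: int  # hypothetical send-order counter; higher means newer
    verdict: str
    sha: str
    next_action: str

    def render(self) -> str:
        """Same inputs always give the same text, so a re-send is verbatim."""
        return f"verdict={self.verdict} sha={self.sha} next={self.next_action}"


def newest(inbox: list[Directive]) -> Directive:
    """Teammate side: act on the newest instruction when directives conflict."""
    return max(inbox, key=lambda d: d.seq)
```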

Reviewer identities (who is a bot)

Know which review signals are bots before treating them as human gates: GitHub user ishaaq posts bot-generated reviews (claude-fable-5 with approve powers delegated by the human ishaaq); claude[bot] is the auto-reviewer; comments tagged reviewed-by-codex under a human account are Codex-generated. Bot approvals satisfy branch protection and fire automerge, but the human merge gate (CLAUDE.md) is whoever merges or enables automerge. Address bot findings on their merits — in the June 2026 uplift they caught real defects (a NaN gate hole, an authz gap) — and push back where wrong, exactly as with verifier findings.
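For quick triage, the identities above can be captured in a lookup. This is only a sketch of the paragraph's own list: the function name and the idea of matching the reviewed-by-codex tag as a substring of the comment body are assumptions, and the list will drift as reviewers change.

```python
# Accounts whose reviews the paragraph above identifies as bot-generated.
BOT_REVIEW_AUTHORS = frozenset({"ishaaq", "claude[bot]"})
CODEX_TAG = "reviewed-by-codex"  # tag on Codex-generated comments under a human account


def is_bot_review(author: str, body: str) -> bool:
    """Return True if the review signal should be treated as bot-generated."""
    return author in BOT_REVIEW_AUTHORS or CODEX_TAG in body
```

A True result does not mean the finding can be ignored; it only tells the lead that the approval is not a human gate.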

Team shutdown

SendMessage(to=teammate, message={type: "shutdown_request"})
# wait for shutdown_response with approve: true
git worktree remove <teammate's path>  # if it self-provisioned one
TeamDelete  # only after all teammates have shut down
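The ordering constraint in the sequence above (tear down the team only after every teammate has approved shutdown) can be sketched with injected callables. Everything here is hypothetical scaffolding: request_shutdown stands in for the SendMessage round trip and returns whether approve was true, remove_worktree stands in for git worktree remove, and delete_team stands in for TeamDelete. The sketch also assumes each teammate's work is already merged.

```python
from typing import Callable, Mapping, Optional


def shutdown_team(
    teammates: Mapping[str, Optional[str]],  # name -> self-provisioned worktree path, or None
    request_shutdown: Callable[[str], bool],
    remove_worktree: Callable[[str], None],
    delete_team: Callable[[], None],
) -> list[str]:
    """Shut teammates down one by one; return the names that did not approve."""
    pending: list[str] = []
    for name, worktree in teammates.items():
        if not request_shutdown(name):
            pending.append(name)  # leave the worktree alone until shutdown is approved
            continue
        if worktree is not None:
            remove_worktree(worktree)
    if not pending:
        delete_team()  # only after all teammates have shut down
    return pending
```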

What this doc deliberately does not cover

Per-domain step lists. Those belong in the domain skill (e.g. integrate-storefront/SKILL.md for Shopify storefront work). This doc is the role-independent layer.
Workflow internals. The substrate is documented in iteration-patterns.md Pattern A, and the code lives in .claude/workflows/lib/.
Reviewer specifics. See code-review-guide.md and the agent-def for each verifier role.
Inline-vs-team triage. See inline-vs-team-criteria.md.
