Iteration Patterns
How an implementer + reviewer iterate on a code change. Five patterns are in use across this repo; the right one depends on scope, lifetime, and what you're protecting against. This page is the canonical taxonomy — agent definitions, skills, and the code-review guide all link here rather than restate it.
Quick choice
| Situation | Use |
|---|---|
| One PR, moderate spec, persistent teammate doing the work | B — inline-iterate (default) |
| Multi-section parallel fan-out (e.g. 4 CSS sections) | A — workflow |
| Repeated tuning rounds with hard ratchet (e.g. search-tuning) | A — workflow |
| Same reviewer concerns recur across multiple PRs in the same session | E — B→D mode-switch |
| Two long-lived peers messaging each other (implementer + dedicated reviewer teammate from the start) | C — pair-driven |
| Long-lived reviewer reviewing many sequential PRs from the same implementer | D — persistent reviewer teammate |
When in doubt, start with B.
A — Workflow (scripted)
A deterministic Node script (feature-pr.js, general-feature-pr.js,
css-section.js, search-tuning.js) orchestrates the loop:
script.initialImplementPrompt → spawn implementer subagent (one turn)
script.verifyPrompt → spawn verifier subagent (one turn), structured {pass, feedback}
if pass: exit
else: script.fixImplementPrompt(feedback) → spawn implementer subagent again
repeat up to maxRounds (3 for feature-pr / css-section, 5 for search-tuning)
- Lifetime: subagents are one-shot per spawn; the script holds state between rounds. The persistent session that invoked the workflow is what survives.
- Strengths: deterministic, max-rounds bound, structured output (good for downstream consumers), good for fan-out (one workflow per section running in parallel).
- Weaknesses: rigid; the verifier's checklist is whatever the script's
verifyPromptsays. Round-2+ prompts historically dropped the spec (fixed in PR3 — scope contract is now stamped into every prompt-builder). Lib helpers + tests at.claude/workflows/lib/. - Who invokes it: harness reality (2026-06-10/11): teammates do NOT have
the
Workflowtool, so only the lead or a plain non-team session can invoke a workflow — and a lead-invoked run is detached from any persistent teammate. Use it only for genuinely detached one-off work; for owned work, Pattern B-hub below. Seeintegrate-storefrontSKILL.md "What the lead CANNOT do".
B — Inline-iterate (default)
The persistent teammate writes code itself, then spawns an independent
reviewer subagent with Agent({subagent_type: "code-verifier", ...}) —
no team_name, fresh context, one turn. (The agent definition is
code-verifier; "reviewer" is the prose name used here.)
teammate edits files
teammate spawns reviewer subagent ({diff, spec, project gates})
reviewer returns {pass, feedback}
if pass: teammate /raise + /monitor-pr
else: teammate fixes, spawns a NEW reviewer subagent (fresh context, not the same one)
repeat up to 3 rounds; if still failing, SendMessage lead
-
Lifetime: the teammate is persistent; each reviewer subagent is one-shot. Each round gets a fresh subagent — that's the whole point. The fresh context is what makes the review independent.
-
Strengths: cheap (~30-50k tokens per spawn), catches real bugs workflow self-verify misses (session empirics: B caught 2 P1s + 2 P2s the workflow's Node-harness self-verify missed). Implementer has full build context; reviewer has fresh-eyes context. No script rigidity.
-
Weaknesses: no max-rounds enforcement except by convention. Reviewer has no memory across rounds — for some review concerns that's a feature (independence), for others it's a bug (see Pattern D / E).
-
Cross-PR knowledge does NOT transfer. Fresh subagents have no memory of prior PRs' findings. Empirically (PR #3499 dogfooding):
- PR #3500 fixed a path-traversal bug class for one slug derivation; two
rounds of independent reviewers on PR #3499 then BOTH missed the same
class of bug in adjacent slug derivations — auto-reviewer
(
claude[bot]) caught it on round 3. - PR #3486 fixed an
await import()runtime risk by inlining helpers into each workflow script; PR #3499 then reintroduced anawait import("node:fs/promises")at the top level of every script — three rounds of independent reviewers missed it;claude[bot]caught it on round 4 with the exact same reasoning PR #3486 had used.
Implication: if a class of bug or risk recurs and you want reviewer-side memory, switch to Pattern D or E. Otherwise budget for late-stage auto-reviewer catches — and accept that the auto-reviewer's persistent cross-PR memory (knowledge of recent fixes on the same repo) is the practical substitute for any "shared review brain" you'd want here.
- PR #3500 fixed a path-traversal bug class for one slug derivation; two
rounds of independent reviewers on PR #3499 then BOTH missed the same
class of bug in adjacent slug derivations — auto-reviewer
(
-
Mitigation — the review ledger. The teammate appends every reviewer verdict to
${SHARED_WORKSPACE}/reviews/<task-slug>.mdand includes a "CONFIRMED FINDINGS SO FAR" section in each fresh reviewer's brief (the canonical brief template lives incode-implementer.mdstep 3). Confirmed findings transfer between rounds; independence of judgment is preserved. Workflow scripts (Pattern A) append to the same ledger automatically. -
Who invokes it: the teammate (default for
code-implementer). -
Harness reality (2026-06-10/11): teammates have no
Agenttool, so a teammate cannot literally spawn the reviewer. The working variant is B-hub (hub-and-spoke): the teammate sends the reviewer brief to the LEAD via SendMessage; the lead spawns the fresh reviewer subagent and relays the verdict. Same independence properties (fresh context, new reviewer per round, ledger) — the lead is a message bus, not a judge. The direct form applies when the implementer is itself a non-team session or transient subagent.
C — Pair-driven (two long-lived peers)
Two persistent teammates spawned at session start: one implementer, one
reviewer (both addressable via SendMessage). They iterate by messaging
each other directly; the lead optionally observes.
- Lifetime: both teammates live for the team's lifetime, go idle between turns.
- Strengths: reviewer accumulates context across review rounds — knows the spec, prior trade-offs, and which findings were already discussed. Good for non-trivial features with many rounds where re-explaining context to a fresh subagent each round is the bottleneck.
- Weaknesses: reviewer is no longer "independent" by round 3 — they share an internal model with the implementer of "what's OK here." That failure mode (verifier-as-collaborator vs. verifier-as-judge) is the reason Pattern B is the default. Use C only when you specifically want the accumulating-context property.
- Who invokes it: the lead, at session start.
D — Persistent reviewer teammate
A long-lived reviewer teammate that reviews many sequential PRs from the same implementer (or implementers). Unlike C, the reviewer is not paired with one implementer; they're a standing capability.
- Lifetime: reviewer teammate alive for the team's lifetime.
- Strengths: institutional memory across PRs — the reviewer remembers prior decisions, recurring concerns, the merchant's gotchas. Good for long-running integration sessions where one teammate reviews CSS, code, and search tuning PRs across days.
- Weaknesses: same independence drift as C, just slower. Reviewer can drift into approving anything the implementer raises if they share too much context. Mitigate by periodic fresh-subagent spot-checks (basically Pattern B as an audit on Pattern D).
- Who invokes it: the lead, when scope warrants.
E — B → D mode-switch
The interesting one. Start with B (inline-iterate, ad-hoc reviewer subagent). If during a session the same review concerns recur across multiple PRs from the same teammate — or the work clearly spans multiple PRs that would benefit from shared review context — promote the ad-hoc reviewer subagent to a persistent reviewer teammate.
Mechanic: spawn an Agent with team_name + name + subagent_type: "code-verifier" (i.e., promote from subagent to teammate). The new
teammate is "instantiated from" the prior subagent's last review output
— pass it the prior review's verdict + reasoning in the spawn prompt so
the persistent reviewer's first turn starts where the last subagent's
turn left off.
Empirical evidence
From session-2025-12 / tmp/integrations/orchestration-impl-b/reports/orch-improvement-collation.md §5:
- Pattern B dominated (4 instances end-to-end vs. A=0, D=1, lead-as-broker=1).
- One session-long task (Okendo + Swym + CSS finishing) spawned 7 independent code-verifier subagents — 5 of them flagged the same P2 concern (inconsistent error wrapping pattern). The fix was the same each time; re-explaining context cost ~30k tokens per spawn.
- Switching to a persistent reviewer teammate (D) for the rest of that session caught a regression on PR #4 that a fresh subagent would likely have missed (the regression undid a fix the reviewer had explicitly asked for on PR #2). Net: 5x fewer review tokens, one caught regression that B alone would have shipped.
When to switch
Concrete signals (any one is enough):
- The same finding shows up in 3+ consecutive reviewer subagent outputs across different PRs in the session — fresh context isn't buying you independence, it's costing you tokens.
- The implementer is touching a long-lived subsystem where review depends on understanding evolving constraints (e.g., merchant integration where PRs land over days and reviewer needs to remember "this shop has X").
- You're entering a final-mile sprint with 5+ PRs left and the same reviewer profile each time.
When to switch back
If the persistent reviewer starts rubber-stamping (no P0/P1 findings on two consecutive PRs that prior fresh subagents would have flagged), spot check with a one-off B-style fresh subagent review on the next PR. If it finds something the persistent reviewer missed, that's the signal to shorten the persistent reviewer's lifetime or rotate to a new one.
Cross-reference
- For Shopify integration work:
/docs/integrations/AGENTS.md§Canonical Loop - For general dev work:
/docs/dev/orchestration/code-review-guide.md§Entry point - Workflow scripts:
.claude/workflows/{feature-pr,general-feature-pr,css-section,search-tuning}.js - Shared loop primitive:
.claude/workflows/lib/implement-verify-loop.js - Agent definitions:
.claude/agents/{code-implementer,code-verifier}.md - Lead's CANNOT rules:
.claude/skills/integrate-storefront/SKILL.md§What the lead CANNOT do