Skip to main content

Orchestration Improvement Collation

Sources synthesized: holistic review (msqc/reports/orchestration-holistic-review.md), team audit (msqc/reports/team-audit.md), Option B context + non-workflow options (orchestration-impl-b/reports/context-and-non-workflow-options.md), endstate architecture + summary (orchestration-impl-b/reports/endstate-architecture.ascii, endstate-summary.md), review-pass-1 (orchestration-impl-b/reports/review-pass-1.md), Option B code audit of PR #3482 + PR #3486 (open, not merged), session empirics (qa-multiplier / option-b-impl / option-b-review / heading-toggle / chip-fix), Anthropic official docs for workflows / agent-teams / sub-agents.


1. TL;DR

Five items every source agrees on

  1. Verifier → implementer feedback path is broken after workflow exit. Holistic P0-B, team-audit B1, endstate PR1-Rec-3, review-pass-1 (PASS verdict on PR1). Once feature-pr.js exits, the implementer subagent is gone; verifier prose forbids fixing; lead becomes mandatory broker. PR1 codifies the "Escalating Fixes Outside the Loop" section on all three verifiers + qc-investigator Paths A/B/C. No source dissents.
  2. The QC role was hallucinated; codify it. Holistic P0-C, team-audit B4, endstate PR1-Rec-2, session empirics (qa-multiplier did real P0 work + refused PR raise on a contract that didn't exist). PR1 adds .claude/agents/qc-investigator.md (161 lines). Empirically demanded; unanimously endorsed.
  3. tools: frontmatter must move from prose to harness-enforced lists on every agent. Holistic P0-A / Rec-A2, team-audit A1 / Rec4, endstate PR1-Rec-1. PR1 implements this on 11 agent files. Anthropic sub-agents doc echoes: "Limit tool access: grant only necessary permissions for security and focus." Unanimous.
  4. /raise ownership must be single-sourced to the persistent teammate, not duplicated in the workflow. Holistic Rec-A5, team-audit A3 / Rec5, endstate PR1-Rec-4. PR1 removes feature-pr.js Phase 4; ownership lives in code-implementer.md step 6. Unanimous.
  5. Workspace contract must be absolute path + isolation:'worktree' or it silently fails. Holistic Hooks-inventory + P0-A, team-audit F1/F2/F3 + Rec-F1, endstate Layer5-gap-2 + Anecdote-LeadIsolation (the implementer literally hit this gap during PR1 work). PR #3470 is the prerequisite; Option B layers assertAbsoluteWorkspace on top. Unanimous + empirically observed.

Three disagreements between sources

  1. Where universal gates live. Endstate-summary says "AGENTS.md (CLAUDE.md symlink)" and the Option B code audit confirms the section was added to AGENTS.md only — CLAUDE.md is unmodified, yet universal-gates.test.js opens CLAUDE.md and asserts the section. Holistic Rec-B5 explicitly said "CLAUDE.md". Verdict: bug in PR2. Either move the section to CLAUDE.md or change the test to read AGENTS.md. Holistic + review-pass-1 + the test itself favor CLAUDE.md; the actual diff favors AGENTS.md.
  2. Whether await import() works at the real Workflow runtime. Endstate-summary claims the lib is "canonical source of truth"; review-pass-1 P1-B says the pattern is unvalidated and the production code is in fact INLINED copies of the lib in every workflow. Session empirics confirm option-b-impl never drove a refactored workflow through the real Workflow tool. Verdict: review-pass-1 is right. The lib file is dead-at-runtime; only the test harness loads it.
  3. Domain-pointer richness in verifier prompts. Pre-Option-B, css-section and search-tuning verifier prompts named specific AGENTS.md anchors ("for verification standards and element selectors", "for the override header reference"). Option B collapsed these into a generic playbookClause. Holistic Rec-B2 endorsed the generic refactor; review-pass-1 P2-A flagged the regression; option-b-impl's iteration-1 then re-added a purpose arg. Verdict: review-pass-1 was right to flag. The purpose arg now exists, but the audit confirms callers in general-feature-pr.js may not be passing it consistently.

2. The Four Streams — Quick Status

Inline session retrospective. This session was simultaneously the venue and the subject. option-b-impl built PR #3482 + #3486 (Option B substrate) and self-verified through a Node test harness. option-b-review then independently re-verified and caught 2 P1 bugs the implementer missed (SHARED_WORKSPACE label drift, await import() unvalidated against real runtime), 2 P2 issues (domain-pointer regression, TeammateSpawned deferral undocumented), and 1 P3. qa-multiplier discovered a P0 storefront bug across the chip-multiplier incident; the lead had to spawn chip-fix as a broker because qa-multiplier (no agent definition) refused to raise the PR — the exact failure mode the audit was investigating. heading-toggle teammate did multi-pass investigation, self-corrected on the wrong-theme-ID mistake, and ended awaiting human approval. namespaced-tags produced research only.

Option B implementation (#3482 + #3486). PR1 (Option A tail) is contracts + dedup, no new abstractions: tools frontmatter on all 11 agents, verifier escalation sections, qc-investigator codification, /raise dedup, SKILL.md Phase 1.1 rewrite + isolation contract, audit D3 documented. PR2 (Option B core) adds the shared loop primitive (lib + inlined copies in 4 workflows), playbook arg, general-feature-pr workflow + skill, slim docs/dev/code-review-guide.md, universal gates section. Both PRs are OPEN — neither has merged. Audit confirms a real discrepancy: universal gates were added to AGENTS.md not CLAUDE.md, contradicting the test file and the holistic recommendation.

Prior workflow audits (holistic + non-workflow options + team audit). Three independent passes arrived at consistent verdicts. Holistic mapped 4 implementation options (A=docs-only, B=substrate, C=harness, D=investigator role) and recommended Steps 1-3 (PR #3470 + universal gates + qc-investigator → tools frontmatter + escalation sections + /raise dedup → extract loop primitive + parameterize playbook). Team audit named 28 specific findings with file:line citations. Non-workflow options laid out a 4-pattern taxonomy (full workflow / ad-hoc verifier subagent / persistent verifier teammate / monitor-pr-only) plus inline self-verify plus hybrid. context-and-non-workflow-options flagged a separate critical bug that the other streams missed: the fix-round prompt in feature-pr.js:114-123 (and lib/implement-verify-loop.js:128, general-feature-pr.js:80-87) drops spec, appGuide, workspace, and browserN entirely — round-2 sees only feature + verifier feedback. Verifier prompt also never gets spec, so it judges against CLAUDE.md + severity rubric with no scope contract.

Anthropic official recommendations. Workflows are for "more agents than one conversation can coordinate" — codebase audits, large migrations, cross-checked research, hard plans drafted from multiple angles. Anti-pattern: "No mid-run user input. For sign-off between stages, run each stage as its own workflow." Cost warning: "A single run can use meaningfully more tokens than working through the same task in conversation." Agent teams are explicitly experimental and gated on CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1; "no nested teams: teammates cannot spawn their own teams or teammates"; recommended scale 3-5 teammates. Subagents are for "quick, focused workers that report back" where "only the result matters"; cannot nest. The canonical decision axis is who decides the next step: workflows = script; subagents/skills = Claude turn-by-turn; agent teams = lead turn-by-turn with shared task list.


3. Convergence Map

RecommendationSources that agreeAction urgency
Add tools: frontmatter to every agent (matches AGENTS.md role table)Holistic Rec-A2/Rec-C2, team-audit A1/Rec4, endstate PR1-Rec-1, Anthropic sub-agents docP0 — in PR1, ready to merge
Codify qc-investigator role (replaces hallucinated qa-multiplier)Holistic Rec-A4/Rec-D1, team-audit B4/Rec3, endstate PR1-Rec-2, session empirics (qa-multiplier incident)P0 — in PR1, ready to merge
Verifier "Escalating Fixes Outside the Loop" section on code/css/search verifiersHolistic Rec-A3, team-audit B1/Rec2, endstate PR1-Rec-3P0 — in PR1, ready to merge
Remove feature-pr.js Phase 4; teammate owns /raiseHolistic Rec-A5/Step2-c, team-audit A3/Rec5, endstate PR1-Rec-4P0 — in PR1, ready to merge
Land PR #3470 (workspace placeholder + isolation mandate)Holistic Step1-a, team-audit Rec1/F1/F2/F3, endstate Layer5-gap-2 + Anecdote-LeadIsolationP0 — already in flight, prerequisite for both Option B PRs
Shared runImplementVerifyLoop primitiveHolistic Rec-B1, endstate PR2-Rec-1P1 — in PR2, blocked on await import resolution
playbook workflow arg parameterizing the domain guideHolistic Rec-B2, endstate PR2-Rec-3P1 — in PR2; needs purpose arg verified across callers
Universal gates section in project-wide doc (CLAUDE.md per most sources; AGENTS.md per actual PR2 diff)Holistic Rec-B5/Step1-b, endstate PR2-Rec-6, Layer0-gate-1..6P1 — in PR2 but in wrong file; see §4
Domain-neutral general-feature-pr workflow + skill + slim playbookHolistic Rec-B3, endstate PR2-Rec-4/5P2 — in PR2, demonstrates substrate composes
Drive at least one refactored workflow through the real Workflow tool before merging PR2Review-pass-1 Required-2 / P1-B, session empirics (no end-to-end invocation observed)P0 — gate on PR2 merge
Persist spec + context into fix-round prompts AND into verifier promptscontext-and-non-workflow-options Rec-1/2/3P0 — separate fix, not yet in either PR
Generalize hook scan to tmp/work/*/ for non-Shopify general-dev workHolistic Step4-deferred, endstate FutureDefer-1P2 — explicitly deferred
TeammateSpawned hook (typed workspace + isolation enforcement)Holistic Rec-C1, endstate FutureDefer-2 + Anecdote-LeadIsolation, review-pass-1 P2-BP2 — deferred to Option C; flagged in followups
Reversibility classification of actions (reversible / effortful / irreversible)Holistic Rec-C8 / blind spot #7, endstate Layer0-Per-domain (implicit)P3 — open question Q8 deferred
Team observability surface / team-status slash commandHolistic Rec-C4 / blind spot #8P3 — Option C
Idle-watchdog hook auto-pinging leadHolistic Rec-C5P3 — Option C
Knowledge persistence across sessions (audit findings out of gitignored tmp/)Holistic Q9P3 — open question

4. Divergence Map

Contested itemSource A saysSource B saysWhat we believe (session empirics)
Where universal gates liveHolistic Rec-B5 + universal-gates.test.js: CLAUDE.mdEndstate-summary + PR2 actual diff: AGENTS.mdCLAUDE.md is right. The test file already expects it there; CLAUDE.md is the project root and what every component reads; AGENTS.md is positioned as a Shopify playbook elsewhere in the repo. Action: move the section to CLAUDE.md (or symlink CLAUDE.md→AGENTS.md if that's the intent, but the test still fails today).
Does the lib await import() work at the real Workflow runtime?Endstate-summary: "canonical source of truth"Review-pass-1 P1-B: "unvalidated; production code is inlined"Review-pass-1 is right. option-b-impl never drove a workflow through the real Workflow tool; iteration-1 inlined the primitive. The lib is currently dead-at-runtime. Action: either drive one real invocation and confirm, or remove the lib + ship inlined-only with a follow-up.
Domain-specific anchors in verifier promptsHolistic Rec-B2: collapse to generic playbookClauseReview-pass-1 P2-A: keep per-prompt purposeBoth partially right. The generic clause is the substrate; a purpose arg restores domain-pointer richness. option-b-impl iteration-1 added playbookClause(path, purpose). Need to verify all 4 workflows actually pass purpose where their pre-Option-B code did.
Should verifiers be permitted to spawn implementer subagents at all?Team-audit Rec-B1-fix: yes, post-loop onlyHolistic Q1: leaves it open as "sandbox vs self-orchestration" tensionYes, post-loop only. PR1 codifies the tiering: in-loop = "verifier never writes"; post-loop = Agent without team_name permitted. tools frontmatter on verifiers now includes Agent. Empirically the lack of this caused the chip-fix incident.
Should /raise phase stay in workflow or move to teammate?Holistic Q3 + team-audit Q3 leave it openEndstate-summary + PR1 actual: moved to teammateMoved to teammate is right. Empirically, the workflow's Phase 4 was spawning anonymous subagents in the wrong worktree. PR1 removes it. Downside (teammate forgets to /raise) is mitigated by SKILL.md rule 8 wording.
Is qc-investigator Shopify-specific or general-dev?Holistic Q6: openEndstate: lives in .claude/agents/ (domain-neutral path) and is referenced in integrate-storefront/SKILL.md scope tableGeneral-dev primitive. Path C (invoke feature-pr) makes it useful for any flow; the agent file lives at the root .claude/agents/ not inside the skill. Treat as general-dev.
Plan approval: lead-only or human-too?Holistic Q2: open(no other source)Triage rule. Plans touching production / customer data / cross-component / over N files → human; rest → lead-only. The universal gates section in CLAUDE.md/AGENTS.md already enforces the production half.
Persistent verifier teammate (Option D) vs ad-hoc verifier subagent (Option B)Context-and-non-workflow-options decision guide: D for long sessions (5+ verifies), B for single moderate PRSession empirics: qa-multiplier mode-switched B→D organicallyThe mode switch is the right pattern. Don't pre-classify; spawn as ad-hoc, promote to persistent if recheck is needed across PR cycles.

5. Iteration-Cycle Inventory — Where We Are Right Now

Mapping session teammates to the 4-pattern taxonomy from context-and-non-workflow-options.md:

TeammatePatternDid it work?Token costLesson
option-b-implA (the workflow primitive was the artifact, never invoked end-to-end)Refactor landed; self-verification via Node harness MISSED 2 P1 bugs~0 spawn tokens (self-verify)When the workflow is the artifact, you cannot self-verify with a Node harness. You must drive at least one real Workflow invocation as smoke test.
option-b-reviewB (ad-hoc verifier subagent, one-shot Agent without team_name)YES — caught both P1s + 2 P2s the implementer missed~30-50kFresh-context independent verification is empirically the highest-yield cheap insurance for substrate / contract work.
qa-multiplier (initial)B (ad-hoc, no agent def — "hallucinated role")YES on discovery (P0 chip-multiplier root cause); FAILED on follow-through (refused /raise)~20-40kThe discovery side of B is genuine; the contract-boundary side is model-self-imposed. qc-investigator.md now backs both.
qa-multiplier (recheck)D (persistent verifier teammate via SendMessage)YES — confirmed E/F pass post-deploy~5-15k (amortized)Same teammate identity bridges B→D organically. Mode switch is empirical, not pre-classified.
chip-fixLead-as-broker fallback (NOT in 4-pattern taxonomy)Worked but is the failure mode~10-20kThis is the pattern PR1 was designed to eliminate. With qc-investigator Path B/C, qa-multiplier would have raised the PR itself.
heading-toggle teammateB (ad-hoc, no verifier dyad — investigation only)Self-corrected on wrong-theme-ID; no independent verifier caught the mistake~30-50kInvestigation-only B-pattern works for human-gated output but is fragile without a paired verifier.
namespaced-tags researchB (ad-hoc, research-only, no dyad)Produced plan; never implemented~10-20kAppropriate for plan-only work. No grading needed because nothing ships.

Pattern counts in session: A=0 confirmed end-to-end, B=4, C=0, D=1, mode-switch=1, lead-as-broker fallback=1.

Lessons (empirically grounded):

  • Pattern A is the most expensive AND the rarest in this session. It's the right tool for "fuzzy spec + customer-facing + ratchet needed" per the decision guide, but most session work didn't fit that profile. When you're modifying the workflow itself, A is structurally impossible without bootstrapping (you need a smoke-test invocation).
  • Pattern B dominated by a wide margin. It's the workhorse for both meta-work and live QA. The ~30-50k cost is acceptable when the alternative is "ship a bug that didn't get caught."
  • Pattern C (monitor-pr-only) was a layer, never the primary verification. Both Option B PRs got claude[bot] review on top of option-b-review's ad-hoc check. Treat C as belt-and-suspenders.
  • Pattern D emerges organically from B, not from pre-classification. Lead messages the same teammate again → it becomes persistent. The "decide upfront which pattern" mental model from the non-workflow-options table is too rigid.
  • Lead-as-broker is empirically the bottleneck. Chip-fix happened because qa-multiplier had no contract authority. PR1's qc-investigator + verifier-spawns-fix paths are the right fix.

6. Anthropic-Anchor Cross-Reference

Where we LINE UP with Anthropic recommendations:

  • Subagents for "quick focused workers that report back." PR1 verifier-spawns-fix (Agent without team_name, single-turn) maps exactly to Anthropic's subagent guidance: "results return to your main conversation" and "their context isn't seen by you." Used for tactical post-loop fixes, this is textbook.
  • Workflows for "more agents than one conversation can coordinate." feature-pr.js / css-section.js / search-tuning.js / general-feature-pr.js each drive an implement→verify→iterate loop with up to MAX_ROUNDS turns. The plan is in the script; intermediate results live in script variables. Anthropic: "A workflow moves the plan into code." We've done this.
  • Agent teams for "handful of long-running peers." integrate-storefront SKILL.md spawns 3-5 teammates per shop integration (css-implementer + css-verifier + finisher + code-implementer + code-verifier). Anthropic recommends 3-5; we're in the sweet spot. SKILL.md rule 4 ("teammates own their workflow invocations") matches Anthropic's "teammates need to share findings, challenge each other, and coordinate on their own."
  • tools: frontmatter for least-privilege. PR1's universal frontmatter implements Anthropic's explicit "Limit tool access: grant only necessary permissions for security and focus" guidance.
  • "For sign-off between stages, run each stage as its own workflow." PR1's /raise dedup + teammate-owns-/raise pattern is exactly this: feature-pr is one stage, /raise is the next stage, /monitor-pr is the third. We don't try to cram it into one workflow.

Where we DIVERGE intentionally:

  • Nested delegation via verifier-spawns-fix. Anthropic explicitly says "Subagents cannot spawn other subagents. If your workflow requires nested delegation, use Skills or chain subagents from the main conversation." Our Escalating Fixes Outside the Loop lets a verifier-class persistent teammate spawn a transient code-implementer subagent. This is technically allowed because the verifier is a teammate not a subagent — but it is one level deeper than Anthropic's recommended structure. Intentional, but worth noting: if Anthropic tightens "no nested" to include teammates, we'd need to reroute through the lead.
  • Long-running agent teams with experimental flag. Anthropic explicitly: "Agent teams are experimental and disabled by default." We've built an entire integrate-storefront skill around them. We accept the experimental risk because the empirical value is large (multi-shop work, parallel CSS sections).
  • Workflow used for moderate-spec verified PRs, not just dozens-to-hundreds of agents. Anthropic's headline use case is bigger than ours; we use workflows for ~3-5 agent rounds per invocation. The decision guide in context-and-non-workflow-options.md was written specifically to acknowledge that we use workflow at the bottom of its scale band.

Where we DIVERGE accidentally (misuse to fix):

  • feature-pr.js Phase 4 spawning an anonymous code-implementer in a possibly-wrong worktree. This violated Anthropic's subagent-lifetime model (subagents can't spawn other subagents; the workflow's last call effectively did just that). PR1 removed it.
  • The fix-round prompt dropping spec + appGuide + workspace. Anthropic on subagents: "Give teammates enough context — they don't inherit lead's conversation history." Our fix-round implementer subagent inherits NOTHING of the original scope. This is a flat-out misuse of the subagent context model. context-and-non-workflow-options.md Rec-1/2/3 are the fix.
  • The Node test harness as substitute for real Workflow invocation. Anthropic: "Background runs that persist while the session is responsive… runs count toward plan usage and rate limits." Our test harness is not the runtime. option-b-impl essentially asserted "my code compiles in Node, ship it." Review-pass-1 P1-B caught this.
  • Hooks scanning tmp/integrations/*/ only. Anthropic's recommendation is to use convention-based discovery. Our discovery is hard-coded to one path tier. For general-dev workflows, this is a quiet misuse — general-feature-pr writes outputs that hooks don't see.

7. Prioritized Next Steps

P0 — block-on-this-before-merging-anything-else

  1. Fix the CLAUDE.md vs AGENTS.md gate-location discrepancy. Either move the "Executing Actions with Care" section from AGENTS.md to CLAUDE.md (recommended; matches the test + holistic recommendation), or update universal-gates.test.js to read AGENTS.md. Owner: option-b-impl. Effort: 15 min. Why: The unit test currently fails against the actual diff; this is a self-inflicted incoherence.
  2. Drive at least one refactored workflow through the real Workflow tool before PR2 merges. Pick the smallest of feature-pr / css-section / search-tuning / general-feature-pr; invoke as Workflow({name: 'general-feature-pr', args: {...trivial spec...}}); confirm await import() resolves OR confirm the inlined copies execute. Log the run as evidence in PR description. Owner: option-b-impl (with option-b-review as ad-hoc verifier). Effort: 30-60 min. Why: Review-pass-1 P1-B; the lib is currently dead-at-runtime.
  3. Persist spec + context into fix-round AND verifier prompts. Implement context-and-non-workflow-options.md Rec-1/2/3. Round-2 implementer + verifier currently judge against CLAUDE.md + severity rubric with no scope contract — out-of-scope changes silently pass. Add args.context / args.outOfScope typed fields; stamp into every prompt INCLUDING fix-rounds; auto-write args.spec to ${workspace}/spec/${feature-slug}.md. Owner: option-b-impl. Effort: 1-2 hours including regression test (Rec-5). Why: This is the highest-severity bug NOT in either PR; it caused unnamed prior incidents implicitly (scope creep passing review).
  4. Land PR #3470 (workspace placeholder + isolation mandate) if not already merged. Prerequisite for PR1 + PR2. Owner: human / lead. Effort: review + merge.
  5. Verify playbookClause(path, purpose) is invoked with purpose at every site that used to have one pre-Option-B. Cross-check feature-pr.js / css-section.js / search-tuning.js / general-feature-pr.js verifier + fix prompts against pre-Option-B equivalents. Owner: option-b-impl. Effort: 30 min. Why: Review-pass-1 P2-A; partial fix in iteration-1, verify completeness.

P1 — finish-this-PR-cycle

  1. Add a smoke-test workflow that exercises the loop primitive on a no-op diff. Persistent regression guard against future workflow-runtime drift. Owner: new teammate (smoke-test) or option-b-impl. Effort: 1-2 hours. Why: Closes Test-Gap from review-pass-1.
  2. Document the verifier-spawns-fix contract with a test or example invocation. PR1 ships the prose contract; no test exercises Paths A/B/C of qc-investigator or the three verifier escalation sections. Add an example invocation pattern to SKILL.md or a fixture. Owner: new teammate. Effort: 1 hour.
  3. Resolve the orphan agents (plan-verifier, test-case-generator, project-clarity-interviewer). Three options: delete, integrate into a future general-dev plan-first workflow, or mark as deferred in docs/dev/orchestration-followups.md. Holistic Q7 + endstate OutOfScope-Orphans. Owner: lead decision. Effort: 15 min decision + 30 min execution.
  4. Lead CANNOT list in SKILL.md. Currently every role has CANNOT bullets except lead. Document "lead never does implementation work inline — always spawn a teammate" + "lead never invokes workflows — teammates own workflows." Holistic Q10 / blind spot #9. Owner: lead. Effort: 20 min.

P2 — Option C runway

  1. TeammateSpawned hook validating absolute ${SHARED_WORKSPACE} + isolation: 'worktree'. The empirically-observed gap from the Anecdote-LeadIsolation incident. ~30 lines of bash. Holistic Rec-C1 / Fix-P2B-Option1. Owner: new teammate (harness-hooks). Effort: 2-3 hours including test.
  2. Generalize hook scan from tmp/integrations/*/ to tmp/{integrations,work}/*/. Required for general-feature-pr to participate in the artifact-collection conventions hooks enforce. Owner: new teammate. Effort: 1 hour + regression test.
  3. Idle-watchdog hook auto-pinging lead after N turns of TeammateIdle. Holistic Rec-C5. Owner: new teammate. Effort: 2-3 hours.
  4. Team observability surface (/team-status slash command with JSON output). Holistic Rec-C4. Owner: new teammate. Effort: 4-6 hours.
  5. Verifier tool tiering (declare loop-scope tools vs escalation-scope tools). Resolves the cross-lens tension in Holistic Q1. Today the Agent tool is in the verifier's frontmatter unconditionally; tiering would make the in-loop CANNOT enforceable. Owner: harness work. Effort: TBD (depends on harness team).

P3 — open questions to schedule discussion on

  1. Reversibility classification (Holistic Q8) — one-tap approve / show diff / two-step confirm UX per tier.
  2. Knowledge persistence across sessions (Holistic Q9) — docs/integrations/learnings.md or auto-proposed AGENTS.md edits.
  3. Plan approval triage rule (Holistic Q2) — formal cutoff for human-required plans.
  4. Auto Mode precedence vs hard gates (Holistic Q5) — codify which gates Auto Mode does and does not override (universal gates already say "Auto Mode does NOT relax these").
  5. Mode-switch as first-class pattern (session lesson) — document the B→D promotion explicitly in context-and-non-workflow-options.md table or successor doc; today it lives only as session empirics.

End of collation.