Skip to main content

Code Review Guide (General Dev)

Environment assertion (mandatory preamble for every verifier round)

Before reporting ANY measurement (test counts, bundle sizes, diffs), assert and PRINT in your report:

  1. pwd — you are inside the assigned worktree/checkout, not the lead's shared checkout;
  2. git rev-parse HEAD — it equals the SHA in your brief;
  3. one expected marker from the brief (e.g. "expect ~691 tests; if you see the stale 595 baseline you are in the wrong tree").

If any assertion fails, fix your environment before measuring — do not file findings from the wrong tree. Rationale: in the June 2026 uplift, two verifier rounds produced false "corrections" (stale test counts, wrong bundle sizes, "missing" test files) because a path or glob silently resolved to the lead's checkout; a third self-caught it only via this gate. Reviewing a SHARED worktree is also unsafe — the implementer may switch its checkout mid-review; verify from the branch ref in your own throwaway worktree instead.

DRAFT — please review and expand. This is the slim non-Shopify equivalent of docs/integrations/AGENTS.md, authored alongside the shared implement-verify-loop primitive so the general-feature-pr skill has a domain-neutral playbook to point at. It is intentionally minimal — the goal is to demonstrate composability, not to be canonical.

This guide is read at the start of every task by the code-implementer and code-verifier agents when they are driven by the general-feature-pr workflow (or any workflow that passes playbook: docs/dev/orchestration/code-review-guide.md as an arg). It is the general-dev equivalent of the Shopify-specific docs/integrations/AGENTS.md.

For Shopify storefront integration work, use the Shopify playbook instead.


Documentation Discovery

When the task is a bug report, operational investigation, customer-impacting behavior change, or anything that crosses service boundaries, start by checking the runbook map before reading code from scratch:

Leads should pass the most relevant runbook path in the teammate brief. If they do not, investigators and implementers should still consult the diagnostics index when the symptom is operational rather than a narrow code diff.


Entry point: inline-iterate first

When a persistent teammate is doing feature/integration code work, the default loop is NOT a workflow. It is:

  1. Persist the spec to ${SHARED_WORKSPACE}/spec/<task-slug>.md at task start (workspace root tmp/work/<slug>/ for general-dev work).
  2. Implement inline in the teammate's own worktree.
  3. Spawn an independent reviewer subagent (Agent without team_name, fresh context, one turn). Build the prompt from the reviewer-brief template in .claude/agents/code-implementer.md §3 — scope contract, diff ref, implementer report, confirmed-findings-so-far. Don't improvise the brief.
  4. Record each verdict in the review ledger (${SHARED_WORKSPACE}/reviews/<task-slug>.md).
  5. Iterate up to 3 rounds with fresh reviewer subagents, passing the ledger's confirmed findings into each new brief.
  6. /raise + /monitor-pr (teammate-owned).

Harness reality: TEAMMATES cannot run steps 2-4 directly — team-spawned agents have no Agent/Workflow tools. Teammates route the review through the lead (hub-and-spoke): SendMessage the lead the reviewer brief; the lead spawns the code-verifier and relays the verdict. See .claude/agents/code-implementer.md §3 item 4 for both routes.

The general-feature-pr workflow (and its Shopify cousins) remains available for the explicit fits where the script's rigidity is what you want: multi-section parallel fan-out, repeated rounds with a hard ratchet, or structured per-round output for a downstream consumer.

For the full pattern taxonomy (workflow / inline-iterate / pair-driven / persistent reviewer / B→D mode-switch), see iteration-patterns.md.


Universal Work Unit

Every task — bug fix, feature, refactor, infra change — follows the same phases:

  1. CLARIFY — define the problem, success criteria, scope boundaries.
  2. PLAN — write the change description, files to touch, test plan, risks.
  3. GATE/grill-me for ambiguity, otherwise teammate plan-approval.
  4. IMPLEMENT — implementer writes code and tests.
  5. VERIFY — verifier evaluates against checklist; never modifies impl code.
  6. ITERATE — workflow primitive passes verifier feedback back to implementer (max 3 rounds for feature-pr / general-feature-pr).
  7. OUTPUT — implementer raises PR via /raise and monitors via /monitor-pr.

Implementer and verifier are always separate agents. The implementer never verifies its own work; the verifier never writes implementation code (the "Escalating Fixes Outside the Loop" path lets verifiers spawn a transient implementer subagent for post-workflow tactical fixes — see .claude/agents/code-verifier.md).


Code Quality Checklist (cross-language)

  • Fail fast. No silent failures or fallbacks. Raise/throw early at boundaries when validation fails.
  • Model everything. Pydantic models in Python; TypeScript interfaces in TS. No raw dicts or any between functions.
  • Dependency injection. Constructor injection in Python (__init__), Depends() in FastAPI routes. No globals reached from inside business logic.
  • Semantically meaningful errors. Domain error types (RecordNotFoundError, ValidationError); never catch bare Exception unless re-raising.
  • No premature abstraction. Three similar lines is fine. Half-finished abstraction layers are worse than copy-paste.
  • No backward-compat aliases on rename. Update call sites; don't leave old_name = new_name.
  • No comments that restate code. Only add comments for the non-obvious WHY.
  • Stage-appropriate computation (indexing pipeline). For any computed/derived field in the Shopify indexing pipeline, confirm its inputs exist at the stage it is computed. Variant metafields, per-location inventory, and market/country data are enrichment-deferred in the bulk path — absent when transform_product_result runs — so a field depending on them computed in the transformer silently sees nothing and falls back. See Indexing Pipeline Stages. Check the test models the enrichment-deferred shape, not inline-metafield fixtures.

Test Coverage Rules

A change is not done until:

  1. Typecheck passes.
  2. The full test suite passes.
  3. Tests cover the new behavior — new branch points, new error paths, new public functions each get a test that would have failed before the change. "Build green" is not "feature done" — pre-existing tests passing only confirms you didn't break the old behavior, not that you implemented the new one.
  4. Test bodies are clean: setup in fixtures, assertions are explicit, no incidental coupling to implementation details.

For shared services, repositories, or libraries used by multiple components, re-run the consumer test suites to confirm no downstream regression.


Severity Rubric

Use the project's severity rubric for review findings (see docs/dev/orchestration/review-severity-rubric.md):

PriorityDefinitionAction
P0Must fix before merge — broken functionality, data loss, security exposureBlock. Implementer must fix.
P1Serious correctness/reliability issue — should fix unless human approves deferBlock. Implementer must fix.
P2Important quality issue — fix recommended, or capture as explicit follow-upRequest changes.
P3Improvement suggestion — naming, readability, minor refactorsSuggest. Don't block.

Verifier output is structured {pass: boolean, feedback: string}. Feedback is what gets passed to the implementer on the next iteration; include only P0 and P1 items in feedback. P2/P3 are noted but don't drive iteration.


Universal Gates (REPEATED FROM CLAUDE.md)

The root CLAUDE.md lists the actions that ALWAYS require explicit human confirmation regardless of skill / team / mode (Auto Mode does NOT relax these). The most important ones for general dev:

  • PR merge to main / master — humans merge; teammates do not.
  • Force push — never without explicit human ask.
  • Production writes — DDB writes, settings APIs, infra teardowns require human OK BEFORE the call.
  • Destructive gitgit reset --hard, branch deletion, history rewrites.
  • Third-party API writes outside dev cells.

When in doubt, surface the action + diff to the lead via SendMessage and wait.


Workspace Convention

For general dev workflows driven by the general-feature-pr skill, the workspace arg still points at a path under tmp/integrations/<slug>/ even when the slug is not a merchant — the hooks (task-completed.sh, teammate-idle.sh) currently scan only that directory. A future PR may generalize the hooks to tmp/work/<slug>/; until then, use the existing layout. See .claude/skills/general-feature-pr/SKILL.md for the per-skill spawn pattern.


Tooling Reference

Run language-specific build/test commands per the component command registry. Key short-forms:

StackTestFormat
Pythonpants test //components/<name>::ruff format components/<name>/
TypeScriptcd components/<name> && npm testcd components/<name> && npm run format

Reference the root CLAUDE.md for the full language guidelines, including project-specific conventions on error handling, data modeling, and DI patterns.