Judge-based reverse image search — design

Status: Approved (design phase)
Date: 2026-06-01
Workers touched: search_proxy (ecom worker), agentic_search (agentic worker), plus the admin/config pipeline (ecom_utils, ecom_settings_exporter, admin_lambda, admin_worker).

1. Motivation

The current LLM-assisted image search (search_proxy/src/llm-assisted-image-search/) has two structural problems:

One scheme is not universally optimal. A single prompt + schema + transform turns the image into one assisted query. Different images are served better by different schemes (brand-filter vs. colorway text vs. plain image), and which is best is not knowable a priori.
Choosing assisted-vs-native by a hit-count threshold is naïve. The current decide() swaps to the assisted branch when it returns ≥3 hits — a heuristic that picks badly for many images.

The image-search-tester POC (ishaaq-pocs/src/pocs/image-search-tester) demonstrates a better general approach: run many candidate-generation schemes in parallel, pool their hits, and let a visual-similarity LLM judge rank the pooled candidates against the query image. This design productionises that approach as a new, separate, opt-in endpoint, leaving the existing /search image path untouched.

We replicate the POC's general approach, not its KC-specific schemes. Which schemes an index uses is admin-configured per index, because different indexes will always want different prompt schemes.

Non-goals / explicit constraints

No backward incompatibility. The existing /search assisted branch and llm_image_search_config are untouched.
A new endpoint, so the unavoidable judge latency does not regress existing /search callers.
Hard cap of 10 results on the new endpoint, to bound judge load.
No caching in v1.

2. Architecture (responsibility split)

Orchestration lives in search_proxy; every Gemini/image-processing call lives in the agentic worker. This mirrors the existing /search → runImagePromptForSearch split and keeps deterministic logic where it is unit-testable.

Stage	Worker	Notes
Auth + settings + config resolve	`search_proxy` (route + new orchestrator module)	Reads `reverse_image_search_config` from KV. Absent/disabled → 404.
Baseline image search	`search_proxy`	Implicit, always-on native tensor image search. Needs no LLM; fired immediately.
Describe calls (1 per LLM-bearing scheme)	agentic worker — reuse `runImagePromptForSearch`	Run in parallel; returns structured fields and the 384px-processed image base64. The orchestrator keeps that processed base64 from a successful describe and feeds it to the judge as the query image, which the judge uses verbatim (no re-fetch, no redundant decode/resize). Because the judge only runs when a scheme contributed (§6 baseline-only skip), a successful describe — and thus this processed image — always exists on the judge path. No change to the RPC.
Build + run N Marqo scheme searches	`search_proxy`	One body per scheme via the transform kinds.
Pool → dedup → cap	`search_proxy` (pure code)	Dedup by `_id`, fair interleave, truncate to `max_pool`.
Judge (K shuffled passes)	agentic worker — new `runImageJudge` RPC	Fetches candidate thumbnails by URL, normalises to 384px, K parallel Gemini passes, returns K raw rankings.
RRF fusion + final order	`search_proxy` (pure code)	Deterministic; RRF determines order, maps `cid → hit`, attaches provenance, top-`limit`. Run-agreement (RBO) is a reported diagnostic only — it does not affect ordering.
Response shaping	`search_proxy`	Marqo-style `{hits}`; judge metadata only under `x-marqo-debug`.

Agentic worker stays minimal — {fetch, normalise, shuffle, K×Gemini, parse}. Logic moves into the agentic worker only if it reduces code or improves latency. Thumbnail fetch + normalisation qualify on both counts (reuse fetchImageAsBase64/processImageBytesForGemini; avoid shipping base64 over the RPC boundary), so the RPC takes candidate URLs, not bytes.

3. Request lifecycle

route
  → resolveReverseImageConfig(KV)           # 404 if absent/disabled; 404 + loud error log if malformed
  → fire in parallel:
       baseline image search (no LLM)
       describe RPC per LLM-bearing scheme
  → as each describe resolves: build + run that scheme's Marqo search
  → barrier: all scheme searches settle (failures drop their scheme)
  → pool → dedup by _id → fair interleave → cap at max_pool
  → if every Marqo search failed (including baseline), so the pool is empty: error
  → if pooled candidates ≤ 1: skip judge, return as-is
  → if no non-baseline scheme contributed any hit (baseline-only pool, e.g. every describe failed
    or every assisted search was empty): skip judge, return the native image-search order as-is
  → else: runImageJudge RPC (K passes)
       → judge ok:   RRF fuse → top-limit
       → judge fail: degrade to deterministic pool order → top-limit
  → shape response ({hits}; +debug block under x-marqo-debug)

Concurrency: the baseline never waits on an LLM. Each image_filter/text_query scheme blocks only on its own describe, not another scheme's. Critical path ≈ max(describe) + scheme_search + judge.
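
The concurrency rule above can be sketched roughly as follows. This is a minimal illustration, not the orchestrator's actual code: the `Deps` interface and the `settle` and `runSchemes` names are hypothetical stand-ins for the real baseline search, describe RPC, and per-scheme Marqo search.

```typescript
// Hypothetical stand-ins for the real baseline search, describe RPC, and scheme search.
type Hit = { _id: string };
type SchemeResult = { schemeId: string; hits: Hit[] };

interface Deps {
  searchBaseline: () => Promise<Hit[]>;
  describe: (schemeId: string) => Promise<Record<string, unknown>>;
  searchScheme: (schemeId: string, described: Record<string, unknown>) => Promise<Hit[]>;
}

// Convert a rejection into null so one failure never rejects the barrier.
function settle<T>(p: Promise<T>): Promise<T | null> {
  return p.then((value) => value, () => null);
}

async function runSchemes(schemeIds: string[], deps: Deps): Promise<SchemeResult[]> {
  // The baseline needs no LLM, so it starts immediately.
  const baseline = deps.searchBaseline().then((hits) => ({ schemeId: "plain_image", hits }));
  // Each scheme search chains only on its own describe, never on another scheme's.
  const schemes = schemeIds.map((schemeId) =>
    deps
      .describe(schemeId)
      .then((described) => deps.searchScheme(schemeId, described))
      .then((hits) => ({ schemeId, hits })),
  );
  // Barrier: wait for every branch to settle; failed branches drop out.
  const settled = await Promise.all([baseline, ...schemes].map((p) => settle(p)));
  const ok = settled.filter((r): r is SchemeResult => r !== null);
  if (ok.length === 0) throw new Error("all Marqo searches failed");
  return ok;
}
```

Because every branch is started before anything is awaited, the wall time of this stage is bounded by the slowest describe-plus-search chain rather than by the sum of the chains.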

4. Configuration data model

4.1 DDB record (one per index)

New record type alongside (not replacing) LlmImageSearchConfigRecord.

pk = system_account_id
sk = INDEX#<index_name>#REVERSE_IMAGE_SEARCH_CONFIG
Runtime fields:
- enabled: bool
- schemes: list[Scheme] — ordered; required & non-empty when enabled
- judge: JudgeConfig
- pool: PoolConfig
- timeouts: TimeoutOverrides | None
Audit fields: created_at, updated_at, updated_by, change_reason (mirrors the existing config record).
Guarded by the existing validate_dynamodb_item_size (400KB). Realistic scheme counts × ~5KB prompts stay well under.

Scheme (admin-assigned identity, transform kind, declarative params):

Scheme = {
  id: str,                       # unique within config; reserved id "plain_image" is rejected
  transform_kind: "image_filter" | "text_query",
  prompt: str,
  response_schema: dict,         # Record<str, {type: "string"|"number"|"integer"|"boolean"}> — the agentic
                                 # worker's LLMResponseSchema shape. Enforced at WRITE time by the Phase 2
                                 # admin/DDB model (malformed shape → 422), with Phase 1's config-resolution
                                 # Zod as read-side defense-in-depth. The contract is hand-mirrored across two
                                 # runtimes (TS read-side, Python write-side) — keep the allowed types in sync.
  fields: list[{ name: str, quote: bool }],   # ordered; meaning depends on kind (§5). The {name, quote}
                                 # shape is enforced at WRITE time (Phase 2 model → 422), mirroring Phase 1's Zod.
  search_config: dict            # opaque Marqo search body; validated only as JSON object
}

JudgeConfig (per-index, code defaults when omitted):

JudgeConfig = { prompt?: str, k?: int }   # defaults: generic visual-similarity prompt, k=3
                                          # k is bounded 1..MAX_JUDGE_K (10)
                                          # the judge's Gemini model is a fixed code constant
                                          # (gemini-2.5-flash-lite), symmetric with the describe step — not configurable

PoolConfig (per-index, code defaults):

PoolConfig = { max_pool?: int }   # default: max_pool = 30; bounded 1..MAX_POOL_SIZE (50)

Per-scheme retrieval depth is derived from the request limit (= limit), not configured: multi-scheme breadth is the diversity mechanism and max_pool still caps the judged union.

Cost-knob bounds. k and max_pool directly drive runaway cost/latency — k is the number of parallel Gemini judge passes, max_pool the candidate count (each an image fetch + judge input). Each carries a hard upper bound (MAX_JUDGE_K=10, MAX_POOL_SIZE=50). A config exceeding any bound is rejected as malformed (warn + dormant 404), not silently clamped — consistent with the fail-loud posture of the rest of the config path, so a fat-fingered admin value (e.g. k: 1000) surfaces rather than quietly driving cost. (Phase 2's admin model and Phase 3's UI should mirror these bounds to reject at write time too.)

TimeoutOverrides (per-index, optional; otherwise code constants):

TimeoutOverrides = { describe_ms?, scheme_search_ms?, judge_ms?, overall_ms? }

All four are enforced by the orchestrator: describe_ms bounds each describe RPC, scheme_search_ms each Marqo search, judge_ms the judge RPC, and overall_ms (opt-in; no effect unless set) races the whole orchestration and throws on exceed.

The numeric fields above (JudgeConfig.k, PoolConfig.max_pool, all TimeoutOverrides.*_ms) are validated at WRITE time as positive integers (Phase 2 model → 422), mirroring Phase 1's Zod z.number().int().positive(). k/max_pool additionally enforce the cost-knob upper bounds above (z.number().int().positive().max(...)); the Python constants MAX_JUDGE_K/MAX_POOL_SIZE are hand-mirrored against config.ts and must stay identical. The write-side check tolerates the integral Decimals DynamoDB returns on read, so a stored record round-trips without a spurious 422.
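
As an illustration of the reject-rather-than-clamp rule, the bounds check could look like the sketch below. It uses plain TypeScript in place of the Zod schema so that it stays self-contained; `isBoundedPositiveInt` and `resolveCostKnobs` are hypothetical names, while the constants and defaults are the ones stated above.

```typescript
const MAX_JUDGE_K = 10;
const MAX_POOL_SIZE = 50;
const DEFAULT_K = 3;
const DEFAULT_MAX_POOL = 30;

// Positive integer with an upper cap; anything else is malformed.
function isBoundedPositiveInt(value: unknown, max: number): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0 && value <= max;
}

type CostKnobs = { k: number; max_pool: number };

// Returns the resolved knobs, or null when the config is malformed.
// The caller is expected to log loudly and answer 404; nothing is clamped.
function resolveCostKnobs(
  judge: { k?: unknown },
  pool: { max_pool?: unknown },
): CostKnobs | null {
  const k = judge.k === undefined ? DEFAULT_K : judge.k;
  const maxPool = pool.max_pool === undefined ? DEFAULT_MAX_POOL : pool.max_pool;
  if (!isBoundedPositiveInt(k, MAX_JUDGE_K)) return null;
  if (!isBoundedPositiveInt(maxPool, MAX_POOL_SIZE)) return null;
  return { k, max_pool: maxPool };
}
```

With this shape, `resolveCostKnobs({ k: 1000 }, {})` yields `null` instead of quietly running ten passes, which is the fail-loud behaviour the section describes.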

4.2 KV shape

Key: <system_account_id>-<index_name>#reverse_image_search_config
Value: runtime fields only — { enabled, schemes, judge, pool, timeouts } (audit metadata stays in DDB).
search_proxy parses it with a Zod schema. Malformed → treated as not-available (404 to customer) + loud error log.
Cross-phase contract — scheme response_schema shape. Phase 1's Zod requires each scheme's response_schema to be Record<str, { type: "string"|"number"|"integer"|"boolean", … }> (the agentic worker's LLMResponseSchema): every field maps to an object carrying a valid type. Extra keys on that type-object are tolerated — the read-side uses z.object({ type }).passthrough() and admin_lambda's FieldSchema uses extra="allow", so {brand: {type: "string", description: "…"}} is accepted by both. What is rejected is a missing/invalid type or a non-object value (e.g. {brand: "string"}), which parses in DDB but fails search_proxy's Zod, so the whole config resolves to null and the endpoint stays dormant (404). Phase 2's exporter and Phase 3's admin UI must enforce this same requirement (valid type per field; extra keys allowed). This is the highest-risk seam to keep aligned.
The same seam covers fields, judge, pool, and timeouts. Phase 1's Zod is strict on all of them (fields[] = {name: non-empty str, quote: bool}; the numeric keys are positive integers). Any value the write-side accepts but the read-side rejects silently leaves the endpoint dormant, so Phase 2's model validates these at WRITE time (→ 422) too. The write-side validators are hand-mirrored against search_proxy's Zod and must be kept in sync. search_config is the only sub-object intentionally opaque on both sides (z.record(str, unknown)).
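
To make the response_schema contract concrete, here is a dependency-free sketch of the shape check. The real read-side uses Zod, so this is only an approximation of the rule "every field maps to an object carrying a valid type, extra keys tolerated"; `isValidResponseSchema` is a hypothetical name.

```typescript
const ALLOWED_TYPES = ["string", "number", "integer", "boolean"];

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True when every field maps to an object with a valid `type`.
// Extra keys next to `type` (e.g. `description`) are ignored, not rejected.
function isValidResponseSchema(schema: unknown): boolean {
  if (!isPlainObject(schema)) return false;
  return Object.keys(schema).every((key) => {
    const field = schema[key];
    if (!isPlainObject(field)) return false;
    const type = field["type"];
    return typeof type === "string" && ALLOWED_TYPES.indexOf(type) !== -1;
  });
}
```

Under this check `{brand: {type: "string", description: "…"}}` passes, while `{brand: "string"}` and `{brand: {type: "text"}}` fail, matching the accepted and rejected cases listed above.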

4.3 Exporter changes (`ecom_settings_exporter`)

Add an is_reverse_image_search_config_sk branch in both export_for_records and export_from_records (INSERT/MODIFY/REMOVE), mirroring the existing llm_image_search_config handling: build the KV key + runtime-only value on upsert, bulk_delete on REMOVE.

5. Transform kinds

The orchestrator owns q, limit, and filter; a scheme's search_config is spread first, then q and limit are force-set so admin config can never clobber them. filter is merged, not overwritten: a base filter present in search_config is preserved and ANDed with the scheme's derived filter — (<base>) AND <derived> — so an admin's base scope (in-stock, collection) is honoured. A text_query scheme (no derived filter) keeps the base filter as-is. (The POC force-set filter; merging is a deliberate improvement over it so a configured base scope isn't silently dropped.)
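
The spread-then-force-set rule and the filter merge can be sketched as below. `mergeFilter` and `buildSearchBody` are hypothetical names, and the sketch assumes the derived filter has already been escaped by the shared filter primitives.

```typescript
// Combine an admin base filter with a scheme-derived filter as (<base>) AND <derived>.
function mergeFilter(base?: string, derived?: string): string | undefined {
  const b = base === undefined ? "" : base.trim();
  const d = derived === undefined ? "" : derived.trim();
  if (b !== "" && d !== "") return `(${b}) AND ${d}`;
  if (b !== "") return b;
  if (d !== "") return d;
  return undefined;
}

// search_config is spread first; q and limit are then force-set so admin
// config can never clobber them. filter is merged rather than overwritten.
function buildSearchBody(
  searchConfig: Record<string, unknown>,
  q: string,
  limit: number,
  derivedFilter?: string,
): Record<string, unknown> {
  const rawBase = searchConfig["filter"];
  const base = typeof rawBase === "string" ? rawBase : undefined;
  const body: Record<string, unknown> = { ...searchConfig, q, limit };
  const filter = mergeFilter(base, derivedFilter);
  if (filter !== undefined) body["filter"] = filter;
  return body;
}
```

A text_query scheme would call `buildSearchBody` without a derived filter, in which case the base filter passes through unchanged.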

plain_image (implicit baseline, not in the schemes array). Native tensor image search with q = the original full-res image. Always runs. Reserved id plain_image. Guarantees a non-empty pool unless Marqo is down, and a genuine native-image fallback when the judge degrades.
image_filter. Describe → AND each non-empty fields value into a Marqo filter=<field>:(<value>) on an image search (q = original image). Zero non-empty fields → skip the scheme (no per-scheme fallback; the baseline + judge absorb it). Filter primitives (escaping, paren-wrapping) are extracted from the existing llm-assisted-image-search/transforms.ts into a shared util and reused.
text_query. Describe → emit join(" ", [quoteIf(f.quote, llmResult[f.name]) for f in fields if non-empty]), then word-cap; run a lexical/hybrid text search (q = built text). No kind-branching: an admin wanting model-vs-description behaviour configures two schemes and lets the judge choose.
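
A possible shape for the text_query builder is sketched below. Several details are assumptions made for illustration only: the `maxWords` parameter (the word-cap value is not specified here), the choice to stop at a field boundary so that a quoted phrase is never cut in half, and the stripping of embedded double quotes. `buildTextQuery` is a hypothetical name.

```typescript
type Field = { name: string; quote: boolean };

// Builds the text query from the describe output, in field order.
// Returns null when no field is non-empty, so the caller can skip the scheme.
function buildTextQuery(
  fields: Field[],
  llmResult: Record<string, unknown>,
  maxWords: number,
): string | null {
  const terms: string[] = [];
  let words = 0;
  for (const field of fields) {
    const raw = llmResult[field.name];
    const value = raw === undefined || raw === null ? "" : String(raw).trim();
    if (value === "") continue; // empty fields contribute nothing
    const count = value.split(/\s+/).length;
    // Assumed quote-safe cap: stop at a field boundary instead of truncating mid-phrase.
    if (words + count > maxWords) break;
    terms.push(field.quote ? `"${value.replace(/"/g, "")}"` : value);
    words += count;
  }
  return terms.length > 0 ? terms.join(" ") : null;
}
```

For example, fields `brand` (quoted) and `color` (unquoted) with a describe output of `{brand: "Acme Co", color: "red"}` would produce the query `"Acme Co" red`.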

Marqo searches always use the original full-res image; the Gemini describe + judge calls use the 384px-normalised image (mirrors the POC and production processImageBytesForGemini).

6. Orchestrator (`search_proxy`)

New module (e.g. src/reverse-image-search/), separate from llm-assisted-image-search/.

Pooling: dedup by _id (first occurrence wins; record every scheme that surfaced a hit for provenance).
Fair interleave: round-robin by rank across schemes — each scheme's rank-1, then rank-2, … — with the baseline first in every round, so the native-image top hits are always represented. Truncate at max_pool.
Thumbnail resolution: variantImageUrl (code constant for ecom indexes). Candidates without it are dropped from the results entirely (never judged), surfaced in _debug.dropped.unjudgeable.
Judge-skip (≤1 candidate): if the deduped/capped pool has ≤1 candidate, skip the judge and return as-is.
Judge-skip (baseline-only pool): the judge's role is to arbitrate across the scheme-retrieved candidates; if no non-baseline scheme surfaced any hit (every describe failed, or every assisted search came back empty), there is nothing to arbitrate — return the baseline's native tensor image-search order as-is (decision: "skipped_baseline_only"). This is the "describe layer down → fall back to plain image search" degradation. A useful consequence: the judge therefore only ever runs after a scheme contributed, which means a describe succeeded, which means the orchestrator already holds that describe's 384px-processed image — so the judge's query image is always the pre-processed one and is passed through verbatim (no re-fetch, no redundant decode/resize).
Fusion (pure code, lifted from POC): RRF over the K orderings (score = Σ 1/(rrf_k0 + rank), POC rrf_k0 = 60 — distinct from JudgeConfig.k, the judge-pass count), RBO run-agreement (reported diagnostic only), mean per-run score for display, map cid → hit. Top-limit.
Scheme-skip visibility: a scheme that yields zero non-empty fields is skipped (likely a fields[].name typo vs. the keys response_schema actually produces — these are not cross-validated, by design). Emit a structured warn log (scheme id, index) so silent collapse toward baseline-only is observable.
Fail-fast on empty pool: if every Marqo search (including baseline) failed → error (consistent with /search's throw-on-native-failure).
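
The pooling, dedup, and fair-interleave rules above could be implemented along these lines. This is a sketch under the stated rules only; `poolCandidates` and the `Pooled` shape are hypothetical, and thumbnail resolution is left out.

```typescript
type Hit = { _id: string };
type SchemeHits = { schemeId: string; hits: Hit[] };
type Pooled = { hit: Hit; schemes: string[] };

const BASELINE_ID = "plain_image";

function poolCandidates(results: SchemeHits[], maxPool: number): Pooled[] {
  // Baseline goes first in every round so native top hits are always represented.
  const ordered = results
    .filter((r) => r.schemeId === BASELINE_ID)
    .concat(results.filter((r) => r.schemeId !== BASELINE_ID));
  const byId = new Map<string, Pooled>();
  const pool: Pooled[] = [];
  const depth = Math.max(0, ...ordered.map((r) => r.hits.length));
  // Round-robin by rank: every scheme's rank-1, then every scheme's rank-2, and so on.
  for (let rank = 0; rank < depth; rank++) {
    for (const result of ordered) {
      const hit = result.hits[rank];
      if (hit === undefined) continue;
      const seen = byId.get(hit._id);
      if (seen !== undefined) {
        // Duplicate: first occurrence keeps its slot, provenance is extended.
        if (seen.schemes.indexOf(result.schemeId) === -1) seen.schemes.push(result.schemeId);
        continue;
      }
      const entry: Pooled = { hit, schemes: [result.schemeId] };
      byId.set(hit._id, entry);
      pool.push(entry);
    }
  }
  return pool.slice(0, maxPool);
}
```

Provenance is collected across the whole interleave before truncation, so a retained candidate still lists every scheme that surfaced it even if the later occurrence fell past `max_pool`.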

7. Judge RPC (`agentic_search`)

New WorkerEntrypoint method, declared in search_proxy/src/env.ts's AgenticSearchWorkerRPC and implemented in agentic_search/src/index.ts + a focused run-image-judge/ module.

runImageJudge(
  queryImage: string,                       # the 384px Gemini-ready data URI from a successful describe; used verbatim. A non-data-URI throws (fail-fast) — the only caller always supplies the processed data URI
  requestId: string,
  candidates: Array<{ cid: string; imageUrl: string }>,
  judgeConfig: { prompt: string; k: number },   # the judge's Gemini model is a fixed code constant, not passed in
) → Response   # { rankings: Array<{ ordering, scores, notes, queryId }>, unfetchable: string[] }

Implementation = {fetch + normalise candidate thumbnails to 384px, use the pre-processed query data URI verbatim (a non-data-URI query throws — fail-fast, since the orchestrator always supplies the processed data URI), K parallel shuffled passes, parse}. Per-pass schema is the POC's rankingSchema(candidateIds) (id enum-constrained to the candidate set). Un-fetchable candidates are dropped and reported. No fusion in the agentic worker — it returns the K raw rankings. The request/response shapes are a shared type referenced by both sides (see §13).
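
Since the RPC returns raw rankings, fusion happens on the search_proxy side (§6). A minimal sketch of that RRF step follows, assuming ranks are 1-based and that ties are broken by candidate id for determinism (both are assumptions of this sketch, not confirmed details of the POC). `rrfFuse` is a hypothetical name.

```typescript
// Reciprocal rank fusion: score(cid) = sum over passes of 1 / (rrfK0 + rank).
// rrfK0 is the RRF smoothing constant, distinct from JudgeConfig.k.
function rrfFuse(orderings: string[][], rrfK0 = 60): string[] {
  const scores = new Map<string, number>();
  for (const ordering of orderings) {
    ordering.forEach((cid, index) => {
      const rank = index + 1; // assumed 1-based
      const previous = scores.get(cid);
      scores.set(cid, (previous === undefined ? 0 : previous) + 1 / (rrfK0 + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => {
      if (b[1] !== a[1]) return b[1] - a[1]; // higher fused score first
      return a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0; // assumed tie-break by cid
    })
    .map((entry) => entry[0]);
}
```

A candidate missing from a pass simply earns nothing for that pass, so a failed or partial pass lowers its candidates' scores without breaking the fusion.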

8. Endpoint contract (`search_proxy`)

Route: POST /api/v1/indexes/:index/reverse-image-search, behind directAuthAndSettingsReadOnlyMiddleware (same auth/settings as /search).
Request: { q: <image URL or base64 data URI>, limit?: number, filter?: string, attributesToRetrieve?: string[] }. q accepts both forms exactly like Marqo's native image search (isImageQuery recognises both; the agentic worker handles both). limit clamped to [1, 10] (numeric strings from untrusted JSON are coerced). filter (if present, must be a string — else 400) is ANDed into every search the orchestrator issues — baseline and each scheme — so a caller can scope a single request and the skipped_baseline_only result stays in scope. attributesToRetrieve (if present, must be an array of strings — else 400) projects the returned hits; Marqo response metadata (_id, _score, …) is always retained.
Field projection / thumbnail preservation: the customer-facing projection is enforced in two independent layers, and the internal pooling searches must defeat both so the judge can always read the thumbnail:
1. Client-side — performMarqoSearch strips fields inline on every call (convertDocumentsForEcomOutput → removeInternalFields, applying allowed_fields/omitted_fields), not as a separate outward step. The orchestrator retrieves the raw document by passing performMarqoSearch's field-retention flag (debugMode's only effect is to skip this strip).
2. Marqo-side — buildSearchRequestBody/defaultBodyValues propagates settings.search_config.attributesToRetrieve (or a scheme's) and Marqo enforces it server-side, a layer the client-side flag cannot reach. So the orchestrator force-adds THUMBNAIL_FIELD (variantImageUrl) into outBody.attributesToRetrieve when a projection is set (an unset list returns everything), mirroring how defaultBodyValues force-adds parentProductId. The customer projection is then reapplied once, on the way out, via the shared projectSearchHits helper in search.ts: allowed_fields/omitted_fields (skipped when the request is in debug mode) and then the request's attributesToRetrieve, always keeping Marqo response metadata. This is why an index whose allowed_fields or search_config.attributesToRetrieve omits the thumbnail still judges correctly while never surfacing a field the caller didn't ask for.
Response (non-debug): { hits: [...], limit, processingTimeMs }. hits are native Marqo documents in judge-ranked order (or the native baseline / deterministic pool order in the documented skip and degrade cases — §6, §9) — shape-compatible with /search so existing client rendering works unchanged. No q echo, no Marqo passthrough envelope. (processingTimeMs here is the proxy's end-to-end wall time, not Marqo's internal time as on /search.)
Response (debug, x-marqo-debug only): a single top-level _debug block (no per-hit detail in v1): { decision, pool_size, run_agreement, per_scheme: [{id, describe_output?, built_query_or_filter?, hit_count, error?}], dropped: {failed_schemes, unjudgeable, unfetchable} }. decision is "judged" | "degraded_pool_order" | "skipped_single_candidate" | "skipped_baseline_only". dropped.unjudgeable are pooled candidates with no thumbnail and dropped.unfetchable are candidates the judge could not fetch — both are excluded from hits (a visual-similarity result only returns candidates the judge actually assessed), not appended unranked.
Edge behaviour:
- q not an image → 400 (ClientError).
- filter present but not a string, or attributesToRetrieve present but not an array of strings → 400 (fail-fast, before config resolution).
- config absent or enabled: false → 404 ("reverse image search not enabled for this index").
- config present but malformed → 404 to the customer + loud error log (fail-fast in observability, graceful in customer response).
Latency envelope: this endpoint deliberately trades latency for quality; clients should expect a worst case of ~20–25s and set HTTP timeouts accordingly. The endpoint docs publish this.
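
The request-side rules above (limit clamping with string coercion, and the 400 checks on filter and attributesToRetrieve) might be expressed as in this sketch. `clampLimit` and `validateRequest` are hypothetical names, and the fallback used when limit is absent or unparseable is passed in explicitly because the default is not stated here.

```typescript
const MAX_LIMIT = 10;

// Clamp to [1, MAX_LIMIT]; numeric strings from untrusted JSON are coerced.
function clampLimit(raw: unknown, fallback: number): number {
  const n = typeof raw === "string" && raw.trim() !== "" ? Number(raw) : raw;
  if (typeof n !== "number" || !Number.isFinite(n)) return fallback;
  return Math.min(MAX_LIMIT, Math.max(1, Math.trunc(n)));
}

// Returns an error message for a 400 response, or null when the body is acceptable.
function validateRequest(body: { filter?: unknown; attributesToRetrieve?: unknown }): string | null {
  if (body.filter !== undefined && typeof body.filter !== "string") {
    return "filter must be a string";
  }
  const attrs = body.attributesToRetrieve;
  if (attrs !== undefined) {
    const ok = Array.isArray(attrs) && attrs.every((a) => typeof a === "string");
    if (!ok) return "attributesToRetrieve must be an array of strings";
  }
  return null;
}
```

Running `validateRequest` before config resolution keeps the 400 cases fail-fast, as the edge-behaviour list requires.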

9. Failure & fallback (Option 3)

A failed/timed-out describe drops only its scheme(s).
A slow scheme search drops out rather than holding the barrier.
A failed/timed-out judge → degrade to deterministic pool order (decision: "degraded_pool_order").
A baseline-only pool (no non-baseline scheme contributed any hit) → skip the judge, return the native image-search order (decision: "skipped_baseline_only", §6). The baseline inherits the index-level settings.search_config.filter (via setFilter, like every search) and the request filter, so the returned items stay in scope. It does not inherit a per-scheme reverse-image search_config.filter (that scope is scheme-specific by design); global scope belongs in settings.search_config.filter or the request filter.
Empty candidate pool (all Marqo searches failed) → error (not an empty 200), consistent with the existing fail-fast /search behaviour.
Timeouts are code constants, overridable per-index via TimeoutOverrides. Defaults: each describe RPC is bounded by an orchestrator-level describe_ms (default 15s — above the agentic worker's own IMAGE_FETCH_TIMEOUT_MS=1500 + LLM_TIMEOUT_MS=10000 internal budget, so it only fires on a transport stall); scheme_search_ms 5s; judge_ms 15s. overall_ms is opt-in (no default) — when set it races the whole orchestration and throws on exceed. Worst case ~20–25s with graceful degradation at every stage.
Upstream/topology: search_proxy is a Cloudflare custom-domain edge worker with no gateway in front (the AWS WAF rate rules guard the Monolith, not this worker), so the only constraints on overall request duration are the Cloudflare Worker wall-clock limit and the client's own timeout. Verify at build time that the CF Worker wall-clock limit comfortably exceeds the overall ceiling.
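
The per-stage timeouts and the judge degrade path can be illustrated with a small wrapper. This is a sketch only: `withTimeout` and `judgeOrDegrade` are hypothetical names, and the judge call is represented by a function returning candidate ids.

```typescript
// Rejects if `work` has not settled within `ms`; always clears the timer.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

type JudgeOutcome = { decision: "judged" | "degraded_pool_order"; order: string[] };

// A failed or timed-out judge degrades to the deterministic pool order.
async function judgeOrDegrade(
  judge: () => Promise<string[]>,
  poolOrder: string[],
  judgeMs: number,
): Promise<JudgeOutcome> {
  try {
    const order = await withTimeout(judge(), judgeMs, "judge");
    return { decision: "judged", order };
  } catch {
    return { decision: "degraded_pool_order", order: poolOrder };
  }
}
```

The same wrapper would apply to describe_ms and scheme_search_ms, with the difference that a timeout there drops the scheme instead of degrading the whole response.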

9.1 Observability

No new metrics infrastructure. The endpoint emits the standard per-request MetricEvent via the existing enqueueRequestMetric (giving endpoint, accountId/indexName, status, and overall latency for free, like every other route). All richer per-request detail (decision, pool_size, per-scheme, judge-pass meta) lives in the x-marqo-debug block (§8) — the established house pattern for exposing internals. Operational anomalies (scheme skips, malformed config, judge degradation) surface as structured warn logs (mirroring configs.ts's warn-on-invalid-KV precedent), not bespoke metrics.

10. Caching

None in v1. Base64 uploads are effectively unique per request (no cache hits possible); the multi-Gemini cost is the consciously-accepted price of a separate opt-in endpoint. Revisit a targeted URL-keyed cache later only if repeat-URL traffic proves material.

11. Admin API (`admin_lambda`)

CRUD mirroring llm_image_search_config_routes.py:

GET/PUT/DELETE /{system_account_id}/indexes/{index_name}/reverse-image-search-config
New request/response models, repository, and DI wiring.
Import/export roundtrip updated in the same change (per admin_lambda/routes/CLAUDE.md's paired-contract rule): export in /export/all, apply in /import/all, and update the roundtrip test in components/shopify/e2e_tests/.../ecom_import_export_test.py.

12. Admin UI (`admin_worker`)

In v1. A sibling app/components/tabs/reverse-image-search/ following the existing llm-image-search/ pattern (Section + Edit-modal + hook + api.ts + types.ts), wired into ConfigurationTab.tsx.

Structured inputs for id, transform_kind (dropdown), prompt (textarea), fields ({name, quote} rows).
Raw JSON textareas for response_schema and search_config, validated as JSON. Unlike the LLM-assisted modal, response_schema is not permissive here: Phase 1's KV reader requires every field to map to an object carrying a valid type ("string"|"number"|"integer"|"boolean"), so the UI must enforce that (hard error, not soft warning) — a saved config that doesn't match fails resolution and leaves the endpoint dormant (§4.2). Extra keys alongside type are allowed (the read-side uses .passthrough(); admin_lambda uses extra="allow"), so the UI must not reject them — rejecting a config both runtimes accept would block legitimate saves.
Judge/pool block: prompt textarea + numeric inputs for k and max_pool, optional timeout overrides. (The judge model is a fixed code constant and per-scheme retrieval depth is derived from the request limit — neither is operator-facing.)
Schemes are an add/remove/reorder list; the implicit plain_image baseline is shown as a non-editable row.

13. Testing

search_proxy (Vitest): transform kinds (filter build incl. base-filter merge, text build, quote-safe word cap, skip-on-empty); pooling/dedup/fair-interleave/cap incl. synthetic keys for _id-less hits; RRF + RBO fusion incl. empty input and failed-pass orderings; judge-skip (≤1 and baseline-only, incl. describe-failure, empty-scheme-search, and no-usable-fields); degrade-on-judge-failure; per-step describe_ms timeout dropping a scheme; empty-pool error; per-scheme retrieval depth derived from the request limit (baseline + every scheme search run at limit === request.limit); KV config incl. malformed-response_schema rejection and out-of-bounds cost-knob rejection (k/max_pool above their caps); endpoint edge codes (400 for bad q/filter/attributesToRetrieve, 404); response shaping (debug vs non-debug); limit clamping + string coercion; opt-in overall_ms budget; drop-unassessed-from-results; request-filter AND into every search body; internal field-stripping disabled so the thumbnail survives allowed_fields; attributesToRetrieve projection of returned hits.
agentic_search (Vitest): runImageJudge — thumbnail fetch failures dropped/reported, K-pass parsing, schema enforcement, candidate normalisation, verbatim reuse of a pre-processed query data URI (no redundant decode), and throw-on-non-data-URI query.
ecom_utils / ecom_settings_exporter / admin_lambda (pants): DDB model validation (enabled requires schemes; reserved-id rejection; item-size guard); exporter INSERT/MODIFY/REMOVE → KV; admin CRUD; import/export roundtrip.
admin_worker: component tests for the schemes form (add/remove/reorder, JSON validation warning).
Cross-worker contract (no integration harness): the runImageJudge request/response shapes are a shared type referenced by both the env.ts RPC interface and the agentic implementation; the new endpoint's tests type the mocked binding against that signature (no as any), so shape drift fails at compile time. This replaces a net-new cross-worker integration harness in v1 (none exists today — the current convention mocks the binding via as any).

14. Scope

In v1: new endpoint + orchestrator (search_proxy); runImageJudge RPC (agentic_search); DDB model + repo (ecom_utils); exporter branch (ecom_settings_exporter); KV-read schema (search_proxy); admin CRUD + import/export roundtrip (admin_lambda); admin UI (admin_worker).

Deferred: caching; making max_pool-style knobs anything beyond the per-index config already specified; any non-ecom thumbnail-field generalisation (ecom standardises on variantImageUrl).
