Facet Rules Merchandising — Phased Implementation Plan
Companion to the Facet Rules Merchandising spec (Notion, owner: Joshua) and its "Ishaaq's Plan Divergences" section. This document sequences the build into independently-shippable phases. Where this plan diverges from Joshua's spec, it says so and points at the divergences table; everything else follows the spec as written.
0. Decisions this plan is built on
The spec's design is adopted almost wholesale. Three things were corrected against the live code, and three are genuine divergences recorded in Notion. Everything else (controller placement, single global list, live catalog discovery, client-facets-win carve-out, limits, MVP seed, 5-min KV cache, parallel KV fetch, audit fields) is taken from the spec unchanged.
Code-grounded corrections (not design changes)
| Spec wording | Reality in the repo | Consequence |
|---|---|---|
| Persistence as a generic "controller CRUD" | components/controller/merchandise/ is Django REST Framework — ViewSets + urls.py + authentication_classes/permission_classes, services built in ViewSet.__init__. | Mirror the DRF merchandise component (no FastAPI Depends()/injector). |
| Optimistic locking, unspecified mechanism | The record_version lock in synonym_service.py reads :cv server-side and never round-trips it through the client, so it only guards the in-request read→write window — not the load→edit→publish gap a real lost-update needs (merchandising_service's put_view doesn't lock at all). | Facets is last-write-wins (§4); keep only the create-path attribute_not_exists guard, drop the ineffective update-path version condition. The shared synonyms fix is deferred. |
| "Pattern reference #2639" | Confirmed: facet rows belong in MERCHANDISING_TABLE; merchandising_exporter already scans it and serialises into the per-{account}-{index} KV blob the proxy reads. | The single KV blob (Decision 6) is the export channel; the proxy reads it via the existing resolveProfileOverride machinery. |
The three divergences from Joshua's spec (see Notion divergences table)
- Draft → Publish is UI-only. Drafting lives in client state (the pendingView/currentView house pattern); nothing persists until an explicit Publish, which writes the live row. No backend draft/published separation.
- Proxy enforcement is request-side only. The proxy pushes field selection + pin order into the upstream request and Marqo returns facets already ordered (matching existing product-pin enforcement). No response rewrite. Depends on the in-progress Marqo facet field pin/order capability.
- The superset stores typing, and is customer-editable + extendable. Joshua's Decision 3 keeps the superset id-only and has the UI join the live catalog for type on every render. Instead the superset persists { fieldName, type } (type ∈ {text, numeric, array, boolean}). Seeding detects the type from a q='*'&limit=10 sample (Marqo string→text, number→numeric, array→array, boolean→boolean, internal fields dropped), but the merchant can override a mis-detected type and can add facet fields the discovery sample never surfaced (arbitrary field name + chosen type). The persisted type is the single source of truth that flows superset → exporter → proxy, so the type-aware facet request needs no re-resolution at request time.
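The seeding rules in divergence 3, combined with the detection order spelled out in the Phase 6 real-catalog step, can be sketched roughly as follows. This is an illustration under assumptions, not the controller's code: is_internal_field is a hypothetical stand-in for the existing is_base_field rule, and discover_fields plus the sample hits are invented for the example.

```python
from decimal import Decimal
from typing import Any, Optional


def is_internal_field(name: str) -> bool:
    """Hypothetical stand-in for the is_base_field rule: reject ^_ and dotted keys."""
    return name.startswith("_") or "." in name


def detect_type(value: Any) -> Optional[str]:
    """Map a sampled Python value to a superset type, or None to defer."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float, Decimal)):
        return "numeric"
    if isinstance(value, str):
        return "text"
    if isinstance(value, list):
        # An empty list says nothing yet; defer to a later sampled hit.
        return "array" if value else None
    return None


def discover_fields(hits: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Walk sampled hits and return seed entries; the first detected type wins."""
    detected: dict[str, str] = {}
    for hit in hits:
        for name, value in hit.items():
            if is_internal_field(name) or name in detected:
                continue
            field_type = detect_type(value)
            if field_type is not None:
                detected[name] = field_type
    return [{"fieldName": n, "type": t} for n, t in detected.items()]
```

The result only seeds the superset; a merchant override or a manually added field would simply replace or extend these entries.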
1. Data model (frozen for all phases)
Three rows per index in MERCHANDISING_TABLE, pk = "{systemAccountId}#INDEX#{indexName}" — identical keying to the ranking-rule rows. Each row carries audit fields (createdAt/By, updatedAt/By). Writes are last-write-wins (§4): the create path guards a lost first-write race with attribute_not_exists(pk), but the update path carries no client-round-tripped version lock. No draft/published fields — the live row is the published state; drafting is UI-only.
sk = "FACET_SUPERSET" # index-wide candidate pool
fields: [{ fieldName, type }] # type ∈ {text, numeric, array, boolean}; seeded from discovery, customer-editable (divergence 3)
sk = "GLOBAL_FACETS" # index-wide pin order (one list, not per-context — Decision 2)
enabled: bool # merchant on/off intent (D5); exporter translates to presence/absence — NOT exported as a flag, NOT read by proxy
onboarded: bool # UI onboarding-banner state
facets: [{ fieldName, pinned, position? }] # dense 1..N for pinned; absent for dynamic
sk = "FACET_OVERRIDE#{context}#{trigger}[#{profileId}]" # per (context, trigger); context ∈ {collection, search}
context, trigger # top-level attrs, mirroring MerchandiseViewRecord
facets: [{ fieldName, pinned, position? }]
- The superset is the typed candidate pool. Each entry is { fieldName, type } with type ∈ {text, numeric, array, boolean}. The type is seeded from the discovery sample but is merchant-editable (override a mis-detection), and the merchant may add fields the sample never surfaced (arbitrary fieldName + chosen type). The persisted type is the single source of truth: the exporter carries it into the KV blob and the proxy uses it to build the type-aware facet request (no re-resolution at request time).
- Both GLOBAL_FACETS and overrides validate their facets against the shared FACET_SUPERSET. The superset is the single candidate pool; nothing (global or trigger) may reference a fieldName outside it. Pin entries carry fieldName + pinned/position only and look up type from the superset. A trigger override independently picks/pins from the full superset, so it can surface a field the global list omits, but cannot introduce one the superset lacks. No eligibleFields.
- profileId is reserved in the override sk now (Decision/OQ-5) so A/B profile scoping drops in the house way later; default overrides omit it.
- Limits (Decision 8): superset ≤ 100; pinned per global / per override ≤ 100; overrides per (index, context) ≤ 100.
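The keying above can be illustrated with a pair of small builders. This is a minimal sketch: the function names index_pk and override_sk are hypothetical, and only the key formats themselves come from the data model.

```python
from typing import Optional

VALID_CONTEXTS = ("collection", "search")


def index_pk(system_account_id: str, index_name: str) -> str:
    """Partition key shared by all three facet rows of one index."""
    return f"{system_account_id}#INDEX#{index_name}"


def override_sk(context: str, trigger: str, profile_id: Optional[str] = None) -> str:
    """Sort key for a per-(context, trigger) override row.

    profile_id is the reserved optional suffix; default overrides omit it.
    """
    if context not in VALID_CONTEXTS:
        raise ValueError(f"context must be one of {VALID_CONTEXTS}, got {context!r}")
    sk = f"FACET_OVERRIDE#{context}#{trigger}"
    return f"{sk}#{profile_id}" if profile_id else sk
```

Because the superset and global rows use fixed sort keys, a single query on the partition key with a FACET_ prefix condition would return every facet row for the index.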
Two switches: entitlement vs runtime
- Account entitlement = the facet_merchandising flag (done in #3708): gates UI visibility + write authorization, fail-closed. Turned on operationally via the per-account exception list (the Rollout step). Effectively one-way: once an account is entitled and using the feature, we never revoke entitlement at runtime, because doing so would regress a live customer. So entitlement-off only removes new UI access; it never tears down exported config or running behaviour.
- Per-index runtime = GLOBAL_FACETS.enabled, the in-UI Settings toggle (D5). It's the merchant's on/off intent, persisted in DDB; edits still persist when off. It is not read by the proxy; it only controls whether the exporter emits config.
- The proxy gate is client opt-in + presence. Merchandised facets apply only when the request carries useDynamicFacets: true (a new boolean on the /search and /collections APIs) and facet config is present in the KV blob; otherwise pass through unchanged. useDynamicFacets is mutually exclusive with a client-supplied facets (passing both is a 422), and it's not forwarded to Marqo. Like the existing facets param, it requires a HYBRID search: a non-HYBRID request that opts in is rejected by Marqo exactly as facets is today (an SE-config responsibility, not new behaviour). The proxy reads neither entitlement nor enabled; the exporter translates the merchant's enabled toggle into blob presence/absence, and entitlement never reaches the blob. (Decision: opt-in supersedes the original "pure presence, auto-apply" model. Auto-applying facets to every request injects them onto wildcard/browse/TENSOR traffic the customer never asked to facet, which Marqo rejects; the explicit param scopes facet application to where the customer intends it.)
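The proxy gate described in the last bullet reduces to a small decision function. The sketch below is written in Python for consistency with the other examples, although the real proxy is TypeScript. The names resolve_facet_mode, FacetGateError and upstream_body are hypothetical, and treating a facetSuperset key as the sign that facet config is present is an assumption.

```python
from typing import Any, Optional


class FacetGateError(Exception):
    """Raised for the mutually-exclusive case; the proxy would answer 422."""

    status = 422


def resolve_facet_mode(body: dict[str, Any], blob: Optional[dict[str, Any]]) -> str:
    """Return "merchandise" or "pass_through" for one incoming request."""
    opted_in = body.get("useDynamicFacets") is True
    # Validated up front: opting in while also sending facets is rejected.
    if opted_in and "facets" in body:
        raise FacetGateError("useDynamicFacets and facets are mutually exclusive")
    if not opted_in:
        return "pass_through"
    # Assumption: the facetSuperset key marks facet config as present.
    has_config = blob is not None and "facetSuperset" in blob
    return "merchandise" if has_config else "pass_through"


def upstream_body(body: dict[str, Any]) -> dict[str, Any]:
    """useDynamicFacets is a proxy-only switch and is not forwarded to Marqo."""
    return {k: v for k, v in body.items() if k != "useDynamicFacets"}
```

Note that neither entitlement nor the enabled toggle appears anywhere in the function: both are resolved before the blob is written.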
2. Phases
Search behaviour changes only at Phase 5. Everything before it is dark behind the fail-closed facet_merchandising flag. P1 and P2 are independent and can run in parallel.
Phase 1 — Console draft/publish UX (against the existing mock seam)
Goal: adopt the house edit→publish pattern in the facet-rules UI, decoupled from real persistence via the FacetRulesStorage seam — no backend needed.
- Mirror slices/merchandise + SaveChangesButton.tsx: a current/pending pair, a change-count selector (à la selectNumMerchandiseChanges), a "Publish" button gated on the count + an "Undo Changes" button.
- Remove the 300 ms autosave effect in FacetRulesProvider; edits mutate pending only; Publish calls storage.save(pending) (still the in-memory mock at this phase), then current = pending. Drop the PREVIEW chips and the "admin_lambda CRUD" tooltip; rewrite the disabled-state copy.
- The Settings enable/disable toggle (global.enabled, D5) is a config field like any other: it mutates pending and takes effect on Publish, surfacing the existing "disabled in Settings — rules saved but not applied" warning when off.
- Tests (RTL): edits mutate pending only; Publish disabled when clean; Undo resets to current; Publish persists through the seam.
- Exit: the draft/publish UX is reviewable and demonstrable behind the flag against the mock. No backend, no search effect. No dependencies.
Phase 2 — Backend: rows + service + CRUD API
Goal: the three row models, the service (create-guarded, last-write-wins per §4), and the DRF endpoints. They go hand-in-hand, so one phase.
- Row models in merchandise/services/models.py (dataclassMixin, camelCase, to_ddb/from_ddb, floats_to_decimals).
- merchandise/services/facet_rules_service.py: per-row get/upsert/delete; attribute_not_exists(pk) guards create against a lost first-write race; updates are unconditional (last-write-wins, §4); bump the #static row on every write so the exporter re-runs.
- Validation: superset dedupe by fieldName + every entry's type ∈ {text, numeric, array, boolean} + ≤100; global/override facets → every fieldName in the superset (the single shared candidate pool), pinned ⇒ position, dense 1..N, ≤100; trigger resolved via _resolve_synonym_trigger with 409-on-resolved-collision.
- API: extend MerchandiseViewSet / register in urls.py exactly as the spec's Controller API section: facet-superset, global-facets, facet-overrides (list + per-trigger PUT/GET/DELETE). Add put_facet_* to _SERVICE_AUTH_ACTIONS. Enforce Feature.FACET_MERCHANDISING on every route (fail-closed); permissions match MerchandiseViewSet (Decision 9).
- Tests (gating): create→get round-trip per row (superset round-trips fieldName + type); a concurrent first-write create race is rejected (attribute_not_exists); an override can pin a superset field the global list omits; PUT rejects a fieldName outside the superset; PUT rejects an invalid type; validation rejections; 409 collision surfaces the existing trigger; authed CRUD per endpoint; entitlement-off → blocked.
- Exit: pants test //components/controller/merchandise:: green; API usable via curl; flag off ⇒ invisible. No dependencies.
- Scope note: the row stores type, but the live index-touching surface that detects it, namely facet-fields discovery (the q='*'&limit=10 sample), the per-trigger probe, and seed-on-first-enable, is not in P2. It needs live index/Marqo access and the OQ-1/OQ-2 confirmations, so it lands in P6 with the real-catalog work. In P2/P3 the type is supplied by the caller (mock catalog or the customer's own choice); first enable here just creates the rows the UI writes; nothing is pre-populated from the index.
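The validation rules listed for this phase can be sketched as two plain functions. This is a hedged illustration rather than the service's implementation: the names validate_superset, validate_facets and FacetValidationError are hypothetical, and the sketch assumes a duplicate fieldName is rejected, whereas the real service may collapse duplicates instead.

```python
from typing import Any

ALLOWED_TYPES = {"text", "numeric", "array", "boolean"}
MAX_ENTRIES = 100  # Decision 8 limit for superset size and pinned count


class FacetValidationError(ValueError):
    """Hypothetical error type; the API layer would map it to a 4xx."""


def validate_superset(fields: list[dict[str, Any]]) -> set[str]:
    """Check size, uniqueness and type vocabulary; return the field names."""
    if len(fields) > MAX_ENTRIES:
        raise FacetValidationError(f"superset exceeds {MAX_ENTRIES} fields")
    names: set[str] = set()
    for entry in fields:
        name, field_type = entry["fieldName"], entry["type"]
        if name in names:
            # Assumption: duplicates are an error rather than silently merged.
            raise FacetValidationError(f"duplicate fieldName {name!r}")
        if field_type not in ALLOWED_TYPES:
            raise FacetValidationError(f"invalid type {field_type!r} for {name!r}")
        names.add(name)
    return names


def validate_facets(facets: list[dict[str, Any]], superset_names: set[str]) -> None:
    """Check a global or override facets list against the shared superset."""
    positions: list[int] = []
    for facet in facets:
        name = facet["fieldName"]
        if name not in superset_names:
            raise FacetValidationError(f"{name!r} is not in the superset")
        if facet.get("pinned"):
            if "position" not in facet:
                raise FacetValidationError(f"pinned field {name!r} needs a position")
            positions.append(facet["position"])
    if len(positions) > MAX_ENTRIES:
        raise FacetValidationError(f"more than {MAX_ENTRIES} pinned fields")
    # Pinned positions must be exactly 1..N with no gaps or repeats.
    if sorted(positions) != list(range(1, len(positions) + 1)):
        raise FacetValidationError("pinned positions must be dense 1..N")
```

Running validate_superset first and passing its returned name set into validate_facets keeps the superset as the single candidate pool for both the global list and every override.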
Phase 3 — Console: real storage wiring
Goal: swap the mock seam for real persistence — the draft/publish UX from P1 is unchanged.
- api/facetRules/api.ts + thunks/facetRules.thunk.ts (axios + createAsyncThunk, Zod-parsed), one call per row endpoint. Implement FacetRulesStorage against them; delete getDefaultFacetRulesStorage / makeMockDynamoFacetRulesStorage / INITIAL_MOCK_STATE.
- Superset type round-trips through the row: reload is self-contained for the functional fields (fieldName + type); the mock catalog is consulted only for volatile decoration (distinct count) and the live-presence/"missing" badge. The customer can edit a field's type and add fields not in the candidate list (free fieldName + chosen type); both persist as ordinary superset entries.
- Index-scope the provider: thread indexName into FacetRulesProvider at all three mounts (MerchandisingSettings.page.tsx, FacetsTab.tsx, GlobalFacets.page.tsx); Publish PUTs the live row(s), then refetch.
- Tests: provider load/save against a mocked thunk; index-scoped keying; Publish dispatches the real PUT; superset type edit + custom field add persist and reload.
- Exit: UI persists per (account, index), survives reload. No search-path effect yet. Depends on P1 (UX) + P2 (API).
Phase 4 — Exporter: publish → KV blob
Goal: live facet rows reach the proxy via the existing single KV channel.
- Extend merchandising_exporter to scan sk ∈ {FACET_SUPERSET, GLOBAL_FACETS, FACET_OVERRIDE#*} and serialise into the existing {systemAccountId}-{indexName} blob: top-level facetSuperset (carrying each field's type, so the proxy needs no re-resolution), globalFacets, and triggers[md5].facetOverride. Reuse get_trigger_hash byte-for-byte.
- Honour the runtime toggle by presence: when GLOBAL_FACETS.enabled is false, omit the facet keys entirely. The proxy gate is presence, so absence = pass-through. The exporter does not gate on account entitlement (never revoke a live customer) and does not emit an enabled flag.
- Extend fork_routes.py's sk allowlist to copy the three new sks (Decision: lifecycle); index delete already rides along via the pk wipe.
- Tests: blob shape exact; trigger-hash keys match the proxy's getMerchandisingHashKey; empty/absent ⇒ proxy pass-through; fork copies facet rows.
- Exit: publish → blob carries facet config. Still inert (proxy not yet reading it). Depends on P2 shapes.
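The export step above might look roughly like this. It is a sketch under stated assumptions: trigger_hash is a hypothetical stand-in for get_trigger_hash (only the md5("collection|trigger") form is confirmed by this plan, so hashing search overrides the same way is assumed), add_facets_to_blob is an invented name, and the exact value shapes of globalFacets and facetOverride in the blob are assumed to be the rows' facets lists.

```python
import hashlib
from typing import Any, Optional

# Controller vocabulary -> Marqo vocabulary, mapped exporter-side only.
MARQO_TYPE = {"text": "string", "numeric": "number", "array": "array", "boolean": "boolean"}


def trigger_hash(context: str, trigger: str) -> str:
    """Hypothetical stand-in for get_trigger_hash: md5 of "context|trigger"."""
    return hashlib.md5(f"{context}|{trigger}".encode("utf-8")).hexdigest()


def add_facets_to_blob(
    blob: dict[str, Any],
    superset: Optional[dict[str, Any]],
    global_facets: Optional[dict[str, Any]],
    overrides: list[dict[str, Any]],
) -> dict[str, Any]:
    """Return a copy of the blob with facet keys added, or unchanged when off."""
    # Runtime toggle by presence: disabled or missing rows emit no facet keys.
    if not superset or not global_facets or not global_facets.get("enabled"):
        return blob
    out = dict(blob)
    out["facetSuperset"] = [
        {"fieldName": f["fieldName"], "type": MARQO_TYPE[f["type"]]}
        for f in superset["fields"]
    ]
    # Only the facets list is exported; enabled/onboarded stay in DDB.
    out["globalFacets"] = global_facets["facets"]
    triggers = dict(out.get("triggers", {}))
    for override in overrides:
        key = trigger_hash(override["context"], override["trigger"])
        entry = dict(triggers.get(key, {}))
        entry["facetOverride"] = override["facets"]
        triggers[key] = entry
    out["triggers"] = triggers
    return out
```

Merging facetOverride into an existing triggers entry, rather than replacing it, is meant to leave any other per-trigger data already in the blob untouched.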
Phase 5 — Proxy enforcement (the only phase that changes live search)
Goal: ON indexes serve the merchandised facet set, request-side.
- In search.ts/merchandising_overrides.ts: from the already-fetched merch blob, resolve the effective facet set (triggers[md5].facetOverride else globalFacets) and build the upstream facets request: fields[fieldName] = { type } (type from facetSuperset, already Marqo vocab), plus per-field pinning + dynamic ordering per the Marqo facets contract (marqo-internal docs/specs/search/facets.md, 2.28.1). A pinned field carries a 0-based position (controller stores 1-based dense, so the proxy subtracts 1), and crossFieldOrderingSettings: { dynamic: true, engagementField } divergence-ranks the unpinned fields with pins overlaid on top. engagementField defaults to _pixel_four_week_click_count (matches DEFAULT_POPULARITY_SETTINGS); a per-index override via index settings is a deferred fast-follow. Remap the boost/bury page context to the facet collection vocabulary before the override md5 (the exporter hashes overrides as md5("collection|trigger")). No response rewrite.
- Opt-in trigger: facets are built only when the request sets useDynamicFacets: true; they are not auto-applied. A non-HYBRID request that opts in 422s on Marqo, identical to the existing facets param (confirmed on staging: omitted searchMethod defaults to TENSOR → 422; explicit searchMethod: HYBRID → 200). The proxy does not force HYBRID; sending facets on a non-HYBRID surface is the same SE-config mistake it already is for facets.
- Carve-outs (spec Decisions 5 & 7): no useDynamicFacets: true ⇒ no merch facets; no facet config in the blob ⇒ pass through unchanged (D5 runtime OFF, by absence); useDynamicFacets: true + explicit facets ⇒ 422 (mutually exclusive, validated up front). The proxy reads neither the controller feature flag nor an enabled field.
- Tests (merchandising_overrides.test.ts + search.test.ts): facet config present ⇒ merchandised facets.fields (typed) + 0-based pins + crossFieldOrderingSettings on the request; config absent ⇒ pass-through (byte-identical to today); client facets ⇒ untouched; below-threshold ⇒ omitted; per-trigger override beats global, with the page → collection remap exercised. Cross-language contract: no shared fixture file. The collection-override md5 (md5("collection|mens-shoes")) is pinned to the same literal in both the proxy test (merchandising_overrides.test.ts) and the exporter test (test_cloudflare_kv_store_client.py::TestFacetExport), with cross-reference comments, so a hash/context-vocab change fails a test on each side (see the §4 revision).
- Exit: npm test + npm run lint green (tsc has pre-existing errors, so it is not a gate). Depends on P4 and the Marqo capability, both now landed (facets contract shipped in Marqo 2.28.1: per-field position + crossFieldOrderingSettings).
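The request-building step in the first bullet can be sketched as below. The real code lives in the TypeScript proxy; Python is used here only to stay consistent with the other examples. build_facets_request and override_key are hypothetical names, the blob value shapes are assumed to be plain facets lists, and placing crossFieldOrderingSettings next to fields inside the facets object is an assumption about the Marqo contract rather than something this plan spells out.

```python
import hashlib
from typing import Any, Optional

DEFAULT_ENGAGEMENT_FIELD = "_pixel_four_week_click_count"


def override_key(context: str, trigger: str) -> str:
    """Assumed hash form, md5("context|trigger"), as for collection overrides."""
    return hashlib.md5(f"{context}|{trigger}".encode("utf-8")).hexdigest()


def build_facets_request(
    blob: dict[str, Any],
    context: str,
    trigger: str,
    engagement_field: str = DEFAULT_ENGAGEMENT_FIELD,
) -> Optional[dict[str, Any]]:
    """Build the upstream facets object, or None when there is nothing to apply."""
    # The boost/bury path says "page"; facet overrides are hashed as "collection".
    if context == "page":
        context = "collection"
    entry = blob.get("triggers", {}).get(override_key(context, trigger), {})
    override = entry.get("facetOverride")
    effective = override if override is not None else blob.get("globalFacets")
    if not effective or "facetSuperset" not in blob:
        return None
    # Types are already in Marqo vocabulary; the proxy never re-maps them.
    types = {f["fieldName"]: f["type"] for f in blob["facetSuperset"]}
    fields: dict[str, dict[str, Any]] = {}
    for facet in effective:
        name = facet["fieldName"]
        if name not in types:
            continue  # defensive: skip anything the superset does not type
        spec: dict[str, Any] = {"type": types[name]}
        if facet.get("pinned"):
            spec["position"] = facet["position"] - 1  # stored 1-based, sent 0-based
        fields[name] = spec
    return {
        "fields": fields,
        "crossFieldOrderingSettings": {"dynamic": True, "engagementField": engagement_field},
    }
```

A None result corresponds to the pass-through carve-out: the request goes upstream exactly as the client sent it.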
Phase 6 — Live surface: catalog, probe, seed
Goal: build the live index-touching endpoints and retire the last mocks.
- Real catalog (supersedes Decision 3 / D9; see divergence 3): the GET /merchandise/facet-fields/{index}?apiKey=... endpoint runs q='*'&limit=10 against the index, walks the sampled hits, drops internal fields via the existing is_base_field rule (rejects ^_ and dotted keys, equivalent to the is_internal qualifier used in admin_worker), and detects each field's type (bool→boolean checked before int because bool is a subclass of int in Python; int/float/Decimal→numeric; str→text; non-empty list→array; empty list defers to a later sample). The detected type only seeds the superset entry: the merchant can override it, and can add fields the sample never surfaced (free fieldName + chosen type). Keep the "missing" badge for configured-but-absent fields.
- Real probe (Decision 4 / D8): GET /merchandise/facet-probe/{index}/{context}/{trigger}?apiKey=... issues one Marqo search with limit=1 and facets={fields: <every persisted superset field>}. Marqo computes facets via a separate Vespa limit 0 | <grouping> query that runs over the full match set regardless of the hits-side limit (OQ-1 resolved; see semi_structured_vespa_index.py:_generate_facet_queries), so a single round-trip is enough. Per-field popularity = sum of bucket counts (numeric uses the stats count). Returns superset fields ranked desc by popularity, ties broken by fieldName asc. Search-context triggers are synonym-resolved; collection triggers go through the /collections path with the collection name. Classic-vs-Ecom collection scoping: the Ecom backend scopes server-side via collectionName on /collections; Classic Marqo has no collections concept, so the view builds an explicit <contexts[PAGE].trigger_list.source>:(<trigger>) filter and passes it through (mirrors MerchandisingService.get_ranking_modifiers's pattern). Live per drawer open with no controller-side cache: the ecom search proxy already caches /search and /collections responses at the edge (CacheableEndpoint in cache-config.ts) keyed by URL + body, and the probe body is deterministic per (account, index, context, trigger, superset), so the existing edge cache absorbs repeated drawer-opens. A controller-layer cache would double-cache for no gain (OQ-5 resolved).
- Seed-on-first-enable (Decision 11): POST /merchandise/facet-superset/{index}/seed?apiKey=... runs the same discovery sample, filters to text + array types (every text / text-array field is lexically searchable in a Marqo index; OQ-2 resolved), and creates the FACET_SUPERSET row only if none exists. Idempotent: subsequent calls return the existing row with seeded: false; concurrent first-seed surfaces as a ConcurrentModificationError against the create's attribute_not_exists(pk) condition and the loser returns whichever row won the race. First-PUT/enable logic lives in FacetRulesService.seed_superset_if_absent, not a migration. Cognito-only, deliberately NOT in _SERVICE_AUTH_ACTIONS: the console drives the first enable, and the merchandise-service fan-out path couldn't supply the customer's Marqo apiKey anyway.
- Pre-rollout: atomic backend publish endpoint (combined backend + console). A single Publish previously fanned out N independent row writes (superset + global + each override) from the console via Promise.all, and every write's backend transaction also Put the shared #static row, so a lone publish self-contended on #static and intermittently surfaced a spurious ConcurrentModificationError (moto hides it; see §4 "Publish is atomic"). Two concurrent publishers from the same account also raced even if the console serialised within a single publish. Fix (this PR): a new controller endpoint POST /merchandise/facets-publish/{indexName} accepts the whole desired facet-rules state (superset + global + every override) and writes it in one transact_write_items (all rows plus a single #static bump), with attribute_not_exists(pk) applied only on rows that didn't already exist (the create-race guard). Override rows present in DDB but absent from the payload are deleted in the same transaction. The console drops its multi-call Promise.all flush entirely and calls the new endpoint once. This supersedes the cheaper "serialise the console writes" workaround, which only hid a single self-racing publisher and still broke under two concurrent publishers. Hard pre-rollout gate before any account is entitled; unreachable behind the fail-closed flag until then.
- Tests (P6 backend): type detection covers all four types and the bool-before-int precedence + empty-list deferral; internal-field filter drops _pixel_*/_merch_*/dotted keys; popularity aggregator sums bucket counts for string/array/boolean and uses count for numeric; probe ranks desc + tie-breaks name-asc; probe with no superset is 409; seed creates from text+array only; seed is idempotent and loses the create race cleanly; seed under service-auth POST is rejected (regression guard against accidentally restoring the grant).
- Tests (atomic publish): publish creates all rows in one transact; LWW on update with createdAt audit preserved; orphan-reference + duplicate-trigger + over-limit + oversized-transaction rejected up front with nothing landed; deletes overrides absent from the payload; search-context synonym resolution; concurrent first-create race surfaces as a 409; service-auth fan-out lands on the explicit target account; entitlement-gate 403.
- Console mock retirement (this PR). FacetRulesProvider fetches the catalog on mount via the live facet-fields endpoint; MOCK_INDEX_CATALOG is removed. makeControllerFacetProbe replaces makeMockMarqoFacetProbe: empty trigger short-circuits to the current snapshot (preserves the new-override drawer's open-with-no-trigger flow), non-empty trigger hits the live facet-probe endpoint. The seed endpoint is dispatched on the enabled false→true transition (idempotent, gated on the local superset being empty so subsequent toggles don't re-call). FacetOverrideDrawer catches probe failures so a 409 on an unseeded superset doesn't wedge the drawer.
- Exit: P6 backend endpoints + atomic publish endpoint land green; the console publish flow goes through the new endpoint (single transact_write_items, one #static bump); the console-side mocks are retired (MOCK_INDEX_CATALOG, makeMockMarqoFacetProbe removed). Depends on P3 & P5.
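The probe's ranking rule (popularity descending, ties broken by fieldName ascending) can be sketched as follows. To avoid guessing at Marqo's actual facet response shape, the sketch takes an already-normalised, hypothetical structure: bucketCounts for bucketed fields and count for numeric fields. rank_by_popularity is likewise an invented name.

```python
from typing import Any


def rank_by_popularity(
    superset: list[dict[str, str]],
    facet_results: dict[str, dict[str, Any]],
) -> list[str]:
    """Return superset field names ranked by popularity desc, then name asc.

    facet_results is a hypothetical normalised form, not Marqo's response:
      bucketed fields -> {"bucketCounts": [int, ...]}
      numeric fields  -> {"count": int}
    """

    def popularity(entry: dict[str, str]) -> int:
        result = facet_results.get(entry["fieldName"], {})
        if entry["type"] == "numeric":
            return int(result.get("count", 0))  # numeric uses the stats count
        return sum(result.get("bucketCounts", []))  # others sum bucket counts

    ranked = sorted(superset, key=lambda e: (-popularity(e), e["fieldName"]))
    return [e["fieldName"] for e in ranked]
```

Fields with no result at all score zero and sink to the bottom, which keeps every superset field visible in the drawer even when the trigger matches nothing for it.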
Rollout (post-P6)
Internal QA on staging (flag already true there) → per-account exception list for one design partner → gradual prod default flip (Decision: feature flag rollout).
Pre-rollout gates (hard, before any account is entitled):
- Production Marqo ≥ 2.28.1 for the indexes being enabled. P5 emits per-field position + top-level crossFieldOrderingSettings, which ship in Marqo 2.28.1. Confirm with the Marqo team that each index's deployed version speaks 2.28.1 before entitling its account. Validated against staging; production version is the open item. (The broad-traffic 422 risk is gone with the useDynamicFacets opt-in: facets now only attach when the customer asks; what remains is the same HYBRID contract the existing facets param already carries, i.e. an SE must point useDynamicFacets at HYBRID surfaces, just as for facets.)
- Confirm engagementField-absent behaviour for non-pixel accounts. The cross-field engagementField defaults to _pixel_four_week_click_count, hardcoded even for accounts without pixel data. The facets spec says a cold-start / zero-engagement field falls back to request order (graceful, not an error), but verify on staging that an engagementField absent from every document degrades cleanly (no 422, sensible order) before entitling any non-pixel account.
- Atomic backend publish endpoint shipped (P6, combined backend + console). See §4 "Publish is atomic": it folds the per-row writes into one transact_write_items covering every row + a single #static bump; console calls the new endpoint instead of fanning out per-row PUTs. Required because two concurrent publishers on the same account still race the #static bump under any console-only mitigation.
3. Dependency graph & sequencing
P1 (draft/publish UX) ──┐
                        ├──► P3 (real storage) ────────────┐
P2 (rows + CRUD API) ───┤                                  ├──► P6 (catalog/probe)
                        └──► P4 (exporter) ─► P5 (proxy) ──┘
                                                  ▲
                             Marqo facet pin/order capability (external)
- P1 and P2 are independent — UX (against the mock seam) and backend can run fully in parallel.
- P3 (real storage swap) needs both: the UX from P1 and the API from P2.
- Parallelizable once P2 freezes the row shapes: P4 (exporter) and P5 (proxy, TS, against frozen blob fixtures).
- Single risky cutover: only P5 changes live search, and it stays inert until P4 exports config — so the behaviour change can be staged/rolled back at the export step, not the code step.
- External gate: P5 also waits on the Marqo facet field pin/order change (colleague-owned). Nothing else does.
4. Cross-cutting risks
- Marqo capability dependency. Request-side ordering assumes Marqo will honour configured facet field order. If it ships differently than assumed, only P5 is affected — revisit enforcement (a scoped response rewrite is the fallback), nothing upstream.
- Concurrency is last-write-wins (revised). P2 originally modelled an optimistic lock on synonym_service, but that pattern reads record_version server-side and never round-trips it through the client, so it only guards the in-request read→write window, not the load→edit→publish gap a real lost-update needs. Rather than build the full client round-trip for facets alone (which would also need a transactional publish to be meaningful; see below), facets is explicitly last-write-wins, matching the core put_view rules path. The update path carries no version condition; only the create path keeps attribute_not_exists(pk) (a genuine first-write-race guard, also exploited by P6 seed_superset_if_absent and the atomic publish's per-row create guard). The shared synonym_service/facets pattern is a deferred fix; revisit real optimistic concurrency if/when multi-editor contention becomes a concrete problem.
- Publish is atomic (this PR). Before this PR a Publish issued several independent row writes from the console (superset first, then global + the per-context override reconcile via Promise.all); each row write's backend transaction also Put the shared #static row, so a single publish firing them via Promise.all could hit a DynamoDB TransactionConflict → ConcurrentModificationError ("modified concurrently, retry") intermittently (moto serialises transactions so the suite didn't catch it; surfaced by review on PR #3806). Even with the console serialising within a single publish, two concurrent publishers on the same account still raced the #static bump and silently interleaved their row writes (publisher A's superset, B's global, A's override 1, B's override 2, …). The atomic publish endpoint (P6 pre-rollout) writes the whole desired state (superset + global + every override + a single #static bump) in one transact_write_items, with attribute_not_exists(pk) applied only where the snapshot showed no existing row (the create-race guard). Updates are last-write-wins. The console pre-validates the draft (every field referenced by global/overrides must be in the superset) before the publish call for an in-rule error message; the server re-validates and rejects identically. Override rows present in DDB but absent from the publish payload are deleted in the same transaction. Cross-publisher conflicts are not eliminated; they're made recoverable. Two concurrent publishers on the same (account, index) still both touch the shared #static item, and DDB serialises on that conflict: one transaction wins fully, the other is cancelled and surfaces as a clean 409 the console can retry, without ever having landed any of its rows. That's the substantive improvement over the cheaper "serialise the console writes" workaround, which kept the inter-publisher race and lacked the all-or-nothing guarantee that turns it from "silent partial state" into "recoverable conflict". This supersedes the workaround. Transaction-size cap. DDB caps a single transact_write_items at 100 actions, so the publish path enforces a publish-level ceiling: ≤ 97 total overrides + deletes (= 100 − superset − global − #static), rejected up front with a numeric breakdown rather than letting DDB cancel mid-flight. This is tighter than Decision 8's per-context limit (100 each, 200 total), which still governs the per-row CRUD path used by service-to-service fan-out; merchant usage is in the low tens of overrides in practice, so the ceiling has plenty of headroom. If we ever approach it, the lift is to lower Decision 8's per-context cap to align, not to chunk the publish, which would defeat the atomicity guarantee this whole change exists for.
- Blob-shape drift (TS proxy ↔ Python exporter). The facetSuperset (incl. each field's type) / globalFacets / triggers[md5].facetOverride shapes are a cross-language contract. A single shared fixture file was not adopted (it would only auto-enforce drift if both hermetic toolchains loaded the same file, which needs cross-component Pants wiring for marginal gain). Instead each side independently pins the shape in its own test, and the load-bearing invariant, the collection-override md5 (get_trigger_hash / getMerchandisingHashKey parity, via the page → collection remap), is pinned to the same literal in both the exporter and proxy tests with cross-reference comments. The type vocabulary (text/numeric/array/boolean → Marqo string/number/array/boolean) is mapped exporter-side only; the proxy consumes Marqo vocab and never re-maps. The controller mirrors the mapping for its direct-to-Marqo calls in P6's probe (_MARQO_FACET_TYPE in search_api.py).
- OQ confirmations (Yihan):
  - OQ-1 (Marqo limit=1 facet behaviour): RESOLVED. Marqo computes facets via a separate limit 0 | <grouping> Vespa query (see marqo-internal/components/marqo/src/marqo/core/semi_structured_vespa_index/semi_structured_vespa_index.py:_generate_facet_queries) that runs over the full match set regardless of the hits-side limit. The probe holds limit=1 for cheapness.
  - OQ-2 (internal field set): RESOLVED. Drop ^_ and dotted keys (the existing is_base_field rule, equivalent to the is_internal qualifier used in admin_worker). The seed additionally narrows to text + array<text> because those are the lexically-searchable types in a Marqo index.
  - OQ-5 (sync vs cached probe): RESOLVED, no controller-side cache needed. The probe's underlying call goes to the ecom search proxy's /search and /collections endpoints, both of which are CacheableEndpoints in components/search_proxy/src/cache-config.ts. Cloudflare keys POST cache by URL + body and the probe body is deterministic for a given (account, index, context, trigger, superset), so repeated drawer-opens hit the edge cache directly. Adding a controller-side cache would either double-cache (no benefit) or override an admin's deliberate cache_config.endpoints.search = 0 decision (worse than no cache). The Classic Marqo path isn't cached, but facet merchandising targets ecom customers, so that is not a real surface.
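Returning to the transaction-size cap under "Publish is atomic": the up-front ceiling is simple arithmetic, sketched here with hypothetical names (check_publish_size, PublishTooLargeError). Only the 100-action DDB limit and the three fixed actions (superset, global, #static) come from the plan.

```python
DDB_TRANSACTION_LIMIT = 100  # max actions in one transact_write_items call
FIXED_ACTIONS = 3            # superset put + global put + #static bump
OVERRIDE_CEILING = DDB_TRANSACTION_LIMIT - FIXED_ACTIONS  # 97


class PublishTooLargeError(ValueError):
    """Hypothetical error raised before any write is attempted."""


def check_publish_size(override_puts: int, override_deletes: int) -> None:
    """Reject an oversized publish up front with a numeric breakdown."""
    total = override_puts + override_deletes
    if total > OVERRIDE_CEILING:
        raise PublishTooLargeError(
            f"publish needs {override_puts} override writes + {override_deletes} "
            f"override deletes = {total}, above the ceiling of {OVERRIDE_CEILING} "
            f"({DDB_TRANSACTION_LIMIT} actions minus {FIXED_ACTIONS} fixed rows)"
        )
```

Counting deletes against the same budget matters because overrides dropped from the payload are removed in the same transaction and each removal is its own action.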