Theme-Targeted Settings Deploy System
Status: Draft for review (Raynor)
Author: theme-deploys planning agent, 2026-06-10/11
Related plans: settings-concurrency-control.md, settings-versioning.md, storefront-admin-sso.md (see Cross-plan interfaces)
Problem
Every settings save from the storefront_admin editor goes straight to production. POST /api/v1/storefront/shops/{domain}/settings (components/shopify/admin_server/admin_server/routes/storefront_routes.py:149) writes the single DDB settings record and immediately mirrors it to the shop-level marqo.search_settings metafield, which the live storefront widget reads on the next page load. Bad CSS or a broken selector saved from the editor breaks live customer storefronts instantly, with no staging step and no explicit promotion.
Goals:
- (a) Per-theme settings records — different settings for different Shopify themes (live theme vs. redesign/preview themes).
- (b) Live-target detection — detect and warn when a save targets the LIVE theme.
- (c) Staging mode — saves go to a preview theme only; live customers never see them.
- (d) Explicit "Deploy to live" — a deliberate, guarded operation that promotes staged settings to production.
Current state (verified)
Persistence
- Settings live in DynamoDB table ShopifyEntities as a single record per shop: pk=SHOP#{domain}, sk=SETTINGS (components/shopify/admin_server/admin_server/repositories/shopify_settings_repository.py:19-22, models/shopify_entities.py:53-72, key constants in constants/database.py:18).
- ShopifySettings carries ui_components, selector_components, configuration, plus index-association fields active_index/system_account_id/cell_id (models/shopify_entities.py:74-102) and audit fields updated_by_user_id/created_at/last_updated.
- Account→shop listing goes through GSI_SystemAccountId with sort-key equality on sk = "SETTINGS" (repositories/shopify_settings_repository.py:81-100; equality confirmed in repositories/base_repository.py:151-160). New sort keys prefixed SETTINGS#... will NOT leak into this listing.
Save path
- Storefront admin editor (the primary client): components/storefront_admin/app/lib/api-client.ts:65-67 → POST /api/v1/storefront/shops/{domain}/settings (routes/storefront_routes.py:149-256). Flow: validate → SettingsService.update_ui_components (DDB write, services/settings_service.py:470-499) → save_settings_to_metafields (mirror full settings JSON to shop metafield namespace marqo, key search_settings; services/settings_service.py:158-199) → save_public_search_metafields (searchBase, storefrontAccessToken, baseCurrency, indexId, cdnBase; services/settings_service.py:201-307). Metafield failures return HTTP 207 "partial".
- Embedded Shopify app save: POST /api/settings (routes/settings_routes.py:60-127), with the same service calls and session-token auth.
- Metafields are written shop-level via metafieldsSet with the shop's owner ID (services/shopify_service.py:172-193, mutation in graphql/mutations/metafield_mutations.py:5-23), so they are global across all themes. This is the core problem for theme scoping.
- The editor save UI is a single Save button in components/storefront_admin/app/components/layout/header.tsx:73-76, state managed by app/hooks/use-settings.ts:325-357.
Widget config resolution (the crux)
- The theme app embed components/shopify/extensions/marqo-search-theme/blocks/marqo-search-embed.liquid:9-25 renders a <meta id="marqo-config"> tag whose content attribute is the JSON value of shop.metafields.marqo.search_settings, plus data-attributes for searchBase/indexId/etc.
- assets/marqo-loader.js readConfigFromDom() (marqo-loader.js:385-449) parses that meta tag into window.MarqoUIConfig; the main bundle validates it (components/shopify/storefront_search/src/marqo-search-app.ts:181-186, config-validator.ts:312).
- So the settings the widget uses are decided server-side by Liquid at page render time. Liquid in app embeds has access to the global theme object (theme.id: numeric; theme.role: main/unpublished/demo/development). When a merchant previews an unpublished theme (?preview_theme_id= or theme editor preview), Liquid renders with that theme's context, so theme.id/theme.role reflect the preview theme. This is the lever for theme-scoped resolution.
- The search proxy does not consume ui_components (verified: no uiComponents reads in components/search_proxy/src/); x-marqo-settings-override is index/search settings, a different domain. The only runtime consumer of UI settings is the metafield → Liquid → loader path. (storefront_routes.py:160 mentions a "KV export via DDB stream" for settings; the existing exporter components/ecom_settings_exporter reads the IndexSettings table, not ShopifyEntities. Treat that comment as aspirational; see Open questions.)
Live-theme detection
- Admin GraphQL exposes themes(first: N, roles: [MAIN]) / OnlineStoreTheme { id name role }. No theme query exists yet in graphql/queries/ (only shop/product/collection/metafield/bulk queries); one must be added.
- All app configs already grant read_themes, write_themes (components/shopify/shopify.app.prod.toml:23 and every per-merchant toml), so no scope migration is needed.
- Storefront-admin API-key requests have no Shopify session; the Admin GraphQL token is recovered from stored OAuth sessions via _get_access_token_for_shop (routes/storefront_routes.py:99-108). Theme listing therefore degrades when no token exists (see Edge cases).
Design
1. Data model: per-theme staged records
The existing record stays untouched as the live record — the thing production storefronts serve. Staged settings get sibling records keyed by theme:
| Record | pk | sk |
|---|---|---|
| Live (existing, unchanged) | SHOP#{domain} | SETTINGS |
| Staged, per theme (new) | SHOP#{domain} | SETTINGS#THEME#{theme_id} |
theme_id is the numeric Shopify OnlineStoreTheme id (stable across renames and across publish — publishing changes role, not id).
Plus one internal record for deploy safety (see deploy endpoint): sk=SETTINGS#DEPLOY#BACKUP — a rolling snapshot of the live record's content fields (ui_components, selector_components, configuration, plus audit info) taken transactionally at each deploy, giving an undo path before the versioning plan lands. It deliberately does not function as a restorable full record: restore is always a content-merge onto the current live record (see rollback), never a verbatim put — infra fields (active_index/system_account_id/cell_id/metadata) are exactly the fields other writers (index_service, webhook_service, api_key_routes) may legitimately change between deploy and rollback, and clobbering them with deploy-time values would, e.g., revive a deleted index id. This matches the versioning plan's content-fields-only restore rule (docs/plans/settings-versioning.md §"Infra fields … not versioned and never restored").
Staged and backup records carry no system_account_id (left None; model_dump() already strips it when None for the sparse GSI, models/shopify_entities.py:133-143) — account authorization always goes through the live record (dependencies.py:665-683), so staged/backup records must stay invisible to GSI_SystemAccountId by construction, not just by the current sk-equality query shape. They also don't carry active_index/cell_id; deploy reads those from the live record it is updating.
New fields on ShopifySettings (all optional, absent on live records — model stays one class; sk: Literal["SETTINGS"] at models/shopify_entities.py:70-72 relaxes to str with a validator enforcing the SETTINGS / SETTINGS#THEME#{id} / SETTINGS#DEPLOY#BACKUP shapes — a templated key cannot be a Literal; entity_type stays "SETTINGS"):
theme_id: str | None # numeric theme id, set only on staged records
theme_name: str | None # display only; refreshed on each save from GraphQL
created_from: str | None # "live" | "blank" — provenance of initial copy
Deployment audit fields on the live record:
deployed_from_theme_id: str | None
deployed_at: str | None
deployed_by: str | None
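The relaxed sk rule described above (three allowed shapes instead of a single Literal) could be enforced along the following lines. This is a minimal standard-library sketch, not the model's actual validator: the names validate_settings_sk and theme_id_from_sk are hypothetical, and the digits-only check on the theme id is an assumption drawn from "numeric theme id".

```python
import re
from typing import Optional

# Allowed sort-key shapes named in this plan:
#   SETTINGS, SETTINGS#THEME#{theme_id}, SETTINGS#DEPLOY#BACKUP
# Assumption: theme ids are digits only.
_SK_PATTERN = re.compile(r"SETTINGS(?:#THEME#(?P<theme_id>\d+)|#DEPLOY#BACKUP)?")


def validate_settings_sk(sk: str) -> str:
    """Return sk unchanged if it has an allowed shape, else raise ValueError."""
    if _SK_PATTERN.fullmatch(sk) is None:
        raise ValueError(f"unsupported settings sort key: {sk!r}")
    return sk


def theme_id_from_sk(sk: str) -> Optional[str]:
    """Return the theme id of a staged sk, or None for live/backup records."""
    match = _SK_PATTERN.fullmatch(validate_settings_sk(sk))
    return match.group("theme_id")
```

In a Pydantic model the same check would presumably sit in a field validator on sk; the regex keeps the three shapes in one place so the repository wrappers and the model cannot drift apart.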
Repository additions (ShopifySettingsRepository):
- get_theme_settings(domain, theme_id) / save_theme_settings(settings) / delete_theme_settings(domain, theme_id): same get_item/put_item/delete_item plumbing with the new sk.
- list_theme_settings(domain): reuses the existing BaseRepository.query_by_pk_and_sk_prefix(pk, "SETTINGS#THEME#") (base_repository.py:95-123); the prefix excludes both the live record and the backup record. No new query primitive needed.
- No bespoke transaction method: the concurrency plan's canonical save_settings(settings, *, expected_version, change_source, extra_transact_items=None) already accepts extra transaction items (their §4.2, designed for exactly this consumer). Deploy passes its backup Put (and an optional ConditionCheck on the staged source record) via extra_transact_items; the conditional live write itself is the canonical one. The domain methods above are sk-parameterized wrappers over key-agnostic BaseRepository primitives; the existing methods hard-code sk=SETTINGS (shopify_settings_repository.py:34-35,62-63), per the concurrency teammate's implementation note.
A hot-path hygiene note that becomes more important with this plan: ShopifySessionRepository.list_shop_sessions sweeps the whole SHOP#{domain} partition with query_by_pk and filters client-side (shopify_session_repository.py:135), and it runs on every storefront save/deploy via _get_access_token_for_shop (storefront_routes.py:104). Settings records are large (real merchant payloads up to ~120KB), so each staged theme + the backup record adds discarded read volume to that sweep. The versioning plan already recommends switching it to query_by_pk_and_sk_prefix(pk, "USER#"); this plan adopts that one-line fix as part of implementation step 2 rather than leaving it as someone else's hygiene.
Why the same table/model rather than a new entity: staged records are the same shape, same validation (InputValidator.validate_ui_components), same service merge logic, and the deploy operation is a copy between two records in one table. Fail-fast and Model-Everything principles both hold with a single frozen model.
Default/fallback semantics: the live record is the default. A theme with no staged record falls back to live everywhere (widget resolution, GET API). A shop with no theme records behaves exactly as today.
2. Widget resolution: theme-scoped metafields + role-based Liquid selection
Shop-level metafields are global, but metafield keys are namespaced strings and Liquid supports dynamic key lookup on the metafields drop. Staged settings mirror to a per-theme key:
- Live: namespace marqo, key search_settings (unchanged).
- Staged: namespace marqo, key search_settings_theme_{theme_id} (type json). Each metafield value has its own size budget (see §7 Size limits), so staging does not eat into the live payload's headroom. This is the decisive advantage over packing theme payloads into the single live metafield.
Resolution happens in the Liquid embed (marqo-search-embed.liquid), which already renders per-request with the correct theme context:
{% assign marqo_settings = shop.metafields.marqo.search_settings %}
{% assign marqo_settings_source = 'live' %}
{% if theme.role != 'main' %}
{% assign marqo_theme_key = 'search_settings_theme_' | append: theme.id %}
{% assign marqo_theme_settings = shop.metafields.marqo[marqo_theme_key] %}
{% if marqo_theme_settings %}
{% assign marqo_settings = marqo_theme_settings %}
{% assign marqo_settings_source = 'theme' %}
{% endif %}
{% endif %}
…then data-has-settings, content (rendered as {{ marqo_settings.value | json | escape }} — the existing template renders .value of the metafield drop at marqo-search-embed.liquid:25 and the assigned variable must too), and new debug attributes data-theme-id="{{ theme.id }}" data-theme-role="{{ theme.role }}" data-settings-source="{{ marqo_settings_source }}" are rendered from marqo_settings. marqo-loader.js readConfigFromDom() surfaces themeId / themeRole / settingsSource on window.MarqoUIConfig for diagnostics; no other widget change is required — the bundle keeps consuming uiComponents/selectorComponents exactly as today.
The rule: the MAIN (live) theme always serves the live metafield, never a theme-keyed one. Non-main themes serve their theme-keyed metafield when present, else fall back to live. Consequences:
- Publishing a staging theme in Shopify admin can never silently push staged settings to customers: the moment role becomes main, that theme serves live settings. Promotion happens only through the explicit deploy operation. (The trade-off, "theme + its settings ship together on publish", is rejected deliberately: it reintroduces an unguarded path to prod.)
- Preview of an unpublished theme (?preview_theme_id=, theme-editor preview pane) renders with that theme's theme.id, so merchants/agents see staged settings on the real storefront with zero risk to live traffic.
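The resolution rule above can be restated outside Liquid as a small pure function, which may be useful as a reference model for backend or e2e tests. This is a sketch only: resolve_settings and theme_metafield_key are hypothetical helper names, and the metafields of the marqo namespace are modelled as a plain mapping from key to JSON string.

```python
from typing import Mapping, Optional, Tuple


def theme_metafield_key(theme_id: str) -> str:
    """Key of the staged metafield for a theme (namespace 'marqo')."""
    return f"search_settings_theme_{theme_id}"


def resolve_settings(
    theme_role: str,
    theme_id: str,
    marqo_metafields: Mapping[str, str],
) -> Tuple[Optional[str], str]:
    """Return (settings_json, source), mirroring the Liquid block above.

    The main theme always gets the live metafield. Any other role gets its
    theme-keyed metafield when one exists, otherwise the live one.
    """
    live = marqo_metafields.get("search_settings")
    if theme_role != "main":
        staged = marqo_metafields.get(theme_metafield_key(theme_id))
        if staged:
            return staged, "theme"
    return live, "live"
```

Note that a staged metafield for a theme whose role has become main is ignored by construction, which is the "publishing never promotes" property.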
Feasibility notes (verify in implementation step 1): theme is a global Liquid object available in theme app extension blocks; dynamic bracket lookup on a metafield namespace drop (shop.metafields.marqo[var]) is supported Liquid. Both are standard Shopify behavior but must be smoke-tested on a dev store before the backend work lands (implementation order below makes this step 1 precisely because it is the load-bearing assumption).
Plan B if the Liquid spike fails (e.g. dynamic metafield-key lookup turns out not to work in app embeds): keep the single search_settings metafield and embed staged payloads inside it as a themes sub-object keyed by theme id ({"uiComponents": ..., "themes": {"123": {...}}}); the Liquid embed renders data-theme-id="{{ theme.id }}" data-theme-role="{{ theme.role }}" (plain interpolation, certainly supported) and readConfigFromDom() in marqo-loader.js selects the sub-object when data-theme-role != "main". Same resolution rule, same backend record shape — only the mirror format and ~10 loader lines change. Cost: staged payloads share the live metafield's size budget, so Plan B caps the number of concurrently staged themes; this is why Plan A (per-theme keys) is preferred and verified first.
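For comparison, the Plan B selection that would move into the loader could look roughly like this. The real change would live in marqo-loader.js; the logic is sketched here in Python only to keep one language across examples. The function name is hypothetical, and stripping the themes sub-object from the payload handed to the widget is an assumption, not something the plan specifies.

```python
from typing import Any, Dict


def select_plan_b_settings(
    payload: Dict[str, Any], theme_id: str, theme_role: str
) -> Dict[str, Any]:
    """Pick the staged sub-object for non-main themes, else the live settings."""
    # Assumption: the live settings are everything except the "themes" key.
    live = {key: value for key, value in payload.items() if key != "themes"}
    if theme_role == "main":
        return live
    staged = payload.get("themes", {}).get(theme_id)
    return staged if staged else live
```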
The widget bundle and loader are theme-agnostic assets served from the app extension/CDN — they are shared across themes. Only the settings are theme-scoped, which matches the product need (testing new CSS/layout settings against a redesign theme). Testing a new widget bundle per theme is out of scope.
3. API changes (admin_server, storefront routes)
All new endpoints live in routes/storefront_routes.py (API-key auth, the storefront_admin editor's surface). The embedded-app routes (settings_routes.py) are untouched in v1 (see Rollout).
GET /shops/{domain}/themes (new)
Lists themes via a new Admin GraphQL query (graphql/queries/theme_queries.py):
query getThemes($first: Int!) {
themes(first: $first) { nodes { id name role updatedAt } }
}
Response model ThemeResponse: theme_id (numeric string extracted from the GID), name, role, is_live (role == MAIN), has_staged_settings (joined against list_theme_settings). Requires an access token from _get_access_token_for_shop; if none exists, return 409 no_shopify_session. A token may also be present but stale/revoked — GraphQL errors (401/403 from Shopify) are caught and mapped to the same 409 no_shopify_session (not a 500), so the editor's degraded mode handles both identically (Edge cases #5). Other GraphQL failures bubble as 502.
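A rough sketch of how the GraphQL nodes might be mapped to ThemeResponse follows. It assumes theme GIDs have the form gid://shopify/OnlineStoreTheme/{numeric id} and that the role arrives as the upper-case enum value; a dataclass stands in for the real response model, and to_theme_responses is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping


@dataclass(frozen=True)
class ThemeResponse:
    theme_id: str
    name: str
    role: str
    is_live: bool
    has_staged_settings: bool


def theme_id_from_gid(gid: str) -> str:
    """Extract the trailing numeric id from a theme GID."""
    tail = gid.rsplit("/", 1)[-1]
    if not tail.isdigit():
        raise ValueError(f"unexpected theme GID: {gid!r}")
    return tail


def to_theme_responses(
    nodes: Iterable[Mapping[str, Any]],
    staged_theme_ids: Iterable[str],
) -> List[ThemeResponse]:
    """Join GraphQL theme nodes against the ids returned by list_theme_settings."""
    staged = set(staged_theme_ids)
    responses = []
    for node in nodes:
        theme_id = theme_id_from_gid(node["id"])
        responses.append(
            ThemeResponse(
                theme_id=theme_id,
                name=node["name"],
                role=node["role"],
                is_live=node["role"] == "MAIN",
                has_staged_settings=theme_id in staged,
            )
        )
    return responses
```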
GET /shops/{domain}/settings?theme_id={id} (extended)
- No theme_id (default): live record, exactly today's behavior and response shape (back-compat for current editor build).
- theme_id present: return the staged record. If none exists, return the live settings with meta.exists=false so the editor can initialize a staging copy from live without a separate call.
- Response gains a meta object: {target: "live"|"theme", theme_id, exists, is_live_theme, version, last_updated}. version is the record's record_version (0 for legacy records, per the concurrency plan, which independently adds it to the GET payload; get_settings_with_defaults already returns last_updated, settings_service.py:140, the route just doesn't surface it today, storefront_routes.py:140-143). The deploy dialog's guard value comes from here.
POST /shops/{domain}/settings?theme_id={id} (extended) — staged save
- No theme_id: legacy live save, unchanged pipeline (DDB → search_settings metafield → public metafields). Back-compat for existing clients. Response gains target: "live" so updated clients can warn.
- theme_id present (server-side guard, requirement (b)):
  1. Require an existing live record. Shop resolution already depends on it (_resolve_storefront_shop_from_settings reads the live record for account authorization, dependencies.py:665-683), and deploy needs its active_index/system_account_id/cell_id. No live record → 409 no_live_settings ("initialize live settings first"). This removes the "deploy with no live record" branch entirely.
  2. Resolve themes via GraphQL. Fail fast if the theme id doesn't exist (404) or if its role is MAIN (409 live_theme_save_rejected): saving "to the live theme" is not a thing; live writes go through the no-param path or deploy. This re-check at save time closes the race where a theme is published between editor load and save. Stale-token GraphQL auth errors map to 409 no_shopify_session as in GET /themes.
  3. Size guard (new paths only): serialized settings JSON exceeding the JSON-metafield design ceiling (128KB; see §7 Size limits for the Shopify numbers and their caveats) → 422 settings_too_large before the DDB write. A staged record whose metafield can't be written is unpreviewable and therefore useless, so fail fast. The check lives in the new service methods (update_theme_ui_components, deploy), not inside the shared save_settings_to_metafields (settings_service.py:158), where it would leak into the legacy path. The legacy no-theme_id path keeps today's exact behavior (size logged at settings_service.py:178-180, oversize surfaces as 207 partial); a regression test pins that a near-ceiling legacy save still succeeds unchanged.
  4. Write the staged DDB record (sk=SETTINGS#THEME#{id}, theme_name refreshed from the GraphQL response), versioned from birth: the staged write implements the concurrency plan's canonical conditional-write contract from day one (request body carries version = the staged record's record_version from GET meta.version; first create uses the attribute_not_exists(record_version) condition; mismatch → the canonical settings_conflict 409). The staged path is brand new with exactly one client (the new editor build), so unlike the live record there is no legacy-client transition to manage: it can be strict immediately, and no lost-update window ever exists for staged records. This makes the staged write path dependent on concurrency Phase 1's repo primitives (dependency stated in §5).
  5. Mirror to metafield search_settings_theme_{id} via the existing save_settings_to_metafields extended with a metafield_key parameter (default "search_settings").
  6. Do not touch search_settings or the public metafields (searchBase/indexId/etc. are infrastructure config, identical across themes; staging them has no meaning).
  7. Metafield mirror failure → 207 partial as the live path does today (storefront_routes.py:229-236), but the message must say "staged settings saved but not previewable yet; retry" (staged settings are only served via the metafield; there is no other propagation path).
SettingsService gains update_theme_ui_components(shop_id, theme_id, theme_name, ui_components, selector_components, user). This is not a pass-through to create_or_update_settings — _create_default_settings hardcodes sk=SETTINGS_SK (settings_service.py:459-462). It reuses _merge_settings for updates to an existing staged record, and a new _create_theme_settings constructor for first saves: initialized from the live record's content fields only (created_from="live"; live record guaranteed to exist per guard 1) with the theme sk and theme metadata, infra fields left None (§1). (created_from="blank" is reserved for a possible future "start from defaults" editor action; v1 always copies live.)
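The pre-write guards for a staged save can be summarized as an ordered check that returns the first failure. This is an illustrative sketch, not the route code: check_staged_save is a hypothetical name, the theme_not_found code is invented for the example (the plan only specifies a 404), and reading 128KB as 128 * 1024 bytes of UTF-8 JSON is an assumption pending §7.

```python
import json
from typing import Any, Mapping, Optional, Tuple

# Assumption: "128KB" interpreted as 128 * 1024 bytes; §7 holds the real numbers.
SIZE_CEILING_BYTES = 128 * 1024


def check_staged_save(
    live_exists: bool,
    theme_roles: Mapping[str, str],
    theme_id: str,
    settings: Mapping[str, Any],
    ceiling: int = SIZE_CEILING_BYTES,
) -> Optional[Tuple[int, str]]:
    """Return (status, code) for the first failing guard, or None to proceed."""
    if not live_exists:
        return 409, "no_live_settings"
    role = theme_roles.get(theme_id)  # theme id -> role, as resolved via GraphQL
    if role is None:
        return 404, "theme_not_found"  # hypothetical code; the plan only says 404
    if role == "MAIN":
        return 409, "live_theme_save_rejected"
    size = len(json.dumps(dict(settings)).encode("utf-8"))
    if size > ceiling:
        return 422, "settings_too_large"
    return None
```

All four checks run before the DDB write, so a rejected save leaves neither a staged record nor a metafield behind.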
POST /shops/{domain}/settings/deploy (new) — requirement (d)
Body: {source_theme_id: str, expected_live_version: int, expected_source_version: int | null}.
Deploy is a guarded, reversible write. Its guard is the concurrency plan's record_version optimistic lock — deploy depends on that plan's Phase 1 (server-side record_version attribute + _version_condition builder + conditional save_settings; independently deployable, no client changes required — see docs/plans/settings-concurrency-control.md §3, §10). That plan explicitly rejected last_updated as a condition attribute (untrustworthy writers, no-timezone timestamps, equal-timestamp collisions — its §3.1), so this plan does not ship a divergent transitional guard; it sequences after the agreed one. Steps:
1. Load staged record; 404 if absent. (Deploying a theme with no staged settings is a no-op error, not a silent success.) Size guard as in staged save.
2. Build the live-record update: the staged record's ui_components, selector_components, configuration applied onto the current live record's other fields (infra fields active_index/system_account_id/cell_id/metadata come from the live record just read; staged records don't carry them); stamp deployed_from_theme_id, deployed_at, deployed_by (the auth plan's canonical actor string: user:{sub} | token:{token_id} | api_key:{system_account_id}), last_updated, updated_by_user_id.
3. Route the write through the canonical service path, per agreement with both sibling planners: SettingsService.create_or_update_settings(shop_id, content, updated_by, expected_version=expected_live_version, change_source="theme_deploy", event_type="deploy", source_scope=f"theme:{theme_id}", source_version=staged.record_version), so input validation, the concurrency conditional write, and (once landed) versioning's history capture all fire from one place. The rolling content backup Put to sk=SETTINGS#DEPLOY#BACKUP (stamped backed_up_at, backup_of_live_version) rides the same TransactWriteItems via the repo's extra_transact_items hook; when expected_source_version is supplied, a ConditionCheck on the staged record ("staged unchanged since the dialog was opened") joins it. Live-condition failure → 409 with the concurrency plan's SettingsConflictError payload ({detail: {code: "settings_conflict", expectedVersion, currentVersion, lastUpdated, updatedBy, changeSource}}); legacy live records without the attribute are version 0 and guarded by attribute_not_exists(record_version) exactly as in that plan.
4. When the versioning plan lands, the same call produces the deploy version event automatically (their event_type="deploy", source_scope, source_version mapping, confirmed by that planner), and the rolling backup + rollback endpoint below are retired in favor of general restore. Until then, the backup record is the rollback story.
5. Mirror to the live search_settings metafield + save_public_search_metafields (same as a live save). Metafield failure → 207 partial with status: "deployed_not_live", and the editor must show this loudly, because the runtime read path is the metafield, not DDB (marqo-search-embed.liquid:25): a 207 deploy means customers are still seeing the OLD settings until a retry succeeds. There is no background reconciliation (the "KV export" mentioned at storefront_routes.py:218,233 does not exist for ShopifyEntities; see Open questions), so retry is the client's job and the response must say so explicitly.
6. The staged record is kept (not deleted): the staging theme keeps serving its own settings, and the merchant can iterate and re-deploy. Re-running deploy is content-convergent (live content and metafield converge to the staged content) though not byte-idempotent: each run stamps fresh deployed_at, increments record_version, and rolls the backup. Response: {status, deployed_at, source_theme_id, live_version}. live_version is the new live record_version, so the editor can chain a follow-up action without a re-GET (matches the concurrency plan's save responses).
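Step 2 (building the live-record update) is essentially a content-fields overlay. A minimal sketch with records modelled as plain dicts is shown below; build_deploy_update is a hypothetical name, and stamping updated_by_user_id with the same actor string is an assumption, since the plan lists the field without fixing its value.

```python
from typing import Any, Dict, Mapping

CONTENT_FIELDS = ("ui_components", "selector_components", "configuration")


def build_deploy_update(
    live: Mapping[str, Any],
    staged: Mapping[str, Any],
    actor: str,
    now_iso: str,
) -> Dict[str, Any]:
    """Overlay staged content onto a copy of the live record and stamp audit fields."""
    # Infra fields (active_index, system_account_id, cell_id, metadata) and
    # everything else are inherited from the live record that was just read.
    updated = dict(live)
    for field in CONTENT_FIELDS:
        updated[field] = staged.get(field)
    updated["deployed_from_theme_id"] = staged["theme_id"]
    updated["deployed_at"] = now_iso
    updated["deployed_by"] = actor
    updated["last_updated"] = now_iso
    updated["updated_by_user_id"] = actor  # assumption: same actor string
    return updated
```

The result is what would be handed to the canonical conditional write in step 3; the function itself performs no I/O and does not mutate either input.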
POST /shops/{domain}/settings/deploy/rollback (new) — undo before versioning lands
Body: {expected_live_version: int}.
Rollback is a content-fields-only merge, never a record swap: read the current live record, replace its ui_components/selector_components/configuration with the backup's content, keep every infra field (active_index/system_account_id/cell_id/metadata) from the current record, stamp fresh audit fields (change_source="theme_deploy", rolled_back_from_version), and write with _version_condition(expected_live_version); then mirror to the live metafield. The current live content is simultaneously written to the backup record (same transaction), so rollback is itself reversible. 404 if no backup exists; 409 on version conflict.
Why merge, not swap: infra fields are exactly what other writers legitimately change between deploy and rollback — index_service clears active_index on index deletion and sets it on creation, webhook_service rewrites metadata — and restoring deploy-time values would, e.g., point the indexId metafield at a deleted index. This mirrors the versioning plan's "infra fields are never restored" rule, and a dedicated test pins it (rollback after an active_index change preserves the new active_index).
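The merge-not-swap rule, including the fact that rollback refreshes the backup with the content it replaces, can be sketched as follows. Records are modelled as dicts and merge_backup_content is a hypothetical name; audit stamping and the conditional write are omitted.

```python
from typing import Any, Dict, Mapping, Tuple

CONTENT_FIELDS = ("ui_components", "selector_components", "configuration")


def merge_backup_content(
    live: Mapping[str, Any],
    backup: Mapping[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (new_live, new_backup) for a rollback.

    Only content fields move. Infra fields on the live record are kept as
    they are now, not as they were at deploy time.
    """
    new_live = dict(live)
    new_backup = dict(backup)
    for field in CONTENT_FIELDS:
        new_live[field] = backup.get(field)
        new_backup[field] = live.get(field)
    return new_live, new_backup
```

Applying the function twice returns the original content, which is the "rollback is itself reversible" property; the dedicated test mentioned above corresponds to asserting that active_index on new_live equals the current live value.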
Atomicity: DDB and Shopify metafields are two systems; a single atomic commit is impossible. Within DDB, backup+promote (and rollback's swap-back) are atomic transactions. Across systems the order is DDB-first, mirror second, explicit client retry on partial — the same consistency model the existing save path already uses (207 handling at storefront_routes.py:210-236), with the honest caveat in step 5 about what 207 means for live traffic.
Auth (per the locked scope matrix in docs/plans/storefront-admin-sso.md §4.5): POST /deploy requires settings:deploy_live only (promote-but-not-edit is a valid reviewer persona); POST /deploy/rollback requires settings:write + settings:deploy_live (it chooses non-head content to put live — deliberately aligned with their stricter restore-to-live rule); the no-theme_id live save requires both once scoped tokens land.
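The scope requirements in this paragraph reduce to a small lookup. The sketch below is only an illustration of the matrix as stated here; the operation names and is_authorized are hypothetical, and the authoritative matrix is the one in the auth plan.

```python
from typing import Dict, FrozenSet, Iterable

# Operation names are hypothetical labels for the three endpoints above.
REQUIRED_SCOPES: Dict[str, FrozenSet[str]] = {
    "deploy": frozenset({"settings:deploy_live"}),
    "rollback": frozenset({"settings:write", "settings:deploy_live"}),
    "live_save": frozenset({"settings:write", "settings:deploy_live"}),
}


def is_authorized(operation: str, token_scopes: Iterable[str]) -> bool:
    """True when the token carries every scope the operation requires."""
    return REQUIRED_SCOPES[operation] <= set(token_scopes)
```

A reviewer token holding only settings:deploy_live can therefore promote staged settings but cannot roll back or save directly to live.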
DELETE /shops/{domain}/settings/themes/{theme_id} (new)
Deletes the staged DDB record and its search_settings_theme_{id} metafield. Note: metafield deletion is net-new GraphQL surface — the codebase only has metafieldsSet today (graphql/mutations/metafield_mutations.py:5-23). Needs the metafieldsDelete mutation, a ShopifyService.delete_metafield method with userErrors handling mirroring the existing pattern (shopify_service.py:165-171), and a response transformer + test. Tolerates an already-absent metafield. Used by the editor's "discard staging" action and for cleaning up records for deleted themes.
4. Editor UX (storefront_admin)
- Theme picker in the editor header (components/layout/header.tsx + new theme-picker.tsx): dropdown of themes from GET /themes, with entries like Live — Dawn (current) (amber/red LIVE badge) and Preview — Dawn Redesign (+ "staged changes" dot when has_staged_settings). Selecting a target reloads settings for that target via GET /settings?theme_id=.
- Default target: the live theme, preserving current behavior and muscle memory; with the LIVE badge and save-guard below, the "didn't realize I was editing prod" failure mode is gone. (Defaulting to a staging theme was considered and rejected: which one? And silently editing a theme the merchant didn't pick is its own surprise.)
- Save behavior (requirements (b)+(c)):
  - Target = staging theme: button reads Save to "<name>"; plain save, no friction.
  - Target = live: button reads Save to LIVE with warning styling; clicking opens a confirm dialog ("This updates the live storefront for all customers immediately. Consider saving to a preview theme and deploying instead."). Confirmation state is per-session, not per-click-forever (no "don't ask again" persistence in v1).
- Deploy button: visible when target is a staging theme with staged settings; opens a confirm dialog showing source theme name, live last_updated/updated_by, and a coarse diff summary (counts of components whose serialized JSON differs between staged and live; full visual diff is the versioning plan's territory). Opening the dialog fetches the live record fresh (GET /settings, no theme param), since the live payload is not in memory while editing a staging target; that fetch supplies both the diff baseline and the expected_live_version (meta.version) sent with POST /deploy. On 409 version conflict: re-fetch, re-show dialog with a "live changed since you opened this" banner. On 207 deployed_not_live: persistent error banner with a retry action, because customers are still on the old settings until retry succeeds. A "Roll back last deploy" action (calls /deploy/rollback) lives behind an overflow menu with its own confirm.
- Preview link: when target is a staging theme, a "Preview on storefront" link to https://{domain}/?preview_theme_id={theme_id} (and the existing search-preview iframe keeps rendering local state as today; components/preview/search-preview.tsx is target-agnostic since it renders from in-memory settings).
- State plumbing: this is a real refactor of use-settings.ts, not a parameter add. Today the hook is single-record: one backendSnapshot, one editVersionRef, one load effect keyed on [getClient, shopifyDomain] (use-settings.ts:271-323). It becomes target-keyed: targetThemeId state joins the load-effect deps (target switch = reload), backendSnapshot/isDirty/editVersionRef reset on target switch, and switching with unsaved changes prompts (discard/save-first). getSettings/saveSettings in api-client.ts take optional themeId; new listThemes, deploySettings, rollbackDeploy, deleteThemeSettings client methods; settings-context.tsx exposes target + deploy actions. Holding parallel per-target edit buffers was considered and rejected for v1 (reload-on-switch is simpler and the prompt prevents data loss).
- Degraded mode: GET /themes → 409 no_shopify_session: hide the picker, show a banner "Theme staging unavailable for this shop (no Shopify session) — saves go directly to live", keep today's exact flow.
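The coarse diff summary in the deploy dialog (a count of components whose serialized JSON differs) is simple enough to pin down with a sketch. The editor would implement this in TypeScript; the logic is shown in Python to keep one language across examples. It assumes the component collections are mappings from component name to a JSON-serializable config, and count_changed_components is a hypothetical name.

```python
import json
from typing import Any, Mapping


def _canonical(value: Any) -> str:
    """Serialize with sorted keys so key order alone never counts as a change."""
    return json.dumps(value, sort_keys=True)


def count_changed_components(
    staged: Mapping[str, Any], live: Mapping[str, Any]
) -> int:
    """Count components that were added, removed, or whose JSON differs."""
    changed = 0
    for name in set(staged) | set(live):
        if name not in staged or name not in live:
            changed += 1
        elif _canonical(staged[name]) != _canonical(live[name]):
            changed += 1
    return changed
```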
5. Migration & rollout (backward compatible)
There is no data migration. Existing single-record shops are already in the target state: their record is the live record; staged records appear lazily on first theme-targeted save.
Deploy order (each step independently safe):
1. Theme extension (Liquid + loader): the new resolution block is pure fallback: shops with no theme metafields take the marqo_settings = shop.metafields.marqo.search_settings path, identical to today. Theme app extension versions roll out globally to all shops on release, so this must be merged with the fallback verified in e2e/manual testing first.
2. Backend (model fields, repo methods, routes, GraphQL theme query, metafield key parameter): additive; legacy request shapes (no theme_id) hit unchanged code paths. New optional model fields require no backfill (Pydantic defaults).
3. Editor (theme picker, guarded save, deploy): ships last; old editor builds keep calling the legacy shapes.
Coordination with sibling plans: every settings WRITE this plan introduces — staged saves, deploy, rollback — depends on concurrency Phase 1's repo primitives (server-side record_version + _version_condition + conditional save; independently deployable, no client coordination — settings-concurrency-control.md §10). Staged saves are deliberately versioned from birth (§3 step 4): the path has no pre-existing clients, so shipping it un-versioned and retrofitting later would create exactly the lost-update window the concurrency plan exists to close — there is no "staged saves before Phase 1" configuration. What IS dependency-free: the Liquid/theme-extension resolution, theme listing, the GET extensions, DELETE, and the editor's read/preview UX — those can land in any order, but requirements (a)–(d) all activate only with Phase 1 in place. Phase 1 is the smallest, first, server-only step of the four-plan program, so this gates little in practice. Deploy/rollback additionally carry their own transactional backup + content-merge rollback until versioning's general restore supersedes both; when versioning lands, deploy emits its event_type="deploy" version event and the backup/rollback machinery is retired. Scoped-token enforcement on deploy activates when the auth plan lands; until then the endpoint is full-access-API-key-only (current auth model), which is no weaker than today's live save.
6. Edge cases
- Staging theme gets published (role → MAIN): resolution rule instantly serves live settings on it (no leak). Its staged record stays; the editor shows it under its new role and POST /settings?theme_id= now rejects it (409); the merchant deploys it properly or discards. Deploy from it remains allowed (deploy copies content; the source's role is irrelevant, but the deploy response warns when source role is MAIN so the editor can suggest cleanup).
- Live theme unpublished (another published): the old MAIN theme becomes unpublished; if it has a stale staged record, that now starts serving on previews of it, which is the correct semantics (it's a preview theme now). Live traffic serves the new MAIN theme → live metafield. No action needed.
- Theme deleted in Shopify: staged record + metafield orphaned. GET /themes joins records to themes; records without a matching theme are listed as role: "deleted" with delete-only affordance in the editor. Orphaned metafields are inert (nothing resolves them) and removed by the DELETE endpoint.
- Theme renamed: theme_id is stable; theme_name on the record is display-only and refreshed on every staged save and theme listing.
- No or stale Shopify access token (session expired/never installed via OAuth/token revoked): theme listing impossible → both "no token" and "token rejected by Shopify" map to 409 no_shopify_session → degraded live-only editor mode; staged saves would 409 at the role-check step anyway. Documented, not silent, never a 500.
- Metafield size: see §7. The JSON-metafield write limit is the binding constraint end-to-end and large merchants are already close to the future 128KB ceiling. New paths (staged save, deploy) fail fast at the ceiling with 422 (§3); the legacy live path is deliberately left byte-identical (oversize → 207, as today) to avoid regressing CSS-heavy shops, with a regression test pinning that.
6a. App embed disabled on the staging theme: app embeds are enabled per theme. A theme duplicated from the live theme inherits the Marqo embed's enablement (the normal redesign workflow; works out of the box), but a freshly installed theme-store theme has it disabled → no #marqo-config meta tag → staged settings can't be previewed there at all. The editor's preview link section shows a hint ("If search doesn't appear, enable the Marqo Search app embed for this theme in the Shopify theme editor"). Detecting enablement programmatically (reading the theme's config/settings_data.json via the Asset API) is out of scope for v1.
- Two editors staging the same theme: protected by the per-(pk,sk) optimistic lock from the staged path's first release. Staged saves are versioned from birth (§3 step 4), so the second writer gets the canonical settings_conflict 409 and refreshes. Unlike the live record, staged records never pass through an unguarded last-writer-wins phase.
- Shopify.theme JS global absent/blocked: irrelevant; resolution is Liquid-side and the JS global is never load-bearing.
- Markets/locale domains: localization.* attributes in the embed are orthogonal; theme resolution is per-rendering-theme regardless of market.
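Several of the edge cases above follow directly from the resolution rule, so a small Python model of it may help when reasoning about them. This is a sketch of the rule as described in this section, not the Liquid code itself: resolve_settings and the "theme"/"live" source labels are hypothetical, and the lowercase role comparison is an assumption about how the role string is exposed in the rendering context.

```python
from typing import Mapping, Optional, Tuple

LIVE_KEY = "search_settings"


def theme_metafield_key(theme_id: int) -> str:
    """Key of the per-theme staged mirror, following search_settings_theme_{id}."""
    return f"search_settings_theme_{theme_id}"


def resolve_settings(
    metafields: Mapping[str, str], theme_id: int, theme_role: str
) -> Tuple[Optional[str], str]:
    """Return (settings_json, source) for the theme being rendered.

    The live (MAIN) theme always gets the live metafield. Any other theme
    gets its own staged metafield when one exists, and falls back to the
    live metafield otherwise.
    """
    if theme_role.lower() != "main":
        staged = metafields.get(theme_metafield_key(theme_id))
        if staged:
            return staged, "theme"
    return metafields.get(LIVE_KEY), "live"
```

Under this model, a staging theme that gets published keeps its staged metafield but stops resolving it, and an orphaned metafield for a deleted theme is never reached because no rendering theme carries its id.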
7. Size limits & sharding trajectory (the 400KB question)
Raynor asked all settings planners what happens when settings outgrow DynamoDB's 400KB item cap, and whether split-records (shards) are needed. This plan's angle, aligned with the statements in settings-concurrency-control.md §6.1 and settings-versioning.md §"sharding":
Per-theme records multiply record COUNT, not item size. Each staged record is a full settings document in its own DDB item with its own independent 400KB budget and its own lock counter. The largest real merchant payload today is ~120KB (Muji CA; ~30% of the DDB cap). This feature adds zero new pressure on any single item's size.
The binding constraint end-to-end is the Shopify metafield mirror, not DDB. Most metafield types cap at 64KB, but json-type values (what search_settings uses, settings_service.py:183-187) are currently 2MB for apps that used JSON metafields before April 2026 (grandfathered — this app qualifies), dropping to 128KB per write on API version 2026-04+ per the Shopify changelog ("Reduced metafield value sizes") — confirm the exact effective version and grandfathering scope during the implementation spike (step 1). Two API-version pins matter and they differ: the GraphQL client performing the metafield writes pins ShopifyAPI.VERSION = "2024-01" (constants/shopify.py:11) — the pin that governs the write-path limit — while all 16 shopify.app.*.toml manifests declare api_version = "2025-04" (webhook/extension surface). Bumping ShopifyAPI.VERSION past 2026-04 is the explicit 128KB tripwire; that bump must not happen without checking per-shop mirror sizes against the new cap. Since the bump is eventually inevitable, 128KB is the design ceiling — and the largest merchant payload (~120KB, Muji CA) is already at ~94% of it. Ordering of walls: metafield 128KB → DDB 400KB → TransactWriteItems 4MB (the deploy transaction is 2 items + 1 condition check, ~240KB worst case today — never binding). The per-theme metafield keys help here: each staged theme gets its own 128KB budget instead of sharing one value (and Plan B in §2 would forfeit exactly that).
Consequences adopted in this plan: the new-path 422 guard (§3) checks the serialized mirror JSON against the 128KB ceiling (not DDB's 400KB, which the metafield wall makes unreachable); implementation should add a warn-level size log/metric at 100KB per (shop, theme) alongside the existing size log (settings_service.py:178-180), complementing the concurrency plan's 300KB DDB alarm.
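A minimal sketch of the new-path size guard and the warn threshold described above. Everything here is illustrative: check_mirror_size and MirrorTooLargeError are hypothetical names, 1KB is assumed to mean 1024 bytes, and compact json.dumps output stands in for whatever serialization the real mirror write uses. The actual byte accounting should be confirmed in the step 1 spike.

```python
import json
import logging

logger = logging.getLogger("settings_size")

# Assumptions: KB = 1024 bytes; the real limit semantics are confirmed in the spike.
METAFIELD_CEILING_BYTES = 128 * 1024
WARN_BYTES = 100 * 1024


class MirrorTooLargeError(ValueError):
    """Would be mapped to a 422 by the staged-save and deploy routes."""


def mirror_size_bytes(settings: dict) -> int:
    """Size of the serialized mirror JSON in UTF-8 bytes (compact separators)."""
    return len(json.dumps(settings, separators=(",", ":")).encode("utf-8"))


def check_mirror_size(settings: dict, shop: str, theme_id: int) -> int:
    """Fail fast above the ceiling, warn above 100KB, and return the size."""
    size = mirror_size_bytes(settings)
    if size > METAFIELD_CEILING_BYTES:
        raise MirrorTooLargeError(
            f"mirror for {shop} theme {theme_id} is {size} bytes, "
            f"ceiling is {METAFIELD_CEILING_BYTES}"
        )
    if size >= WARN_BYTES:
        logger.warning("settings mirror near ceiling: shop=%s theme=%s bytes=%d", shop, theme_id, size)
    return size
```

The guard would run before the DDB write on the new paths only; the legacy live path keeps its current behavior and does not call it.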
Sharding: defer, with the shared trigger. This plan adopts the cross-plan trajectory verbatim: single item per scope now; if a shop crosses the 300KB DDB alarm, move to manifest + N shard items behind a settings_schema_version bump with the lock counter on the manifest only — the key shape extends naturally (SETTINGS#THEME#{id} stays the root/manifest; shards would be SETTINGS#THEME#{id}#SHARD#{n}), and expected_live_version semantics are unaffected because the version always lives on exactly one root item per scope. But note the honest ordering above: a payload big enough to need DDB sharding has already broken the single-metafield distribution model at 128KB, so the realistic trigger is metafield growth, and the realistic response is splitting the mirror (per-component metafields, or serving settings from CDN/KV instead of metafields) — alarmed here, designed when actually approached, out of scope for this plan.
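To make the key shapes concrete, here is a small sketch of sort-key helpers and a validator. It assumes the three v1 shapes are the live key, the per-theme key, and the deploy backup key named elsewhere in this plan, and that theme ids are numeric; the helper names are hypothetical. The shard key is shown only to illustrate how the shape would extend, and the validator deliberately rejects it in v1.

```python
import re

LIVE_SK = "SETTINGS"
BACKUP_SK = "SETTINGS#DEPLOY#BACKUP"
_THEME_SK = re.compile(r"SETTINGS#THEME#([0-9]+)")


def theme_sk(theme_id: int) -> str:
    """Sort key of the staged record (and future manifest) for one theme."""
    return f"SETTINGS#THEME#{theme_id}"


def shard_sk(theme_id: int, n: int) -> str:
    """Future shard key under the same root; not a valid v1 shape."""
    return f"{theme_sk(theme_id)}#SHARD#{n}"


def is_valid_settings_sk(sk: str) -> bool:
    """Accept only the three v1 shapes: live, per-theme, deploy backup."""
    return sk in (LIVE_SK, BACKUP_SK) or _THEME_SK.fullmatch(sk) is not None


def visible_in_account_listing(sk: str) -> bool:
    """The account listing matches sk by equality, so prefixed keys never appear."""
    return sk == LIVE_SK
```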
Cross-plan interfaces
Interface proposals were exchanged and confirmed by all three teammates (2026-06-10/11); the agreed contracts below are also recorded in their plans' cross-plan sections.
- settings-concurrency (docs/plans/settings-concurrency-control.md, confirmed): locking is strictly per (pk, sk) record; SETTINGS#THEME#{id} inherits the mechanism unchanged with its own independent counter. Adopted contract: storage attribute record_version (exposed as version in API payloads), legacy/absent ≡ version 0 via attribute_not_exists, helpers _version_condition(expected_version) / put_item_versioned / update_item_versioned on BaseRepository, and canonical save_settings(settings, *, expected_version, change_source, extra_transact_items=None); deploy's backup Put and optional staged ConditionCheck ride extra_transact_items. Conflicts raise SettingsConflictError → 409 with {detail: {code: "settings_conflict", expectedVersion, currentVersion, lastUpdated, updatedBy, changeSource}}. Agreed rules: deploy routes through SettingsService.create_or_update_settings(..., expected_version, change_source="theme_deploy") (value added to their enum) rather than the repo; the source counter is never copied onto the target (live increments from its own value); deploy responses return the new live version. Deploy depends on their Phase 1 (server-side only, independently deployable); they rejected last_updated as a condition attribute and this plan follows that rejection.
- settings-versioning (docs/plans/settings-versioning.md, confirmed): history lives in a dedicated SHOPVER#{domain} partition (NOT under SHOP#, for partition-sweep isolation), sk live#{record_version:010d} / theme#{theme_id}#{record_version:010d}; per-record histories cannot interleave, and theme history numbering is per-theme because each theme record has its own counter. Deploy event mapping (their schema, confirmed): the deploy/save distinction is event_type="deploy" (NOT change_source, which the concurrency plan owns as the writing-path label); this plan's source_theme_id → source_scope="theme:{theme_id}" + source_version=<staged record_version deployed>; deployed_by → author_id (canonical actor string). Capture hooks inside SettingsService.create_or_update_settings, whose signature they're extending with the deploy kwargs; deploy gets history capture for free by routing through it (§3 step 3). Their v1 captures live-scope only; wiring capture for saves to staged theme records is a small extension owned by this plan (a target-record param on the shared path or a direct SettingsVersionService.capture call; converge at implementation). Their restore is content-fields-only (the same rule this plan's rollback follows) and supersedes the rolling backup + rollback endpoint when it lands.
- storefront-admin-auth (docs/plans/storefront-admin-sso.md, confirmed; scope names adopted verbatim in their §4.5/§8.1 matrix): staged save + theme-record delete → settings:write; direct live save (no theme_id) → settings:write + settings:deploy_live; POST /deploy → settings:deploy_live only (independent-flags accepted; promote-but-not-edit persona works); POST /deploy/rollback → both (aligned with their stricter restore-to-live rule). Sessions carry both scopes (interactive auth + this plan's confirm dialogs as the gate); CLI tokens get deploy_live only via warning-gated opt-in; enforcement is immediate when scoped tokens land (resolves Open question 2). Actor identity for deployed_by/audit fields: user:{cognito_sub}|token:{token_id}|api_key:{system_account_id}. Legacy raw API keys: wrapped by their new authenticate_storefront_request dependency with all settings scopes and shops=('*',), expressed in code on the legacy branch with no data migration, persisting through their STOREFRONT_LEGACY_KEYS allow→warn→deny ratchet, so existing API-key callers (including deploy) keep working until deny.
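The scope matrix agreed with storefront-admin-auth can be written down as a small lookup, which may be useful as a reference when wiring route dependencies. This is a sketch under assumptions: required_scopes, is_allowed, and the operation labels are hypothetical names, and the real enforcement lives in the auth plan's dependency, not here.

```python
from typing import FrozenSet, Iterable, Optional

WRITE = "settings:write"
DEPLOY_LIVE = "settings:deploy_live"


def required_scopes(operation: str, theme_id: Optional[int] = None) -> FrozenSet[str]:
    """Scopes required per operation, following the matrix in this section."""
    if operation == "save":
        # A staged save needs write only; a direct live save (no theme_id) needs both.
        return frozenset({WRITE}) if theme_id is not None else frozenset({WRITE, DEPLOY_LIVE})
    if operation == "delete_theme_record":
        return frozenset({WRITE})
    if operation == "deploy":
        return frozenset({DEPLOY_LIVE})
    if operation == "rollback":
        return frozenset({WRITE, DEPLOY_LIVE})
    raise ValueError(f"unknown operation: {operation}")


def is_allowed(granted: Iterable[str], operation: str, theme_id: Optional[int] = None) -> bool:
    """True when the granted scopes cover everything the operation requires."""
    return required_scopes(operation, theme_id) <= frozenset(granted)
```

With only settings:deploy_live granted, this lookup allows deploy but rejects staged saves and rollback, which matches the promote-but-not-edit persona.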
Test plan
Per CLAUDE.md, tests are a completion precondition for every new branch and error path.
Backend (pants test //components/shopify/admin_server::)
- Repository: theme record round-trip with new sk; list_theme_settings returns only SETTINGS#THEME#* (excludes live and SETTINGS#DEPLOY#BACKUP); list_settings_by_system_account excludes theme and backup records; delete removes only the targeted record; the deploy transaction (canonical save_settings + extra_transact_items) on condition failure surfaces SettingsConflictError with neither item written.
- Service: first staged save copies live (created_from="live"); staged save without a live record → no_live_settings error; staged merge semantics match live merge; save_settings_to_metafields writes search_settings_theme_{id} for staged and search_settings for live; over-limit payload on staged save fails fast (422 semantics) before the DDB write; staged save with a stale version → canonical settings_conflict 409 with no write, and first staged create succeeds via attribute_not_exists(record_version); regression: a near-ceiling legacy live save still succeeds exactly as today (DDB written, 207 on metafield failure, no 422), with the fixture parameterized at ~95% of the 128KB metafield design ceiling (~122KB, mirroring the largest real payload, ~120KB Muji CA) and derived from the same constant the 422 guard uses, so the test tracks the ceiling if it changes; deploy copy preserves active_index/system_account_id/cell_id and stamps deploy audit fields; sk validator accepts the three shapes and rejects others.
- Routes (storefront_routes_test.py): GET /themes happy path (mocked GraphQL), no-token 409, stale-token (Shopify 401) → 409 not 500, has_staged_settings join; GET /settings?theme_id exists/not-exists meta shapes and live fallback payload; POST /settings?theme_id rejects MAIN-role theme (409), unknown theme (404), missing live record (409), writes correct record + metafield key, 207 partial with the "not previewable" message on metafield failure, no public-metafield writes on staged save; legacy no-param save byte-identical behavior (regression); deploy: success writes backup + live transactionally with incremented record_version, missing staged record 404, record_version mismatch → 409 with neither live nor backup mutated, legacy version-0 live record deploys via attribute_not_exists, metafield failure → 207 deployed_not_live, re-run converges content, staged record retained, deployed live record preserves active_index/system_account_id/cell_id/metadata from the current live record; rollback: restores backup content onto current live, preserves an active_index/metadata changed after the deploy (the clobber test; it must fail against a verbatim-swap implementation), swaps current live content into backup (rollback is reversible), 404 with no backup, 409 on conflict; staged + backup records carry no system_account_id and never appear in GSI_SystemAccountId; DELETE removes record + metafield and tolerates missing metafield.
- GraphQL: theme query GID→numeric id extraction.
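The GID→numeric id extraction named in the GraphQL test item could look roughly like the sketch below. The function name theme_id_from_gid is hypothetical, and the expected resource type OnlineStoreTheme is an assumption about what the theme query returns; it should be checked against the actual GraphQL response in the spike.

```python
import re

_GID = re.compile(r"gid://shopify/(?P<type>[A-Za-z]+)/(?P<id>[0-9]+)")

# Assumption: the theme query returns ids of this resource type.
EXPECTED_TYPE = "OnlineStoreTheme"


def theme_id_from_gid(gid: str) -> int:
    """Extract the numeric theme id from a Shopify global id string.

    Raises ValueError for malformed ids or ids of another resource type,
    so a bad value never becomes part of a sort key or metafield key.
    """
    match = _GID.fullmatch(gid)
    if match is None or match.group("type") != EXPECTED_TYPE:
        raise ValueError(f"not a theme gid: {gid!r}")
    return int(match.group("id"))
```

Rejecting unexpected types keeps a product or shop id from being silently used as a theme id in the staged record's key.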
Editor (npm test in components/storefront_admin)
- api-client: themeId query param threading, deploy/rollback/delete/listThemes request shapes.
- use-settings: target-keyed reload, dirty reset on target switch, switch-with-unsaved-changes prompt, staged saves send version from meta.version and surface the 409 conflict refresh prompt, deploy flow incl. fresh live fetch for expected_live_version, 409 conflict refresh path, 207 deployed_not_live banner state.
- Components: theme picker rendering (live badge, staged dot, deleted-theme entry), live-save confirm dialog gating the save call, deploy dialog content + conflict banner + retry banner, rollback confirm, degraded no-session banner, embed-enablement hint.
Liquid/widget (manual + e2e)
- Dev store smoke test (step 1 of implementation): duplicate theme, write a theme-keyed metafield by hand, verify dynamic-key lookup and role-based selection in preview vs. live, verify data-settings-source. Then an e2e scenario in components/shopify/e2e_tests (CI): staged save → preview URL shows staged CSS, live URL unchanged → deploy → live shows it. Loader change (readConfigFromDom new attributes) covered by existing storefront_search vitest patterns if present, else by the e2e.
Implementation order
1. Liquid feasibility spike on a dev store: load-bearing assumptions, fail fast; fall back to Plan B (§2) if it fails. Checklist: (a) theme.id/theme.role are available inside a target: "head" app-embed block specifically (global-object docs say yes; verify in this exact rendering context); (b) dynamic bracket lookup shop.metafields.marqo[var]; (c) role-based selection behaves correctly under ?preview_theme_id= and the theme-editor preview; (d) confirm the JSON-metafield write limit effective for ShopifyAPI.VERSION (§7).
2. Model + repository + service (staged records, sk validator, metafield key param, new-path size guard, list_shop_sessions sk-prefix hygiene fix) + tests. The staged save methods build on concurrency Phase 1's repo primitives (§5); read paths and record shapes do not.
3. Theme GraphQL query + GET /themes + tests.
4. Extended GET/POST settings routes + tests (staged POST blocked on concurrency Phase 1, as step 2).
5. Deploy + rollback + DELETE endpoints (canonical conditional save + extra_transact_items backup; content-merge rollback; net-new metafieldsDelete mutation + ShopifyService.delete_metafield + userErrors test per §3) + tests. Blocked on settings-concurrency Phase 1, as are the staged write paths in steps 2 and 4 (§5); steps 1, 3, 6, the read paths, and the editor work minus save/deploy wiring are not.
6. Theme extension change (Liquid + loader debug attrs) + e2e.
7. Editor: client + hook refactor, theme picker, guarded save, deploy/rollback dialogs + tests.
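The content-merge rollback in the deploy/rollback step is easy to get wrong as a verbatim record swap, so a minimal sketch of the intended behavior follows. It assumes the content fields are ui_components, selector_components, and configuration (the fields ShopifySettings carries besides index-association and audit data); rollback_content is a hypothetical name, and versioning, transactions, and the metafield mirror are left out.

```python
from typing import Tuple

# Assumption: these are the "content" fields; everything else stays with its record.
CONTENT_FIELDS = ("ui_components", "selector_components", "configuration")


def rollback_content(live: dict, backup: dict) -> Tuple[dict, dict]:
    """Swap only the content fields between the live and backup records.

    Non-content fields on live (for example active_index or metadata changed
    after the deploy) are preserved, and the previous live content moves into
    the backup so the rollback itself can be undone.
    """
    new_live = dict(live)
    new_backup = dict(backup)
    for name in CONTENT_FIELDS:
        new_live[name] = backup.get(name)
        new_backup[name] = live.get(name)
    return new_live, new_backup
```

Applying the function twice returns both records to their starting content, which is the reversibility property the rollback tests pin.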
Out of scope
- Per-theme public metafields (searchBase, indexId, storefrontAccessToken, baseCurrency, cdnBase): infrastructure config, theme-invariant.
- Staging for index/search settings (x-marqo-settings-override already serves tuning), merchandising rules, or promo metaobjects.
- Per-theme widget bundle versions (settings only).
- Embedded Shopify app settings page (settings_routes.py) and the deprecated admin_ui ui-customization page: they keep their current live-write behavior; adding the live-save warning there is a fast follow.
- Visual diff of staged vs. live settings (versioning plan owns history/diff UX; v1 deploy dialog shows a coarse component-count diff only).
- Automatic deploy on Shopify theme publish events (explicitly rejected; publish must never imply settings promotion).
Open questions
1. storefront_routes.py:160,218,233 claims settings propagate via "DDB stream → KV export", but no such path exists for ShopifyEntities (verified: ecom_settings_exporter/lambda_function.py:106,156 reads only INDEX_SETTINGS_TABLE_NAME). The 207 response messages promising KV propagation are therefore misleading today. Fixing those two legacy messages is owned by the settings-concurrency plan (its §4.5 touches that route); this plan only ensures its new messages don't repeat the claim. If a ShopifyEntities→KV export is ever built, staged/backup records must be excluded from it (export only sk=SETTINGS). To confirm with Raynor.
2. Should settings:deploy_live on the no-theme_id POST be enforced immediately when the auth plan lands? Resolved with storefront-admin-auth: enforcement is immediate for scoped tokens; legacy raw keys are exempt via their dependency's legacy branch until the STOREFRONT_LEGACY_KEYS ratchet reaches deny (see Cross-plan interfaces).