Scheduled Merchandising Rules — Design Doc
Author: Ruchira (Richie) Jayasekara
Date: 2026-06-01
Summary
This document describes the design for scheduled merchandising rules in Marqo Cloud. Today, a set of merchandising rules for a given search query or collection (a "view") goes live the instant it is saved and stays live until manually deleted. This feature lets merchandisers attach scheduled overrides to a view: each override is a rule set with a publish window that temporarily replaces the view's always-on baseline rules while its window is active. A trigger can keep its everyday rules live now, swap in a campaign rule set for a window, and automatically revert afterwards — without going dark or losing the baseline.
The resolution happens in the Cloudflare worker at request time: the exporter writes the baseline plus all of a trigger's scheduled overrides into the trigger's KV entry, and the worker picks the override active at the current timestamp on each search. This reuses the worker's existing runtime-selection machinery (it already buckets A/B-test variants per request).
Problem Statement
Merchandisers run time-bound campaigns: a "Black Friday" set of pinned products that should go live at 00:00 on the Friday and come down at 00:00 on the Monday; a seasonal collection ordering that should appear only during a sale window; a homepage promotion that activates at a launch time. Today there is no way to express when a view's rules are active. The only options are:
- Manually save the rules at the exact desired start time, and manually delete them at the end time, or
- Build and maintain external automation (cron jobs, runbooks) to call our APIs at the right moments.
Both are error-prone, require human presence at inconvenient times, and provide no audit trail of intended schedules. Customers have asked for first-class scheduling so campaigns activate and deactivate automatically.
The system must let a user attach one or more scheduled overrides to a view, persist them, make the correct rule set affect customer search at the right time, and reflect the scheduled/active/expired state in the Console.
Glossary
- View: All merchandising rules for one `(context, trigger)` pair, e.g. all pins/excludes/boosts for the search query "shoes". Stored as a `MerchandisingViewRecord` at DynamoDB sort key `RULES#{context}#{trigger}`.
- Trigger / Context / Profile-scoped view: the search query or collection; `search` or `page`; a per-profile override at `PROFILE_RULES#{context}#{profile_id}#{trigger}`.
- Controller: The Django REST service that performs CRUD on merchandising records and writes them to DynamoDB.
- Exporter: The merchandising exporter, an AWS Lambda (in `cloud_control_plane`) triggered by EventBridge every 5 minutes, that reads recently-changed merchandising records from DynamoDB and writes them to Cloudflare KV. It is read-only on the DynamoDB merchandising table.
- Worker: The Cloudflare worker (`cloud_data_plane/components/cloudflare-worker`) that applies merchandising rules to live customer search requests by reading from KV. It already performs per-request runtime selection (A/B-test variant bucketing).
- KV / Cloudflare KV: The key-value store the worker reads at query time. Each trigger's KV value carries the baseline rules plus embedded alternatives (`ab_tests`, `profile_rules`, and now `scheduled_overrides`).
- `#static` row: A DynamoDB bookkeeping row (`pk = "#static"`, `sk = "{system_account_id}-{index_name}"`) the Controller bumps (`updated_at`) on every merchandising write; the exporter uses it to find recently-changed indexes.
- Baseline rules: A view's always-active rule set (the existing `rules` field), live whenever no override is active.
- Scheduled override: A rule set with a publish window `(publishAt, unpublishAt)` that replaces the baseline while the window contains the current time. A view may have several.
- Active window: `(publishAt is None or now >= publishAt) and (unpublishAt is None or now < unpublishAt)`; publish inclusive, unpublish exclusive.
- Resolution: choosing the rule set in effect at `now` (done by the worker): among the overrides whose window contains `now`, the one with the latest `publishAt` wins (`None` sorts earliest; ties broken by list order); if none are active, the baseline applies.
Tenets
In ranked order of priority:
- Reliability: A scheduled rule must reliably activate and deactivate. A missed transition (rules that never go live, or a promotion that never comes down) is a visible customer-impacting failure.
- Simplicity / Reuse: We reuse the worker's existing per-request selection machinery (it already picks an A/B variant per request) and the existing change-based export — no new scheduler, no new infra, exporter stays read-only.
- Customer Obsession: The Console must make the scheduled/active/expired state obvious and the controls intuitive.
- Backward compatibility: Existing views (no overrides) behave exactly as before, with no migration.
Functional Requirements
- FR-1: A user can attach one or more scheduled overrides — each a rule set with a publish time and/or unpublish time — to a view, alongside its baseline rules.
- FR-2: An override with a publish time in the future must not affect customer search until that time is reached.
- FR-3: An override with an unpublish time in the past must not affect customer search after that time.
- FR-4: An override with a publish time but no unpublish time stays active indefinitely once published (until superseded by a later-publishing override or removed).
- FR-5: A view with no overrides behaves exactly as today — its baseline rules are active immediately and indefinitely; no migration of existing data.
- FR-6: When no override is active, the view's baseline rules apply. This lets a user keep current rules live now AND schedule a different rule set for a window without going dark or overwriting the baseline.
- FR-7: The Console displays each trigger's scheduled-override state: Scheduled (an override is upcoming), Active (an override is live now), or Expired (all overrides are past); no chip when there are no overrides.
- FR-8: Overrides apply at the level of a whole rule set for a `(context, trigger)` default view only. Scheduled overrides on profile-scoped views are NOT supported: the exporter and worker resolve overrides only for default views (see Out Of Scope).
- FR-9: An invalid override is rejected at write time with a client error: publish time at or after unpublish time, neither bound set, a non-ISO timestamp, or any override on a profile-scoped view (scheduling is default-profile only; see FR-8).
- FR-10: Overlapping windows are allowed. A trigger may carry overrides whose windows overlap; when more than one is active at the same instant the worker resolves deterministically (latest `publishAt` wins, ties by list order). The Console surfaces a non-blocking client-side warning when the author schedules a rule whose window overlaps an existing one, naming the conflicting rule(s).
Non-Functional Requirements
- NFR-1: Once a schedule is present in KV, activation/deactivation is exact: the worker re-evaluates the window on every request against the current timestamp, so a transition takes effect at its boundary (no polling delay). The only latency is propagating a newly edited schedule to the worker: up to one export cycle (~5 min) plus the worker's KV cache TTL (currently 5 min), so roughly 10 minutes in the worst case.
- NFR-2: The feature adds no additional KV lookup at search time — the overrides ride in the trigger's existing KV value. It does add a small per-request CPU cost (iterate the override list, parse timestamps).
- NFR-3: No new AWS or Cloudflare infrastructure (no new Lambda, schedule, table, stream, or worker route).
- NFR-4: Overlapping windows are allowed; when more than one override is active for a trigger at the same instant the worker resolves deterministically (latest `publishAt` wins, ties by list order). The Console warns the author about overlaps at authoring time, but does not block them.
- NFR-5: Timestamps are absolute UTC instants: stored, exported, and compared as ISO-8601 strings with an explicit UTC designator, end to end. See Timestamps and time zones.
Out Of Scope
- Per-rule scheduling — scheduling a single pin/exclude/boost independently. Covered by override-rule-set scheduling with far less complexity.
- Scheduled overrides on profile-scoped views are NOT supported, and this is enforced. Only default (`RULES#`) triggers resolve overrides; profile (`PROFILE_RULES#`) views export and apply only their baseline rules. The exporter reads overrides solely from default views (`get_scheduled_override_defs` queries `RULES#`), and the worker resolves overrides only on the default trigger entry. Enforcement: the Controller rejects any `scheduled_overrides` on a profile view (`ProfileMerchandisingViewRecord.validate` → HTTP 400), and the Console hides the Schedule panel/action whenever a non-default profile is selected. Combining per-profile selection with per-time selection at runtime is a possible later extension.
- Scheduling of global rules and A/B test rule sets.
- Recurring schedules (e.g. "every weekend"). A window is a single contiguous interval.
Success Criteria
- FR-2/FR-3/FR-6 (correctness): Given a trigger with baseline `R0` and an override `R1` for `[T_p, T_u)`, customer search applies `R0` before `T_p`, `R1` between `T_p` and `T_u`, and `R0` again after `T_u`; verified by worker unit tests (with a frozen clock) and an end-to-end check (see Testing).
- NFR-1 (timing): With the schedule already in KV, the worker selects the correct override at any instant. Manually verifiable by issuing searches either side of a boundary with the schedule pre-seeded.
- FR-5 (compatibility): Existing views with no overrides export and resolve unchanged; no `scheduledOverrides` attribute is written for them.
- FR-7 (UX): The Console shows the correct status chip per trigger.
API Design
The Controller API surfaces the stored model directly: an optional scheduledOverrides array on the view payload (each override mirrors a view's rule fields, plus a window and an optional name label, e.g. "Black Friday"). See the storage model below; validation rejects invalid windows and over-long names with 400 (FR-9). No new endpoints. The name is metadata for the Console only — it is not exported to KV (the worker doesn't need it to resolve).
The KV value the worker consumes gains a sibling scheduled_overrides list next to the existing ab_tests/profile_rules:
// KV key: {system_account_id}-{index_name}|{md5(triggerContext|trigger)}
{
"triggerContext": "search",
"trigger": "shoes",
"pin_rules": { "0": "everyday_hero" }, // baseline — applies when no override is active
"exclude_rules": [], "filter_string": null, "score_modifiers": null,
"inherit_global_score_modifiers": true, "inherit_global_filters": true,
"new_boost_bury_rules": false,
// existing siblings — alternative rule sets the worker selects among at request time:
"ab_tests": [
{
"testId": "test-123",
"testName": "hero swap",
"variants": [
{ "name": "control", "trafficPercent": 50, "rules": null }, // null → baseline
{
"name": "treatment",
"trafficPercent": 50,
"rules": { "pin_rules": { "0": "variant_hero" }, "exclude_rules": [], "filter_string": null,
"score_modifiers": null, "inherit_global_score_modifiers": true,
"inherit_global_filters": true, "reinforcement_learning": null,
"override_relevance": null, "recency": null }
}
]
}
],
"profile_rules": [
{
"profile_id": "vip",
"pin_rules": { "0": "vip_hero" }, "exclude_rules": null, "filter_string": null,
"score_modifiers": null, "inherit_global_score_modifiers": null,
"inherit_global_filters": null, "reinforcement_learning": null,
"override_relevance": null, "recency": null
}
],
// new sibling added by this feature:
"scheduled_overrides": [
{
"publishAt": "2026-11-27T00:00:00+00:00", // camelCase — the worker reads publishAt/unpublishAt
"unpublishAt": "2026-11-30T00:00:00+00:00",
"rules": { // same shape as the baseline fields (snake_case)
"pin_rules": { "0": "black_friday_hero" },
"exclude_rules": [], "filter_string": null, "score_modifiers": null,
"inherit_global_score_modifiers": true, "inherit_global_filters": true,
"reinforcement_learning": null, "override_relevance": null, "recency": null
}
}
]
}
This is exactly the ab_tests/profile_rules embedding convention: alternative rule sets stored under one trigger key, the worker selects one at request time.
Architecture
The merchandising data flow is unchanged in shape; scheduling adds an embedded list on the way out and a resolution step in the worker.
PUT view (+scheduledOverrides)
Console ─────────────────────────────▶ Controller (Django)
│ writes RULES# item: baseline `rules`
│ + `scheduledOverrides`; bumps #static.updated_at
▼
DynamoDB (MERCHANDISING_TABLE)
▲ │ (exporter reads only)
EventBridge (every 5 min) ──▶ Exporter (Lambda)
│ re-export recently-changed indexes:
│ embed ALL overrides under the trigger's KV value
▼
Cloudflare KV ──▶ Worker (per request)
│ getRulesForTrigger:
│ selectActiveOverride(now) →
│ override.rules or baseline
▼
Marqo search
Chosen approach: embed all overrides; resolve in the worker at request time.
1. Exporter embeds, does not resolve. For each trigger, the exporter reads the baseline plus every scheduled override (`get_scheduled_override_defs`), resolves each override's stored rules into the KV-ready shape with the same parse path as the baseline (`_resolve_override_rules` → `_parse_rule_payload` + `TriggerRules`), and embeds them as `scheduled_overrides` on the trigger's KV value (`_build_scheduled_overrides_data`, mirroring `_build_profile_rules_data`). Overrides whose window has already fully elapsed at export time (`unpublish_at` ≤ now) are dropped: they can never be active again, so there's no point shipping them to KV; the future, active, and open-ended overrides are all exported. (Expired overrides remain in DynamoDB so the Console can still show/edit history; this only trims KV.) The exported list is sorted ascending by `publish_at` (`_override_publish_sort_key`; absent `publish_at` sorts earliest, stable for equal keys) so the worker can short-circuit resolution (see step 2). A trigger with overrides but an empty baseline gets a standalone KV entry (same mechanism as profile-only triggers); a trigger whose overrides have all elapsed contributes nothing.
2. Worker resolves at request time. In `getRulesForTrigger` (`rule-loader.ts`), `selectActiveOverride(scheduled_overrides, Date.now())` returns the override whose window contains now (latest `publishAt` wins; ties by list order), or null. Because the exporter ships the list sorted ascending by `publishAt`, the worker stops iterating at the first entry that publishes in the future (every later entry is also future), giving an O(active-prefix) scan rather than O(all overrides). An active override's `rules` replace the effective base (the same full-replacement convention used for an A/B variant), preserving `new_boost_bury_rules`/`triggerContext`/`trigger`. When none is active, the baseline applies.
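The worker-side resolution described above can be sketched as follows. This is a minimal Python illustration of the semantics, not the worker's TypeScript: `parse_ms` and `select_active_override` are hypothetical names, the input is assumed to be the exported list already sorted ascending by `publishAt`, and treating an offset-less timestamp as UTC is an assumption of the sketch.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_ms(iso: str) -> Optional[int]:
    """Parse an ISO-8601 instant to epoch milliseconds; None if unparseable."""
    try:
        dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        return None
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return int(dt.timestamp() * 1000)


def select_active_override(overrides: list, now_ms: int) -> Optional[dict]:
    """Return the active override with the latest publishAt, or None."""
    winner = None
    for entry in overrides:
        pub_raw, unpub_raw = entry.get("publishAt"), entry.get("unpublishAt")
        pub = parse_ms(pub_raw) if pub_raw is not None else None
        unpub = parse_ms(unpub_raw) if unpub_raw is not None else None
        # A bound that is present but unparseable makes the entry not-active.
        if (pub_raw is not None and pub is None) or (
            unpub_raw is not None and unpub is None
        ):
            continue
        if pub is not None and now_ms < pub:
            break  # sorted ascending: every later entry is also in the future
        if unpub is None or now_ms < unpub:
            # Later entries publish at the same time or later, so the last
            # active entry seen wins (ties go to the later list index).
            winner = entry
    return winner
```

A caller would apply `winner["rules"]` when a winner is returned and the baseline otherwise. The short-circuit is only valid if the list really is sorted; an unsorted list would need a full scan.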
Lifecycle — baseline R0 always live, one override R1 for 15:00–17:00:
| Request time | Active override | Rules applied |
|---|---|---|
| 14:59 | none | R0 (baseline) |
| 15:00 | R1 | R1 (campaign) |
| 17:00 | none | R0 (baseline) — auto-revert |
Because the worker evaluates now per request against the schedule already in KV, transitions are exact — no dependency on the export cadence (NFR-1). The exporter only re-runs to propagate edits (change-based, every 5 minutes).
Precedence with A/B tests. Override resolution runs first and sets the effective base; A/B-variant selection runs on top and can further replace it (so an A/B treatment wins over an active override for bucketed users). This composes the two independent dimensions without special-casing; documented so it's intentional.
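The composition order can be illustrated with a small sketch. It is written in Python for brevity rather than the worker's TypeScript; `effective_rules` is a hypothetical name, the A/B bucketing itself is stubbed out (the already-chosen variant is passed in), and the reading that a variant with `rules: null` leaves the current effective base untouched is an assumption.

```python
from typing import Optional

# Fields the text says survive a full replacement of the rule set.
PRESERVED_KEYS = ("new_boost_bury_rules", "triggerContext", "trigger")
# Embedded alternatives that are not part of the baseline itself.
EMBEDDED_KEYS = ("ab_tests", "profile_rules", "scheduled_overrides")


def effective_rules(
    kv_value: dict,
    active_override: Optional[dict],
    variant: Optional[dict],
) -> dict:
    """Apply the override first, then let the A/B variant replace on top."""
    base = {k: v for k, v in kv_value.items() if k not in EMBEDDED_KEYS}

    # Step 1: an active override fully replaces the baseline rule fields.
    if active_override is not None:
        preserved = {k: base[k] for k in PRESERVED_KEYS if k in base}
        base = {**active_override["rules"], **preserved}

    # Step 2: a variant that carries rules replaces whatever step 1 produced.
    if variant is not None and variant.get("rules") is not None:
        preserved = {k: base[k] for k in PRESERVED_KEYS if k in base}
        base = {**variant["rules"], **preserved}

    return base
```

With both an active override and a treatment variant, the treatment's rules come out; with an active override and a `rules: null` control variant, the override's rules come out.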
Data Storage / Modeling
Store: the existing DynamoDB merchandising table. No new table. (See scheduled-rules-storage in the model: baseline rules + optional scheduledOverrides list, each entry carrying its own rules MerchandisingViewRules + window. Serialized camelCase at the top level; the inner pin/exclude objects keep snake_case keys (doc_id), matching what the exporter and worker read. Views with no overrides omit the attribute → no migration, FR-5. A per-view cap, MAX_SCHEDULED_OVERRIDES, bounds item size.)
KV value: as shown in API Design, scheduled_overrides is a list of {publishAt, unpublishAt, rules} entries (camelCase window keys, as the worker reads them) embedded on the trigger entry, alongside ab_tests/profile_rules.
Volume / growth: each override adds a compact rule payload (stored pins are {doc_id, position}) plus two timestamps, bounded by the per-view cap and per-trigger limits, well within the 400 KB DynamoDB item and 25 MB KV value limits.
Timestamps and time zones
publishAt/unpublishAt are ISO-8601 timestamps that denote an absolute instant in UTC — they carry an explicit UTC designator (Z or +00:00), never a bare local wall-clock time. There is no separate "timezone" field; the instant is the contract. The same UTC string flows unchanged from the Controller through DynamoDB and the exporter into the KV value, and the worker compares it against the current instant. Concretely, at each boundary:
- Authoring (Console): the `datetime-local` pickers operate in the merchandiser's browser-local time zone. On change, `localInputToIso` does `new Date(local).toISOString()`, converting that local wall time to the equivalent UTC instant (always `...Z`) before it leaves the browser. So "publish at 9am" is interpreted in the author's local zone at pick time and persisted as the corresponding UTC instant. The panel makes the zone explicit: it states the author's resolved IANA zone (e.g. `America/New_York`, from `Intl.DateTimeFormat().resolvedOptions().timeZone`) above the pickers and in each picker's helper text, so the user is never guessing which zone the bare `datetime-local` input means.
- Display (Console): stored UTC is rendered back into the viewer's local zone with the zone shown; each displayed window uses `toLocaleString(undefined, { timeZoneName: "short" })` (e.g. "11/27/2026, 9:00:00 AM EST"). Two users in different zones each see the same instant labelled in their own local time, with no ambiguity about which zone is meant.
- Storage & validation (Controller): stored verbatim; parsed with `datetime.fromisoformat` (Python 3.11+, which accepts both `Z` and numeric offsets). Per-window validity checks compare timezone-aware UTC values. Overlap between windows is not rejected; it is allowed and resolved by the worker.
- Export (Exporter): passes the strings through to KV unchanged. Its only time comparison (dropping fully-elapsed overrides) parses the value and defensively treats a timezone-naive timestamp as UTC before comparing to `datetime.now(timezone.utc)`.
- Resolution (Worker): `Date.parse(publishAt)` honors the explicit `Z`/offset to get a UTC epoch, compared against `Date.now()` (also UTC epoch). Because the Console always emits an explicit offset, there is no reliance on JavaScript's local-time fallback for offset-less strings.
Because every comparison is on absolute UTC instants, daylight-saving and zone changes are a non-issue — a window that spans a DST transition still activates/deactivates at the exact instants chosen. The only place a human time zone enters is the Console's pick-time/display conversion.
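As a rough illustration of why only the instant matters, the sketch below normalises timestamps to UTC and mimics the pick-time conversion. The names `to_utc` and `local_to_utc_iso` are hypothetical, and the fixed UTC offset stands in for the browser's IANA zone handling (which also accounts for daylight saving); this is a sketch under those assumptions, not the Console or exporter code.

```python
from datetime import datetime, timedelta, timezone


def to_utc(iso: str) -> datetime:
    """Parse an ISO-8601 timestamp to an aware UTC datetime.

    A timezone-naive value is treated as UTC, mirroring the exporter's
    defensive behaviour described above.
    """
    dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def local_to_utc_iso(local_wall: str, utc_offset_hours: float) -> str:
    """Convert a picker-style local wall time to a UTC 'Z' string.

    Assumption: the zone is given as a fixed offset for simplicity.
    """
    zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.fromisoformat(local_wall).replace(tzinfo=zone)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


# 9am at UTC-5 and 2pm UTC denote the same instant, so they compare equal.
picked = local_to_utc_iso("2026-11-27T09:00", -5)
same_instant = to_utc(picked) == to_utc("2026-11-27T09:00:00-05:00")
```

Once the value is a UTC instant, every downstream comparison is a plain ordering of instants, which is why a window spanning a DST change needs no special handling.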
Low Level Design
Controller (components/controller/merchandise/): ScheduledRuleOverride (API) / ScheduledOverride (stored) models — each with an optional name (max MAX_SCHEDULED_OVERRIDE_NAME_LENGTH chars), scheduled_overrides on MerchandiseView/MerchandisingViewRecord, _build_view_rules shared mapping, and validation — per-override window validity, the per-view count cap (MAX_SCHEDULED_OVERRIDES), and ProfileMerchandisingViewRecord.validate which rejects any scheduled_overrides on a profile-scoped view (scheduling is default-profile only). Overlapping windows are not rejected — the Console warns about them and the worker resolves them.
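The per-override window rules (FR-9) might look roughly like the sketch below. `InvalidOverride` and `validate_window` are hypothetical names standing in for whatever the Controller maps to HTTP 400; treating an offset-less timestamp as UTC is an assumption here (the real validation may reject it instead), and the count cap and name-length checks are omitted.

```python
from datetime import datetime, timezone
from typing import Optional


class InvalidOverride(ValueError):
    """Stand-in for the error the Controller would turn into a 400."""


def _parse(value: Optional[str], field: str) -> Optional[datetime]:
    if value is None:
        return None
    try:
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (ValueError, AttributeError) as exc:
        raise InvalidOverride(f"{field} is not an ISO-8601 timestamp") from exc
    # Assumption: a naive timestamp is interpreted as UTC.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def validate_window(
    publish_at: Optional[str],
    unpublish_at: Optional[str],
    *,
    is_profile_view: bool = False,
) -> None:
    """Raise InvalidOverride for the rejection cases listed in FR-9."""
    if is_profile_view:
        raise InvalidOverride("scheduled overrides are default-profile only")
    if publish_at is None and unpublish_at is None:
        raise InvalidOverride("at least one of publishAt/unpublishAt is required")
    start = _parse(publish_at, "publishAt")
    end = _parse(unpublish_at, "unpublishAt")
    if start is not None and end is not None and start >= end:
        raise InvalidOverride("publishAt must be strictly before unpublishAt")
```

Note that nothing here compares one override against another: overlap between windows is deliberately left to the Console warning and the worker's resolution.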
Exporter (components/merchandising_exporter/src/):
- `dynamodb_client.py`: `get_updated_indexes(recency)` (change-based discovery by `updated_at`, restored); `get_trigger_rules` parses only the baseline; `get_scheduled_override_defs(account, index)` returns `(context, trigger, overrides)` per default view that has overrides.
- `merchandising_exporter.py`: `_build_scheduled_overrides_data` (mirrors `_build_profile_rules_data`; sorts each trigger's entries ascending by `publish_at` via `_override_publish_sort_key`) and `_resolve_override_rules` (resolves a stored override's rules into `ResolvedTriggerRules` via the baseline parse path). `export_index` passes `scheduled_overrides_data` to the KV writer.
- `models.py`: `ScheduledOverrideKVEntry` (`publish_at`, `unpublish_at`, `rules: ResolvedTriggerRules`) and `ScheduledOverridesData` (`by_trigger`, `trigger_meta`).
- `cloudflare_kv_store_client.py`: `write_to_kv_store` / `_trigger_rules_data` accept `scheduled_overrides_*` and embed `scheduled_overrides` on the trigger value + standalone emit (mirrors `profile_rules`).
- `config.py` / infra: `INDEX_UPDATE_RECENCY_MINUTES` + 5-minute EventBridge; exporter IAM role read-only on the table.
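The drop-then-sort step the exporter performs for one trigger could be sketched like this. `build_kv_overrides` and `_as_utc` are hypothetical names, the camelCase keys on the stored entries are an assumption about the serialized shape, and rule resolution and malformed-timestamp handling are left out to keep the sketch short.

```python
from datetime import datetime, timezone
from typing import Optional

# Sort floor for entries with no publish time, so they sort earliest.
_EARLIEST = datetime.min.replace(tzinfo=timezone.utc)


def _as_utc(iso: Optional[str]) -> Optional[datetime]:
    if iso is None:
        return None
    dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    # Defensive: a timezone-naive timestamp is treated as UTC.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def build_kv_overrides(stored: list, now: datetime) -> list:
    """Drop fully elapsed overrides, then sort ascending by publish time."""
    live = []
    for entry in stored:
        end = _as_utc(entry.get("unpublishAt"))
        if end is not None and end <= now:
            continue  # window already closed; it can never be active again
        live.append(entry)
    # sorted() is stable, so entries with equal publish times keep list order.
    return sorted(live, key=lambda e: _as_utc(e.get("publishAt")) or _EARLIEST)
```

An empty result corresponds to the case where a trigger's overrides have all elapsed and it contributes nothing to KV.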
Worker (cloud_data_plane/components/cloudflare-worker/src/merchandising/rule-loader.ts):
- `ScheduledOverride` type; `isOverrideActive(override, nowMs)`; `selectActiveOverride(overrides, nowMs)` (latest `publishAt`; ties → later list index; short-circuits at the first future entry, relying on the exporter's ascending `publishAt` sort).
- In `getRulesForTrigger`, before A/B bucketing: if `rulesObject.scheduled_overrides` has an active entry, set `effectiveRulesObject` to the override's `rules` (preserving `new_boost_bury_rules`/`triggerContext`/`trigger`). A/B selection then runs on top.
Console (components/console/src/): scheduledOverrides Zod type, getScheduleStatus, per-trigger status chip, and saveView mapping override pins/excludes to docId. Authoring flow:
- Create: `Publish ▸ Schedule Publish` (dropdown on the Publish button → `SchedulePublishDialog`) captures the current editor rules as a scheduled override for a chosen window (with an optional name for the campaign), reverts the baseline to the last-published rules, and publishes. So "Schedule Publish" means "publish these rules during this window" while the everyday baseline is untouched. If the chosen window overlaps an existing scheduled rule on the trigger, the dialog shows a non-blocking warning naming the conflicting rule(s) (`overlappingOverrides`), so the author knows another rule will be active concurrently; the save is still allowed.
- View / preview / remove: the `SchedulePanel` side panel lists a Baseline (default rules) card plus the trigger's overrides (status chip + the override's name when set + local-zone window + rule count). Clicking a card loads that rule set into the pending view (`loadScheduledOverride`/`viewBaselineRules`) so the product gallery previews it, and the panel highlights the card currently shown (`previewedOverrideIndex`); the Baseline card returns the gallery to the default rules. While an override is previewed, an info banner under the search/collection box (in `TriggerTab`) names the override and its window, with a "View default rules" action to exit. The highlight clears when the user makes any other edit or reloads. A delete button removes an override. (The panel no longer creates overrides; that moved to the modal.)
- The Schedule Publish action and panel are disabled/guarded for non-default profiles (`selectedProfileId` truthy), matching the Controller's rejection.
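The per-trigger status chip (FR-7) can be approximated by the sketch below, written in Python rather than the Console's TypeScript. `schedule_status` is a hypothetical name, windows are given as already-parsed epoch milliseconds, and the precedence of Active over Scheduled when a trigger has both a live and an upcoming override is an assumption that the source does not spell out.

```python
from typing import List, Optional, Tuple

# (publish_ms, unpublish_ms); None means the bound is open.
Window = Tuple[Optional[int], Optional[int]]


def schedule_status(windows: List[Window], now_ms: int) -> Optional[str]:
    """Return "Active", "Scheduled", "Expired", or None (no chip)."""
    if not windows:
        return None  # no overrides: no chip

    def is_active(w: Window) -> bool:
        publish, unpublish = w
        return (publish is None or now_ms >= publish) and (
            unpublish is None or now_ms < unpublish
        )

    def is_upcoming(w: Window) -> bool:
        publish, _ = w
        return publish is not None and now_ms < publish

    if any(is_active(w) for w in windows):
        return "Active"  # assumption: a live override outranks upcoming ones
    if any(is_upcoming(w) for w in windows):
        return "Scheduled"
    return "Expired"  # every window has closed
```

Using the same inclusive-publish / exclusive-unpublish predicate as the worker is what keeps the chip consistent with the rules actually applied.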
Dependencies
- DynamoDB (`MERCHANDISING_TABLE`): exporter reads `#static` (recency-filtered) + per-index rows. Read-only.
- AWS Lambda + EventBridge: the existing 5-minute schedule (restored).
- Cloudflare KV: the worker reads the trigger value (already fetched today); overrides ride within it, so there is no new key/lookup.
- Cloudflare worker: now a positive dependency, since it performs the resolution. Constraint check: `Date.now()`/`Date.parse` are available and already used in this worker (e.g. cache staleness), so timestamp resolution is safe.
Engineering Excellence
Consistency and Integrity
Per-request reference time. The worker computes Date.now() once per getRulesForTrigger call; a single request can't see two different active overrides.
Deterministic resolution under overlap. Overlapping windows are allowed, so more than one override can be active for a trigger at the same instant. The worker's resolution ("latest publishAt wins, ties by list order") makes this deterministic — it never throws and always yields a single rule set. selectActiveOverride (worker) and getScheduleStatus/isOverrideActive (Console) implement the same semantics, so what the UI shows matches what the worker applies. The Console warns the author at authoring time when a new window overlaps an existing one (overlappingOverrides/windowsOverlap in types.ts), but does not block the save.
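The authoring-time overlap warning reduces to a half-open interval test with optional bounds. The sketch below is a Python approximation of what `windowsOverlap`/`overlappingOverrides` are described as doing; the function names, the tuple representation, and the use of epoch milliseconds are assumptions for illustration.

```python
import math
from typing import List, Optional, Tuple

# (publish_ms, unpublish_ms); None means the bound is open.
Window = Tuple[Optional[float], Optional[float]]


def windows_overlap(a: Window, b: Window) -> bool:
    """True if two [publish, unpublish) windows share at least one instant."""
    a_start = -math.inf if a[0] is None else a[0]
    a_end = math.inf if a[1] is None else a[1]
    b_start = -math.inf if b[0] is None else b[0]
    b_end = math.inf if b[1] is None else b[1]
    # Half-open intervals: touching at a boundary is not an overlap.
    return a_start < b_end and b_start < a_end


def overlapping_overrides(candidate: Window, existing: List[Window]) -> List[int]:
    """Indexes of existing windows that the candidate would overlap."""
    return [i for i, w in enumerate(existing) if windows_overlap(candidate, w)]
```

Because the unpublish bound is exclusive, a window ending at the instant another begins produces no warning, which matches the worker never seeing both as active at once.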
Single writer to #static / KV. Only the Controller writes DynamoDB; only the exporter writes KV. The worker is read-only on KV. No write races introduced.
Stale-schedule window. The worker caches the KV value (overrides + windows) for up to its cache TTL (~5 min). Editing a schedule therefore has propagation latency (export + cache), but transitions within an already-propagated schedule are exact because the worker re-evaluates now per request against the cached value. This is the deliberate trade for not re-exporting on every boundary.
Reliability & Resilience
- No time-based export needed: a boundary is a request-time event in the worker, so a missed/late export never causes a missed transition — the schedule is already in KV.
- Per-index export failure is isolated (one async invocation per index) and retried on the next change/run.
- Boundary at request time: inclusive `publishAt` / exclusive `unpublishAt`, deterministic.
- Malformed override in KV: `selectActiveOverride` parses timestamps defensively; an entry with unparseable bounds simply won't match (treated as not-active) rather than failing the request.
Scalability
Export load returns to the pre-scheduling baseline (change-based, 5-minute) — no per-minute full sync. The worker adds a per-request scan over the trigger's overrides; because the exporter ships them ascending by publish_at, the scan short-circuits at the first not-yet-published entry, so it touches only the active/past prefix rather than the whole list. The per-view override cap keeps even the worst case tiny. KV value size grows by the embedded overrides (bounded as above). No new bottleneck.
Observability
- The worker already logs the resolved rule set per request; an active override surfaces there (the applied `pin_rules`/etc. are the override's). Consider adding a debug line naming the selected override window for traceability.
- Exporter logs the indexes it re-exports; embedding is covered by existing per-index export result logs.
Security
- No posture change. Same authenticated Controller endpoints/permissions; exporter IAM read-only on the table; worker read-only on KV. Schedule timestamps are non-sensitive.
Testing
Following our testing approach: unit tests close to the logic; control time with freezegun (Python) and vi.useFakeTimers/vi.setSystemTime (worker).
Use-cases covered (implemented):
- Controller: override DDB/JSON round-trip; omission when none; window validation (publish≥unpublish, equal, non-ISO, neither-bound); overlapping windows accepted (overlapping and open-ended-overlap validate without raising); profile-view rejection (`ProfileMerchandisingViewRecord.validate` raises with overrides, passes without; `upsert_view` with `profileId` + overrides → 400); `upsert_view` persists overrides with rules distinct from baseline; invalid window → 400.
- Exporter: `get_updated_indexes` recency filter; `get_trigger_rules` parses baseline only; `get_scheduled_override_defs` returns overrides per default trigger (empty when none); `_build_scheduled_overrides_data` resolves override rules, drops overrides finished in the past (keeps future/active/open-ended; drops a trigger whose overrides have all elapsed), sorts the exported entries ascending by `publish_at` (absent sorts earliest; stable for equal keys), and marks standalone meta only when there's no default trigger entry; `_trigger_rules_data` embeds `scheduled_overrides` on the default entry and emits a standalone entry for override-only triggers; `write_to_kv_store` passes the data through.
- Worker: `selectActiveOverride` matrix (none active; window contains now; expired; latest-`publishAt` wins; tie → later list index; publishAt inclusive / unpublishAt exclusive; overlapping bounded windows; short-circuits at the first future entry given the ascending sort); `getRulesForTrigger` applies the active override's rules (full replacement of baseline), falls back to baseline when no override is active, and uses the baseline when there are no overrides.
- Console: `getScheduleStatus` matrix; schema round-trip; `setViewScheduledOverrides` reducer.
End-to-end verification
Seed a trigger (baseline R0) with an override R1 for a window; via the worker (fake clock or real time around the boundary) confirm searches apply R0 → R1 → R0 across the window, and that an unscheduled trigger is unaffected.
Key Risks
- Risk: stale-schedule propagation. A just-edited schedule isn't visible until export + worker-cache propagate (~up to 10 min). Mitigation: documented; acceptable since campaigns are planned ahead. Transitions within a live schedule are exact.
- Risk: profile-view overrides. Scheduled overrides on profile-scoped views are not supported. Mitigation (enforced): the Controller rejects them (`ProfileMerchandisingViewRecord.validate` → 400) and the Console hides the Schedule panel for non-default profiles, so they can't be authored or persisted. Adding profile-scoped resolution remains a possible follow-up if the use case arises.
- Risk: A/B vs. override precedence surprise. When both are active, the A/B variant wins. Mitigation: documented; deterministic; revisit if a customer needs the opposite.
- Risk: clock/timezone confusion. Mitigation: timestamps are absolute UTC instants end to end, with the only local↔UTC conversion at the Console's pick/display boundary — see Timestamps and time zones; inclusive/exclusive bounds are tested.
Cost Analysis
Negligible. Export uses a change-based 5-minute cadence (no full sync), so Lambda/KV/SecretsManager volume matches the pre-scheduling baseline. The worker adds a tiny per-request CPU cost (bounded override scan); no extra KV reads. KV storage grows by the embedded overrides (bounded). No new infrastructure (NFR-3); no performance test required.
Release / Roll-out
Small effort; reuses existing infra.
- Phasing: Controller + exporter + worker can ship together (overrides are inert until authored); the Console enables authoring. No flag-day — a view with no overrides behaves exactly as today.
- Compatibility: `scheduledOverrides` (storage) and `scheduled_overrides` (KV) are optional; older workers ignore the KV field and apply the baseline; older records lack it and resolve to baseline.
- Rollback: revert worker, exporter, Controller, Console. A worker without the resolver simply applies the baseline (ignoring the embedded list); persisted overrides become inert. No data cleanup required.
Impact on other components
- Worker (`cloud_data_plane`): new `selectActiveOverride` + resolution in `getRulesForTrigger`; reads the new KV field. The hot path now does a small per-request resolution.
- Exporter: embeds `scheduled_overrides` (new build + KV-embed); change-based 5-minute export restored; read-only.
- Controller / Console: the model/validation and authoring UI (no change required by the worker-side resolution).
- Infrastructure (CDK): EventBridge back to 5 minutes; `INDEX_UPDATE_RECENCY_MINUTES` restored.
- Analytics/reporting: KV values now contain a `scheduled_overrides` list (additive, ignorable).
Alternative Solutions Considered
A. Embed all overrides; resolve in the worker at request time (chosen)
Exporter embeds the full schedule under the trigger KV entry; the worker picks the active override per request. Reuses the A/B-variant runtime-selection pattern; exact activation timing; change-based export; exporter/worker stay read-only. Chosen.
B. Resolve at export time
The exporter writes only the currently-active rule set to KV and re-exports frequently so the active set stays fresh. Rejected: to keep timing tight it forces frequent full re-exports of every index (high steady-state Lambda/KV cost), and activation is bounded by the export cadence rather than exact. A puts the (cheap) selection where the request already is.
C. Dedicated scheduler (EventBridge one-time schedules / Step Functions)
Fire exactly at each boundary to flip state. Rejected: adds AWS resources and per-edit schedule lifecycle management; worker-side resolution already gives exact timing with no new infra.
| Criteria | A (chosen) | B (export-time) | C (scheduler) |
|---|---|---|---|
| Activation timing | Exact (per request) | ~export cadence | Exact |
| Steady-state load | Low (change-based) | High (frequent full sync) | Low |
| Search-time cost | Small per-request scan | Zero | Zero |
| New infra | None | None | EventBridge/Step Functions |
FAQs
Q: I have live rules and want to publish a different set for a window. What happens to my current rules?
A: They stay live as the baseline. Add an override with the campaign rules for the window; before/after the window the baseline applies, during it the override wins. No go-dark, no overwrite.
Q: How quickly does an override go live?
A: Exactly at its publish time, provided the schedule has already propagated to the worker. Editing a schedule has propagation latency (export ~5 min + worker cache ~5 min) before the worker sees the change; after that, transitions are instant.
Q: Does this slow down customer search?
A: It adds a tiny per-request resolution (scan the trigger's overrides, parse two timestamps per entry) with no extra KV lookup. Negligible.
Q: Can two overrides on the same trigger overlap?
A: Yes, overlapping windows are allowed. When more than one override is active at the same instant the worker resolves deterministically to the later publishAt (ties by list order). The Console shows a non-blocking warning when an author schedules a window that overlaps an existing one, so the overlap is intentional rather than accidental.
Q: A scheduled override and an A/B test both apply. Which wins?
A: The override sets the base; the A/B variant runs on top and wins for bucketed users.
Q: Can a user schedule an override in the past?
A: A past publishAt means the override is already live (as long as its window is still open); a past unpublishAt means it has already expired and never wins. publish >= unpublish and neither-bound are rejected (FR-9).
References
- A/B Test Runtime Design Doc: `docs/ab-test-design.md` (the runtime variant-selection pattern this mirrors).
- Cloud Control Plane Development Guide: `CLAUDE.md`.
- Implementation: `components/controller/merchandise/` (Controller); `components/merchandising_exporter/` (exporter); `cloud_data_plane/components/cloudflare-worker/src/merchandising/rule-loader.ts` (worker); `components/console/src/...` (Console).
- Marqo Design Template (Notion).
Appendix
Resolution (authoritative — implemented in the worker)
    window_contains(o, now) = (o.publishAt is None or now >= o.publishAt)      # inclusive
                          and (o.unpublishAt is None or now < o.unpublishAt)   # exclusive

    selectActiveOverride(overrides, now):
        active = [o for o in overrides if window_contains(o, now)]
        if active:
            return max(active, key=(publishAt or -inf, list_index))   # latest publishAt; ties → later
        return null                                                   # → caller applies the baseline rules
The worker's selectActiveOverride/isOverrideActive and the Console's getScheduleStatus/isOverrideActive implement the same semantics.
Why resolve in the worker rather than the exporter
Export-time resolution (alternative B) means the KV only ever holds the current slice of the schedule, so keeping activation tight requires frequent re-exports of every index. Worker-side resolution stores the whole schedule once and evaluates the current time where the request already is — exact timing, change-based export, and it reuses the worker's existing A/B-variant selection path.