Index Aliasing

Aliases route document writes, search reads, or analytics to different indexes without changing customer-facing URLs or API keys.

Alias Types

| Type | Field | Purpose | Targets |
| --- | --- | --- | --- |
| Write | index_aliases.doc_writes | Fan out document writes to multiple indexes | Multiple, each with internal flag |
| Read | index_aliases.doc_reads | Route search/collection requests to a different index | Multiple, weighted (sum to 1.0) |
| Analytics | index_aliases.analytics | Route analytics to an alternate index | Single target |

All three live under index_aliases on the IndexSettings DDB record.

Functional Requirements

  • Customers can perform all operations against a stable index ID (i.e. system account and index name), even as we change the real index (i.e. different names) behind the scenes.
  • Index settings records that alias to other indexes must be protected from deletion as much as any real index, whether or not they correspond to a real index with the same name.

Per feature requirements:

  • Doc reads
    • If there is a single read alias, all retrieval requests are served from the target of that read alias.
    • If there are multiple read aliases, one is selected per retrieval request and the response is served by the target index of that alias.
      • Each alias entry is either enabled or disabled. Only enabled aliases are considered.
      • Each alias entry has a weight. The likelihood an alias is selected is its weight relative to the sum of all enabled alias weights.
      • Aliases may not have negative weights.
      • For any fixed set of aliases and weights, the same user should be served by the same target on subsequent requests.
    • If there are no enabled read aliases, the default behaviour is to simply read from the underlying index with the same name. If there is no such index, the read is an error (404).
  • Doc writes
    • Each write to an index with a doc_writes alias is replicated to each of the doc_writes target indexes. A separate job is created for each.
    • The write response contains a single job ID, regardless of how many jobs were created. That ID belongs to the job targeting the index with the highest read-traffic weight.
    • The customer can always fetch that response job's details by ID so long as the job exists, even if the backing index has changed.
    • The customer does not see jobs in their jobs lists (Console and API) that were not returned for a write request (marked internal).
    • Jobs created by forked writes should reference each other so that different outcomes can be reconciled more easily.
  • Metrics
    • Ecom API-level metrics (e.g. request counts, latencies, error rates) are associated with BOTH:
      • the index ID the user requested, for reporting to customers
      • the index ID the request was ultimately served by, for monitoring real index health
    • Ecom indexer metrics (job and doc counts and latencies) are labelled with the index ID that was written to
      • Fanned-out write jobs are labelled internal to filter out for customer reporting
    • Actual index (i.e. DP) metrics are associated with the actual index.
  • Merchandising
    • Each write to a merchandising rule is replicated to each of the index's write aliases
      • This could be tricky, since merchandising is a level below ecom, where aliases live
      • Would be easier if merchandising writes went through search_proxy (proxied to controller), since search_proxy has alias config and could fan out writes to controller
    • OR merchandising rules are resolved via alias at the ecom layer and passed through to global worker (better but more refactoring since merch in global worker has been somewhat optimised already)
    • OR as a short-term hack, just automate syncing merchandising rules from one index to another

Data Model

IndexAliases
├── doc_writes: DocWritesAlias
│   ├── enabled: bool
│   └── targets: [WriteAliasTarget]
│       ├── index_id: "{system_account_id}-{index_name}"
│       └── internal: bool
├── doc_reads: DocReadsAlias
│   ├── enabled: bool
│   └── targets: [DocReadsAliasTarget]
│       ├── index_id: str
│       ├── weight: float (0.0–1.0)
│       └── internal: bool
└── analytics: AnalyticsAlias
    ├── enabled: bool
    ├── index_id: str | null
    └── internal: bool

Source: components/ecom_utils/ecom_utils/index_settings_service/index_settings_model.py
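For orientation, the tree above can be sketched as plain Python dataclasses. This is an illustrative approximation only: the real model in index_settings_model.py may use a different base class, defaults, and validation. The default values and the weight range check here are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WriteAliasTarget:
    index_id: str  # "{system_account_id}-{index_name}"
    internal: bool = False  # default is an assumption


@dataclass
class DocWritesAlias:
    enabled: bool = False
    targets: List[WriteAliasTarget] = field(default_factory=list)


@dataclass
class DocReadsAliasTarget:
    index_id: str
    weight: float = 0.0
    internal: bool = False

    def __post_init__(self) -> None:
        # The data model documents weight as a float in 0.0-1.0.
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError(f"weight must be in [0.0, 1.0], got {self.weight}")


@dataclass
class DocReadsAlias:
    enabled: bool = False
    targets: List[DocReadsAliasTarget] = field(default_factory=list)


@dataclass
class AnalyticsAlias:
    enabled: bool = False
    index_id: Optional[str] = None  # single target, may be unset
    internal: bool = False


@dataclass
class IndexAliases:
    doc_writes: DocWritesAlias = field(default_factory=DocWritesAlias)
    doc_reads: DocReadsAlias = field(default_factory=DocReadsAlias)
    analytics: AnalyticsAlias = field(default_factory=AnalyticsAlias)
```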

Legacy Fields

Two deprecated fields still work and take precedence when present:

| Legacy Field | New Equivalent | Precedence |
| --- | --- | --- |
| add_docs_config.index_write_aliases | index_aliases.doc_writes | Legacy wins |
| read_alias | index_aliases.doc_reads | Legacy wins |

No legacy equivalent exists for analytics.
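The "legacy wins" rule can be sketched as a lookup that checks the deprecated field first. This is a minimal sketch, assuming the settings record is available as a plain dict and that "present" means the key exists with a non-null value. The function names are hypothetical, and the values are returned as stored because the shape of the legacy fields is not described here.

```python
from typing import Any, Dict, Optional


def effective_doc_writes(settings: Dict[str, Any]) -> Optional[Any]:
    """Return the write alias config, preferring the legacy field when present."""
    legacy = (settings.get("add_docs_config") or {}).get("index_write_aliases")
    if legacy is not None:
        return legacy  # legacy wins
    return (settings.get("index_aliases") or {}).get("doc_writes")


def effective_doc_reads(settings: Dict[str, Any]) -> Optional[Any]:
    """Return the read alias config, preferring the legacy field when present."""
    legacy = settings.get("read_alias")
    if legacy is not None:
        return legacy  # legacy wins
    return (settings.get("index_aliases") or {}).get("doc_reads")
```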

Propagation Pipeline

DDB (IndexSettings)
→ DDB Stream (10s batch window)
→ Settings Exporter Lambda
→ Cloudflare KV
→ Search Proxy (reads on next request)

Total propagation: typically <60s, worst case ~150s.

  • Write aliases are resolved by the sync service at write time (not via KV).
  • Read aliases propagate through KV to the search proxy.
  • Analytics aliases propagate through KV to the search proxy.
  • The exporter omits doc_writes from KV (not needed by search proxy).

Key Behaviors

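Enable/disable toggle: Setting enabled=false stops routing without losing the target configuration, and re-enabling restores it. Write aliases are resolved at write time, so they do not wait on KV propagation; read and analytics aliases change once the update reaches the search proxy (typically <60s). This is the primary rollback mechanism.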

Deterministic read routing: The search proxy uses FNV-1a hash bucketing on userId or sessionId to select a weighted target. For a fixed set of targets and weights, the same user always hits the same target, so a user's results stay stable between weight changes during a gradual traffic shift.
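A minimal sketch of hash-bucketed weighted selection follows. The 32-bit FNV-1a function uses the standard offset basis and prime; how the search proxy maps the hash onto weights is not specified here, so the cumulative-weight bucketing, the function names, and the tuple shape are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def select_target(routing_key: str, targets: List[Tuple[str, float]]) -> Optional[str]:
    """Pick one (index_id, weight) target for a userId or sessionId.

    `targets` should contain only enabled targets. Returns None when no
    target has a positive weight, so the caller can fall back to the
    underlying index.
    """
    candidates = [(index_id, w) for index_id, w in targets if w > 0]
    total = sum(w for _, w in candidates)
    if total <= 0:
        return None
    # Map the hash to a point in [0, total) and walk the cumulative weights.
    point = fnv1a_32(routing_key.encode("utf-8")) / 2**32 * total
    cumulative = 0.0
    for index_id, weight in candidates:
        cumulative += weight
        if point < cumulative:
            return index_id
    return candidates[-1][0]  # guard against floating-point rounding
```

Because the selection depends only on the routing key and the target list, repeated calls with the same inputs return the same target. Changing the weights moves some keys to a different bucket.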

Internal flag: Targets with internal=true are only considered when the x-marqo-internal header is present. Used to hide alias-created jobs from customer views.
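The internal-flag rule amounts to a filter applied before target selection. The sketch below assumes headers arrive as a simple name-to-value mapping and that only the presence of x-marqo-internal matters, not its value. The Target type and function name are hypothetical.

```python
from typing import List, Mapping, NamedTuple


class Target(NamedTuple):
    index_id: str
    internal: bool = False


def eligible_targets(targets: List[Target], headers: Mapping[str, str]) -> List[Target]:
    """Drop internal targets unless the x-marqo-internal header is present."""
    # Header names are case-insensitive, so compare in lower case.
    has_internal_header = any(name.lower() == "x-marqo-internal" for name in headers)
    return [t for t in targets if has_internal_header or not t.internal]
```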

Write alias auto-include: When adding the first write target via the admin API, the source index is automatically added as a target. Forgetting this is a common mistake in manual DDB edits.
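The auto-include behaviour can be sketched as follows. This is not the admin lambda's code: the function name, the dict shape, and the choice to add the source index with internal=False are assumptions for illustration.

```python
from typing import Dict, List, Union

TargetDict = Dict[str, Union[str, bool]]


def add_write_target(
    targets: List[TargetDict],
    source_index_id: str,
    new_index_id: str,
    internal: bool = False,
) -> List[TargetDict]:
    """Return a new target list with `new_index_id` added.

    When the list is empty (first write target), the source index is added
    first so that it keeps receiving writes.
    """
    updated = [dict(t) for t in targets]
    if not updated:
        # Assumption: the source index is customer-visible, so internal=False.
        updated.append({"index_id": source_index_id, "internal": False})
    if all(t["index_id"] != new_index_id for t in updated):
        updated.append({"index_id": new_index_id, "internal": internal})
    return updated
```

A manual DDB edit that adds only the new index skips the first branch, which is how the source index ends up missing from the fan-out.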

Admin API

Alias CRUD is managed through the admin lambda, not the ecom API.

Base path: /api/v1/admin/accounts/{system_account_id}/indexes/{index_name}/aliases/

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /write-targets | Add write target |
| DELETE | /write-targets | Remove write target |
| PATCH | /write-enabled | Toggle write aliasing |
| POST | /read-targets | Add read target (with weight) |
| DELETE | /read-targets | Remove read target |
| PATCH | /read-enabled | Toggle read aliasing |
| PUT | /analytics-target | Set analytics target |
| DELETE | /analytics-target | Clear analytics target |
| PATCH | /analytics-enabled | Toggle analytics aliasing |

Source: components/admin_lambda/admin_lambda/routes/alias_routes.py

Validation Rules

  • Target index must exist
  • Target must be owned by the same organization (email domain match) or by Marqo
  • No transitive aliases (A→B→C rejected)
  • Analytics target must be in the same account (API key scoping)

Source: components/admin_lambda/admin_lambda/services/alias_validation_service.py
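The first three rules can be sketched against an in-memory view of index settings. Everything here is hypothetical (the IndexInfo shape, the function name, the operator_domains parameter standing in for Marqo-owned indexes) and is not taken from alias_validation_service.py. The sketch treats a target that lists only itself as non-transitive, and it omits the analytics account-scoping rule.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class IndexInfo:
    owner_email_domain: str
    alias_targets: List[str] = field(default_factory=list)


def validate_alias_target(
    source_id: str,
    target_id: str,
    indexes: Dict[str, IndexInfo],
    operator_domains: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return a list of validation errors (empty when the target is acceptable)."""
    target = indexes.get(target_id)
    if target is None:
        return [f"target index {target_id!r} does not exist"]

    errors: List[str] = []
    source = indexes[source_id]
    same_org = target.owner_email_domain == source.owner_email_domain
    if not same_org and target.owner_email_domain not in operator_domains:
        errors.append("target is owned by a different organization")
    # Reject A -> B when B itself aliases to some other index C.
    if any(t != target_id for t in target.alias_targets):
        errors.append("transitive aliases are not allowed")
    return errors
```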

Common Operations

Set up write fan-out (preferred method): Use the admin API POST /write-targets to add each target. The source index is auto-included on the first call. See also Forking Writes for the legacy DDB-edit approach.

Cut over reads to a new index:

  1. Add write target (new index), so that both indexes receive writes
  2. Backfill the new index if needed
  3. Add read target (new index) with weight 0.0, then gradually increase
  4. When weight=1.0, remove the old read target

Emergency rollback: PATCH /read-enabled with {"enabled": false}. The change typically takes effect within ~60s (worst case ~150s, per the propagation pipeline).
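As a sketch, the rollback request can be built with the standard library. The host below is a placeholder, and authentication is omitted because it is not described here; only the base path, the /read-enabled endpoint, the PATCH method, and the {"enabled": false} body come from this page. The function name is hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

ADMIN_BASE_URL = "https://admin.example.invalid"  # placeholder host, not a real endpoint


def build_read_enabled_request(
    system_account_id: str, index_name: str, enabled: bool
) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that toggles read aliasing."""
    path = (
        f"/api/v1/admin/accounts/{quote(system_account_id, safe='')}"
        f"/indexes/{quote(index_name, safe='')}/aliases/read-enabled"
    )
    return urllib.request.Request(
        ADMIN_BASE_URL + path,
        data=json.dumps({"enabled": enabled}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


# Emergency rollback: disable read aliasing for one index.
request = build_read_enabled_request("acct123", "products", enabled=False)
# Sending it would be: urllib.request.urlopen(request), once auth headers are added.
```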