DynamoDB Access-Patterns Cheat Sheet
Read the access pattern before hand-rolling a query. Never scan a prod
table to find one item — every load-bearing lookup below already has a key or
GSI path, and the repository code is the authoritative source for the key
format. Querying-by-guess is how a recent revco resync burned four wrong
aws dynamodb invocations and one full-table scan before someone read
get_shop_job_by_id and found the GSI_JobLookup_v2 path that had existed all
along.
This page covers only the load-bearing tables (the long tail goes stale and nobody trusts it). For the generic CLI primitives (list-tables, describe-table, table naming by env) see resources/dynamodb.md. For the ecommerce data-flow context see components/ecommerce.md.
All snippets are read-only (query / get-item, never put/update/
delete) and use --profile controller --region us-east-1 against prod. Swap
the prod- prefix for staging- / dev-<branch>- per
resources/dynamodb.md.
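The prefix swap can be sketched as a small helper. This is a minimal illustration only: `table_name` is a hypothetical name, and the authoritative naming rules (including how branch names are normalized) live in resources/dynamodb.md, not here.

```python
from typing import Optional


def table_name(base: str, env: str = "prod", branch: Optional[str] = None) -> str:
    """Return the env-prefixed table name, e.g. prod-EcomIndexerJobsTable.

    Assumption: dev tables are named dev-<branch>-<base> with the branch
    used verbatim; check resources/dynamodb.md for the real rules.
    """
    if env in ("prod", "staging"):
        return f"{env}-{base}"
    if env == "dev":
        if not branch:
            raise ValueError("dev tables need a branch name")
        return f"dev-{branch}-{base}"
    raise ValueError(f"unknown env: {env!r}")


print(table_name("EcomIndexerJobsTable"))
print(table_name("EcomIndexerJobsTable", env="dev", branch="my-branch"))
```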
Deserializing output without boto3
The controller box's system python3 has no boto3, so you cannot rely on
TypeDeserializer. Either run through pants, or pipe the raw DynamoDB-JSON
through a small hand-rolled deserializer that unwraps the single-key type
descriptors ({"S": ...}, {"N": ...}, {"M": ...}, {"L": ...}):
aws dynamodb query ... --profile controller --region us-east-1 --output json \
  | python3 -c '
import sys, json
def und(v):
    (t, x), = v.items()
    if t == "S": return x
    if t == "N": return float(x) if ("." in x or "e" in x.lower()) else int(x)
    if t == "BOOL": return x
    if t == "NULL": return None
    if t == "M": return {k: und(w) for k, w in x.items()}
    if t == "L": return [und(w) for w in x]
    if t in ("SS", "NS"): return x
    return x
data = json.load(sys.stdin)
for it in data.get("Items", []):
    print(json.dumps({k: und(v) for k, v in it.items()}))
'
The same und() helper works on a get-item response by iterating over
data["Item"].items() instead of data["Items"].
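As a sketch of the get-item variant, the same unwrapping logic can be wrapped in a function that takes the parsed response. `unwrap_get_item` is a hypothetical name used for illustration. A get-item response for a missing key has no "Item" entry, so the sketch returns None in that case instead of raising.

```python
import json


def und(v):
    """Unwrap one DynamoDB-JSON value, e.g. {"S": "x"} -> "x"."""
    (t, x), = v.items()
    if t == "N":
        return float(x) if ("." in x or "e" in x.lower()) else int(x)
    if t == "NULL":
        return None
    if t == "M":
        return {k: und(w) for k, w in x.items()}
    if t == "L":
        return [und(w) for w in x]
    return x  # S, BOOL, SS, NS, B, BS pass through unchanged


def unwrap_get_item(response):
    """Return the item as a plain dict, or None when the key was not found."""
    item = response.get("Item")
    if item is None:
        return None
    return {k: und(v) for k, v in item.items()}


# Illustrative response shape only; the attribute names are made up.
sample = json.loads(
    '{"Item": {"pk": {"S": "kl7a9h55"}, "n": {"N": "3"},'
    ' "cfg": {"M": {"on": {"BOOL": true}}}}}'
)
print(unwrap_get_item(sample))
```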
1. Indexer / sync jobs — prod-EcomIndexerJobsTable
Tracks every bulk, webhook, and incremental sync job.
Source of truth:
sync_job_repository.py
(GSI names sync_job_repository.py:23-24, pk/sk construction
sync_job_repository.py:104-117),
sync_job_core_model.py
(schema sync_job_core_model.py:35-51), and the table/GSI definition in
ecom_stack.py
(setup_indexer_jobs_table, ecom_stack.py:341-406).
Key schema
| Key | Format | Notes |
|---|---|---|
| pk (S) | PLATFORM#{platform}#SHOP#{shop_id} | platform is the enum value (shopify, ecom, …). |
| sk (S) | JOB#{created_at}#{job_id} | created_at is an ISO-8601 timestamp; leading component, so a time window maps onto a sort-key range. |
| TTL | ttl | Unix ts, 30-day auto-cleanup. |
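Because created_at leads the sort key, a time window can be expressed as two sk bounds for a `pk = :pk AND sk BETWEEN :low AND :high` key condition. The sketch below is a minimal illustration: `job_sk_window` is a hypothetical helper, and it assumes every created_at uses the same ISO-8601 format and time zone, so that string order matches time order.

```python
def job_sk_window(start_iso: str, end_iso: str):
    """Return (low, high) sort-key bounds for jobs created in [start, end).

    BETWEEN is inclusive, but a real sk always continues past the timestamp
    prefix, so any job created at or after end_iso sorts above the high bound.
    """
    if start_iso >= end_iso:
        raise ValueError("start must sort before end")
    return f"JOB#{start_iso}", f"JOB#{end_iso}"


low, high = job_sk_window("2024-05-01", "2024-05-02")
print(low, high)
```

Date-only prefixes work because a prefix sorts before every longer string that starts with it.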
⚠️ The pk trap.
shop_id is the composite index name {system_account_id}-{index_name} (e.g. kl7a9h55-shopify-e513a8-4), not the shop domain e513a8-4.myshopify.com, even though the admin console shows the shop domain. See the shop_id field description, sync_job_core_model.py:49-51. If your pk-keyed query returns nothing, check that you used the index name, not the domain.
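One way to avoid the trap is to build the pk from its parts and reject anything that looks like a shop domain. This is a sketch under assumptions: `jobs_shop_id` and `jobs_pk` are hypothetical helpers, and the `.myshopify.com` check is a heuristic of this example, not something the repository code is known to do.

```python
def jobs_shop_id(system_account_id: str, index_name: str) -> str:
    """Compose the shop_id used by the jobs table: {system_account_id}-{index_name}."""
    return f"{system_account_id}-{index_name}"


def jobs_pk(platform: str, shop_id: str) -> str:
    """Build the jobs-table pk, refusing an obvious shop domain."""
    if shop_id.endswith(".myshopify.com"):
        raise ValueError(
            f"{shop_id!r} looks like a shop domain; the jobs table keys on "
            "the composite index name instead"
        )
    return f"PLATFORM#{platform}#SHOP#{shop_id}"


print(jobs_pk("shopify", jobs_shop_id("kl7a9h55", "shopify-e513a8-4")))
```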
GSIs
| GSI | pk | sk | Projection | Use |
|---|---|---|---|---|
| GSI_JobLookup_v2 | job_id (bare) | created_at | KEYS_ONLY | "I have a job id, find the row." Returns only pk/sk/job_id/created_at — you must then get-item the base row. |
| GSI_JobsByStatus_v2 | shop_id | status_created_at ({status}#{created_at}) | INCLUDE (listing attrs) | "Find active/pending jobs for a shop." |
GSI_JobsByStatus_v2 partitions on shop_id alone, so a query for a shop that
exists on multiple platforms returns jobs from all of them. Add a
platform = :platform filter when that matters (see list_shop_jobs_by_status,
sync_job_repository.py:890-1049).
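The status query with an optional platform filter can be sketched as a parameter dict using the Query API's own field names. `status_query_params` is a hypothetical helper. The sketch assumes the platform attribute is among the attributes this GSI projects, since a filter can only see projected attributes. The `#plat` name placeholder is a defensive choice here rather than a known requirement.

```python
import json


def status_query_params(shop_id, status, platform=None,
                        table="prod-EcomIndexerJobsTable"):
    """Build Query parameters for GSI_JobsByStatus_v2, newest first."""
    params = {
        "TableName": table,
        "IndexName": "GSI_JobsByStatus_v2",
        "KeyConditionExpression":
            "shop_id = :sid AND begins_with(status_created_at, :sp)",
        "ExpressionAttributeValues": {
            ":sid": {"S": shop_id},
            ":sp": {"S": f"{status}#"},
        },
        "ScanIndexForward": False,
    }
    if platform is not None:
        # The filter runs after the key condition has selected the rows.
        params["FilterExpression"] = "#plat = :platform"
        params["ExpressionAttributeNames"] = {"#plat": "platform"}
        params["ExpressionAttributeValues"][":platform"] = {"S": platform}
    return params


print(json.dumps(
    status_query_params("kl7a9h55-shopify-e513a8-4", "IN_PROGRESS", "shopify"),
    indent=2,
))
```

The printed JSON can be saved to a file and passed to aws dynamodb query with --cli-input-json file://params.json, alongside the usual --profile and --region flags.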
How to look up X
- I have a job id → the row: query GSI_JobLookup_v2 on job_id, then get-item the base row by the returned pk/sk. (Mirrors get_shop_job_by_id, sync_job_repository.py:831-888.)
- Active / pending jobs for a shop: query GSI_JobsByStatus_v2 on shop_id + begins_with(status_created_at, "PENDING#") (or "IN_PROGRESS#").
- Latest N jobs for an index: query the base table on the pk with ScanIndexForward=false, Limit=N.
Canonical CLI
Lookup by job id (two steps — the GSI is KEYS_ONLY):
# 1) Find the base-table key for the job id.
aws dynamodb query \
--table-name prod-EcomIndexerJobsTable \
--index-name GSI_JobLookup_v2 \
--key-condition-expression "job_id = :jid" \
--expression-attribute-values '{":jid": {"S": "<job-id>"}}' \
--profile controller --region us-east-1
# 2) Hydrate the full row using the pk/sk from step 1.
aws dynamodb get-item \
--table-name prod-EcomIndexerJobsTable \
--key '{"pk": {"S": "PLATFORM#shopify#SHOP#kl7a9h55-shopify-e513a8-4"}, "sk": {"S": "JOB#<created_at>#<job-id>"}}' \
--profile controller --region us-east-1
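Step 2 needs the pk/sk that step 1 returned. A minimal sketch of that hand-off, assuming the step 1 output was captured with --output json: `base_key_from_lookup` is a hypothetical helper that turns the parsed response into the string to pass as --key. The values are already in DynamoDB-JSON form, so they are passed through untouched.

```python
import json


def base_key_from_lookup(response: dict) -> str:
    """Return the get-item --key JSON for the single row found in step 1."""
    items = response.get("Items", [])
    if len(items) != 1:
        raise LookupError(f"expected exactly one match, got {len(items)}")
    item = items[0]
    # Keep only the base-table key attributes, still in DynamoDB-JSON form.
    return json.dumps({"pk": item["pk"], "sk": item["sk"]})


# Illustrative step 1 response; the timestamp and job id are made up.
step1 = {
    "Items": [{
        "pk": {"S": "PLATFORM#shopify#SHOP#kl7a9h55-shopify-e513a8-4"},
        "sk": {"S": "JOB#2024-05-01T09:30:00+00:00#job-123"},
        "job_id": {"S": "job-123"},
        "created_at": {"S": "2024-05-01T09:30:00+00:00"},
    }],
    "Count": 1,
}
print(base_key_from_lookup(step1))
```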
List by status for a shop (newest first):
aws dynamodb query \
--table-name prod-EcomIndexerJobsTable \
--index-name GSI_JobsByStatus_v2 \
--key-condition-expression "shop_id = :sid AND begins_with(status_created_at, :sp)" \
--expression-attribute-values '{":sid": {"S": "kl7a9h55-shopify-e513a8-4"}, ":sp": {"S": "IN_PROGRESS#"}}' \
--no-scan-index-forward \
--profile controller --region us-east-1
Latest N jobs for an index (base table, newest first):
aws dynamodb query \
--table-name prod-EcomIndexerJobsTable \
--key-condition-expression "pk = :pk AND begins_with(sk, :skp)" \
--expression-attribute-values '{":pk": {"S": "PLATFORM#shopify#SHOP#kl7a9h55-shopify-e513a8-4"}, ":skp": {"S": "JOB#"}}' \
--no-scan-index-forward --max-items 10 \
--profile controller --region us-east-1
2. Index settings — prod-EcomIndexSettingsTable
One unified config record per index (create_index / search / collections /
add_docs settings), plus sub-records (saved queries, search profiles, agentic
config) that share the INDEX# prefix.
Source of truth:
index_settings_repository.py
(pk/sk + sub-record SK shapes, index_settings_repository.py:39-50,
106-114, 163-164, 509-534; DEFAULT_CONFIGS key
index_settings_repository.py:47-50),
index_settings_model.py
(schema index_settings_model.py:928-945), and
ecom_stack.py (setup_index_settings_table,
ecom_stack.py:408-460).
Key schema
| Key | Format | Example |
|---|---|---|
| pk (S) | {system_account_id} | kl7a9h55 |
| sk (S) | INDEX#{index_name} (root record — exactly one #) | INDEX#shopify-e513a8-4 |
| sk (S) | INDEX#DEFAULT_CONFIGS | account-level default config record |
| sk (S) | INDEX#{index_name}#PROFILE#SEARCH#{profile} | INDEX#products#PROFILE#SEARCH#default |
Root records carry exactly one # in the SK; every sub-record shape
(#QUERY#…, #PROFILE#SEARCH#…, #AGENTIC_CONFIG, …) appends further #
segments. See _exclude_non_index_settings_items,
index_settings_repository.py:509-534.
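The one-# rule can be sketched as a small classifier for sort keys returned by a query. `classify_settings_sk` is a hypothetical helper and is not the logic of _exclude_non_index_settings_items. Treating INDEX#DEFAULT_CONFIGS as its own category is an assumption of this example, since that key also contains exactly one #.

```python
DEFAULTS_SK = "INDEX#DEFAULT_CONFIGS"


def classify_settings_sk(sk: str) -> str:
    """Label a settings-table sort key as defaults, root, sub-record, or other."""
    if not sk.startswith("INDEX#"):
        return "other"
    if sk == DEFAULTS_SK:
        return "defaults"
    # Root records have exactly one '#'; sub-records append further segments.
    return "root" if sk.count("#") == 1 else "sub-record"


for sk in (
    "INDEX#shopify-e513a8-4",
    "INDEX#DEFAULT_CONFIGS",
    "INDEX#products#PROFILE#SEARCH#default",
):
    print(sk, "->", classify_settings_sk(sk))
```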
GSIs
| GSI | pk | sk | Use |
|---|---|---|---|
| GSI_IndexRootsByName | pk | index_name | Sparse — only root records have index_name. List index roots without scanning sub-records. |
| GSI_ShopifyDomain | shopify_domain | sk | Resolve a Shopify domain → index settings (webhook routing). Sparse: only rows with shopify_domain set. |
How to look up X
- An index's full config: get-item by pk = system_account_id, sk = INDEX#{index_name}. (Mirrors get_index_config, index_settings_repository.py:150-173.)
- Just the add_docs_config: same get-item, with a projection on add_docs_config.
- The account-level defaults: get-item with sk = INDEX#DEFAULT_CONFIGS.
Canonical CLI
Get an index's add_docs_config:
aws dynamodb get-item \
--table-name prod-EcomIndexSettingsTable \
--key '{"pk": {"S": "kl7a9h55"}, "sk": {"S": "INDEX#shopify-e513a8-4"}}' \
--projection-expression "add_docs_config" \
--profile controller --region us-east-1
Drop --projection-expression for the whole record. To list every record for
an account, root records and sub-records alike, query the pk with
begins_with(sk, "INDEX#").
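When all you have is the composite index name from the jobs table, the settings-table key can be derived from it. This sketch uses a hypothetical helper, `settings_key_from_shop_id`, and assumes the system_account_id never contains a hyphen, so the first hyphen separates it from the index name. Verify that against the repository code before relying on it.

```python
def settings_key_from_shop_id(shop_id: str) -> dict:
    """Split {system_account_id}-{index_name} into a settings-table key."""
    account, sep, index_name = shop_id.partition("-")
    if not (sep and account and index_name):
        raise ValueError(f"not a composite index name: {shop_id!r}")
    return {"pk": {"S": account}, "sk": {"S": f"INDEX#{index_name}"}}


print(settings_key_from_shop_id("kl7a9h55-shopify-e513a8-4"))
```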
3. Shopify entities — prod-ShopifyEntitiesTable
Single-table store for per-shop OAuth sessions, API keys, and settings. This is where scripts retrieve the Shopify Admin API access token.
Source of truth:
shopify_entities.py
(entity models + SK shapes, shopify_entities.py:15-50),
database.py
(key prefixes, database.py:8-42),
shopify_graphql.py (token retrieval,
get_shopify_access_token, shopify_graphql.py:57-113), and
shopify_admin_stack.py
(setup_shopify_entities_table, shopify_admin_stack.py:828-876).
Key schema
| Key | Format | entity_type |
|---|---|---|
| pk (S) | SHOP#{shop_domain} | — (e.g. SHOP#cool-store.myshopify.com) |
| sk (S) | USER#{user_id} | SESSION (OAuth session, has access_token) |
| sk (S) | API_KEY | API_KEY |
| sk (S) | SETTINGS | SETTINGS (UI components, active_index, system_account_id) |
Here the pk is the shop domain (contrast with the jobs table, which keys on
the composite index name). Sessions support multiple users per shop, so there
can be several USER#… rows.
GSI
| GSI | pk | sk | Use |
|---|---|---|---|
| GSI_SystemAccountId | system_account_id | sk | Sparse — only SETTINGS rows that carry system_account_id. List all shops for a Marqo account. |
How to look up X
- The Shopify access token for a shop: query the pk with begins_with(sk, "USER#") and entity_type = SESSION, then pick the session with the latest last_updated and read access_token. (This is exactly get_shopify_access_token, shopify_graphql.py:57-113; prefer running that script over hand-rolling, since it handles AWS SSO and session selection.)
- A shop's settings / active index: get-item with sk = SETTINGS.
- All shops for an account: query GSI_SystemAccountId on system_account_id.
Canonical CLI
Find sessions (and their access tokens) for a shop:
aws dynamodb query \
--table-name prod-ShopifyEntitiesTable \
--key-condition-expression "pk = :pk AND begins_with(sk, :skp)" \
--filter-expression "entity_type = :et" \
--expression-attribute-values '{":pk": {"S": "SHOP#cool-store.myshopify.com"}, ":skp": {"S": "USER#"}, ":et": {"S": "SESSION"}}' \
--profile controller --region us-east-1
Pick the row with the latest last_updated and read access_token. For the
full flow (SSO + selection) prefer:
python scripts/ecom/shopify_graphql.py \
--shop cool-store.myshopify.com --env prod --profile controller \
--query '{ shop { name } }'
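If you do hand-roll it, the "pick the row with the latest last_updated" step can be sketched as below. `latest_session` is a hypothetical helper, not the selection logic in get_shopify_access_token. It assumes the items were already deserialized (for example with und()) and that all last_updated values share one format that sorts chronologically, such as uniform ISO-8601 strings.

```python
def latest_session(items):
    """Return the SESSION row with the greatest last_updated, or None."""
    sessions = [
        it for it in items
        if it.get("entity_type") == "SESSION" and "last_updated" in it
    ]
    if not sessions:
        return None
    return max(sessions, key=lambda it: it["last_updated"])


# Illustrative rows only; user ids, timestamps, and tokens are made up.
rows = [
    {"sk": "USER#1", "entity_type": "SESSION",
     "last_updated": "2024-04-30T10:00:00+00:00", "access_token": "old"},
    {"sk": "USER#2", "entity_type": "SESSION",
     "last_updated": "2024-05-02T08:00:00+00:00", "access_token": "new"},
    {"sk": "SETTINGS", "entity_type": "SETTINGS"},
]
best = latest_session(rows)
print(best["sk"], best["last_updated"])  # avoid printing the token itself
```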
Get a shop's settings record:
aws dynamodb get-item \
--table-name prod-ShopifyEntitiesTable \
--key '{"pk": {"S": "SHOP#cool-store.myshopify.com"}, "sk": {"S": "SETTINGS"}}' \
--profile controller --region us-east-1