Hippodrome

Hippodrome is the development environment orchestrator for running the entire Cloud Control Plane stack locally. The name comes from the ancient Greek horse-racing stadium, where all the racing action happened in one unified location.

Quick Start

# Start core services
pants hd up

# Start with e-commerce services (admin_server, search_proxy)
pants hd up --profile ecom

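This starts the core services (fake_cell, controller, console) along with the dashboard on port 9000.

With --profile ecom, the e-commerce services are started as well: admin_server, search_proxy, ecom_settings_exporter, merchandising_exporter, ecom_ingest, and ecom_indexer_service.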
The dashboard provides real-time status updates and aggregated logs from all services.

Service Ports

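| Service | Port | Profile |
| --- | --- | --- |
| Dashboard | 9000 | All |
| fake_cell | 9001 | core (when cell=local) |
| controller | 9002 | core |
| console | 9008 | core |
| admin_server | 9004 | ecom |
| search_proxy | 9005 | ecom |
| ecom_settings_exporter | 9010 | ecom |
| merchandising_exporter | 9011 | ecom |
| ecom_ingest | 9019 | ecom |
| ecom_indexer_service | 9018 | ecom |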
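
Before starting the stack it can help to see which of these ports are already bound. The following is a minimal sketch, not part of Hippodrome itself: the port numbers come from the table above, while the helper names `port_in_use` and `busy_ports` are made up for illustration.

```python
import socket

# Ports copied from the Service Ports table; adjust if your setup differs.
HIPPODROME_PORTS = {
    "dashboard": 9000,
    "fake_cell": 9001,
    "controller": 9002,
    "admin_server": 9004,
    "search_proxy": 9005,
    "console": 9008,
    "ecom_settings_exporter": 9010,
    "merchandising_exporter": 9011,
    "ecom_indexer_service": 9018,
    "ecom_ingest": 9019,
}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.2) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: dict[str, int]) -> dict[str, int]:
    """Return the subset of ports that already have a listener."""
    return {name: port for name, port in ports.items() if port_in_use(port)}
```

Calling `busy_ports(HIPPODROME_PORTS)` before `pants hd up` lists any services whose port is already taken.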

Profiles

Use the --profile flag to select which services to start:

| Profile | Services | Use Case |
| --- | --- | --- |
| core (default) | fake_cell, controller, console | Basic control plane development |
| ecom | core + admin_server, search_proxy | E-commerce development |
| full | All services | Full stack testing |

Ecom Ingest Store

ecom_ingest defaults to an in-memory store for local Hippodrome runs. To run it against a local PostgreSQL/RDS-compatible database:

HIPPODROME_ECOM_INGEST_STORE=postgres \
HIPPODROME_ECOM_INGEST_DATABASE_URL='postgres://postgres:postgres@127.0.0.1:5432/ecom_ingest?sslmode=disable' \
pants hd up --profile ecom
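
A small pre-flight check can catch a malformed database URL before the stack starts. This is only a sketch: the two environment variable names come from the command above, but the function `ingest_store_config` and the `"memory"` label for the default store are assumptions, and the code does not claim to mirror how ecom_ingest reads its configuration.

```python
import os
from collections.abc import Mapping
from urllib.parse import urlsplit


def ingest_store_config(env: Mapping[str, str] = os.environ):
    """Return (store, db_info) from the HIPPODROME_ECOM_INGEST_* variables.

    "memory" is a hypothetical label for the in-memory default.
    """
    store = env.get("HIPPODROME_ECOM_INGEST_STORE", "memory")
    if store != "postgres":
        return store, None

    url = env.get("HIPPODROME_ECOM_INGEST_DATABASE_URL")
    if not url:
        raise ValueError("HIPPODROME_ECOM_INGEST_DATABASE_URL must be set for postgres")

    parts = urlsplit(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected URL scheme: {parts.scheme!r}")

    return store, {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
```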

Cell Connection

Use --cell to connect to deployed cells instead of fake_cell:

# Connect to staging cell (skips fake_cell)
pants hd up --profile ecom --cell staging

# Connect to production cell (WARNING: real data!)
pants hd up --profile ecom --cell prod

Table Prefix

Use --table-prefix to customize DynamoDB table names. Default: dev-{git-branch}-

# Use custom table prefix
pants hd up --profile ecom --table-prefix my-feature-
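
To make the default concrete, the sketch below shows how a prefix of the form dev-{git-branch}- could be derived and combined with a table name. The helper names are hypothetical, and whether the orchestrator sanitizes branch names (for example slashes) is not covered here.

```python
import subprocess


def current_branch() -> str:
    """Return the checked-out git branch name (requires git on PATH)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def default_table_prefix(branch: str) -> str:
    """Build the documented default prefix: dev-{git-branch}-."""
    return f"dev-{branch}-"


def table_name(prefix: str, base_name: str) -> str:
    """Prepend the prefix to a base table name."""
    return f"{prefix}{base_name}"
```

On a branch named main this yields the prefix dev-main-, which matches table names such as dev-main-EcomIndexSettingsTable used elsewhere in this document.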

Service Communication

Core Profile:

┌─────────────┐     ┌─────────────┐
│ controller  │────▶│  fake_cell  │
│    :9002    │     │    :9001    │
└─────────────┘     └─────────────┘
       │
┌──────┴──────┐
│   console   │
│    :9008    │
└─────────────┘

E-commerce Profile (--profile ecom):

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│search_proxy │────▶│admin_server │────▶│  fake_cell  │
│    :9005    │     │    :9004    │     │    :9001    │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                        ┌──────┴──────┐
                                        │ controller  │
                                        │    :9002    │
                                        └─────────────┘

The controller communicates with fake_cell via the CONTROL_PLANE_URL_OVERRIDE environment variable, which is set automatically by the orchestrator based on the --cell flag.

For the local-only interfaces that replace Cloudflare service bindings and AWS infrastructure, see docs/components/hippodrome/local-stack-contract.md.

First-Time Setup

The orchestrator handles most setup automatically:

  • Creates controller's Python venv if missing
  • Installs controller requirements if needed
  • Runs npm ci for console if node_modules/ is missing

No manual setup is required beyond having Python 3.11+, Node.js, and npm installed.
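
The "only if missing" checks can be pictured roughly as follows. This is an illustrative sketch rather than the orchestrator's code: `pending_setup_steps` is a hypothetical helper, the `.venv` and `node_modules` directory names are taken from this document, and the requirements-install step is left out because the text does not say how "if needed" is decided.

```python
from pathlib import Path


def pending_setup_steps(controller_dir: Path, console_dir: Path) -> list[list[str]]:
    """Return the setup commands that still need to run, in order."""
    steps: list[list[str]] = []

    venv_dir = controller_dir / ".venv"
    if not venv_dir.is_dir():
        # Create the controller's virtual environment.
        steps.append(["python3", "-m", "venv", str(venv_dir)])

    if not (console_dir / "node_modules").is_dir():
        # Install console dependencies from the lockfile.
        steps.append(["npm", "ci"])

    return steps
```

A caller would run each returned command with the matching working directory, for example `npm ci` inside the console directory.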

Hot Reload

All services run with hot reload enabled:

  • Python services: Use uvicorn's --reload or Django's auto-reloader
  • Console: Uses React's Hot Module Replacement

Changes to source files are detected and services restart automatically.

Graceful Shutdown

Press Ctrl+C to stop all services. The orchestrator sends SIGTERM to each service, waits up to 5 seconds, then sends SIGKILL to any that are still running.
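
The terminate, wait, then kill sequence is a common pattern with Python's subprocess module. The sketch below illustrates it under the assumption that each service is a `subprocess.Popen` child; `stop_process` and `spawn_sleeper` are hypothetical names, not Hippodrome functions.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a child process to exit, escalating to a hard kill after a grace period."""
    if proc.poll() is None:
        proc.terminate()  # SIGTERM on POSIX
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX
            proc.wait()
    return proc.returncode


def spawn_sleeper() -> subprocess.Popen:
    """Start a throwaway child that sleeps, useful for trying stop_process."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
```

For several services, the same function can be applied to each child in turn, or SIGTERM can be sent to all of them first so they share one grace period.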

Running E2E Tests

Hippodrome runs everything locally using moto (AWS emulation), fake_cell, and fake_cognito. No remote credentials are required.

Quick start

# Terminal 1: Start Hippodrome
source components/hippodrome/.env.e2e
pants hd up --profile ecom

# Terminal 2: Run E2E tests
source components/hippodrome/.env.e2e
pants test //components/shopify/e2e_tests/e2e_tests/tests/ecom_onboarding_test.py -- -v

The .env.e2e file sets three env vars:

  • ENVIRONMENT=hippodrome — tells the test config loader to use local service URLs and dummy AWS creds
  • HIPPODROME_RANDOM_SEED=42 — deterministic seed data
  • PANTS_CONCURRENT=True — allows concurrent pants processes (also handled by pants hd)

What can run locally

| Test | Runs locally? | Notes |
| --- | --- | --- |
| ecom_onboarding_test.py | Yes | Core onboarding flow |
| ecom_onboarding_text_test.py | Yes | Text-based variant |
| ecom_import_export_test.py | Yes | Import/export workflows |
| ecom_aliasing_test.py | Yes | Index aliasing |
| ecom_analytics_test.py | Yes | Analytics tracking |
| index_forking_test.py | Yes | Index fork workflow |
| shopify_onboarding_test.py | Skips | Needs real Shopify credentials |

Hippothesis smoke tests

Hippothesis is a lightweight probe suite. No env file is needed; it defaults to the local environment.

# Service health probes
pants test //components/hippothesis/hippothesis:tests

# Full suite (includes property-based search tests)
pants test //components/hippothesis::

How it works

In hippodrome mode, all external dependencies are emulated locally:

| Service | Port | Emulates |
| --- | --- | --- |
| moto | 9003 | DynamoDB, S3, SQS, SNS, Secrets Manager, Step Functions |
| fake_cognito | 9012 | Cognito token validation |
| fake_cell | 9001 | Marqo data plane |

The E2E test config loader (components/shopify/e2e_tests/e2e_tests/config/config.py) reads components/shopify/e2e_tests/.env.hippodrome which sets:

  • Service URLs to localhost ports
  • AWS credentials to testing/testing (for moto)
  • AWS_ENDPOINT_URL to the local moto server

Shopify and Cloudflare credentials are optional in hippodrome mode — tests that need them skip at the fixture level rather than failing at config load.
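
As a rough picture of what hippodrome mode implies for a test process, the sketch below builds the AWS-related environment described above. It is an assumption-laden illustration: `hippodrome_test_env` and `LOCAL_SERVICE_URLS` are hypothetical names, the `http://localhost` scheme is assumed, and the real values live in components/shopify/e2e_tests/.env.hippodrome.

```python
# Local ports as listed in this document; the URL scheme is an assumption.
LOCAL_SERVICE_URLS = {
    "controller": "http://localhost:9002",
    "admin_server": "http://localhost:9004",
    "search_proxy": "http://localhost:9005",
}


def hippodrome_test_env(moto_port: int = 9003) -> dict[str, str]:
    """Return environment variables pointing AWS clients at the local moto server."""
    return {
        "ENVIRONMENT": "hippodrome",
        # moto accepts any credentials; the document uses testing/testing.
        "AWS_ACCESS_KEY_ID": "testing",
        "AWS_SECRET_ACCESS_KEY": "testing",
        "AWS_ENDPOINT_URL": f"http://localhost:{moto_port}",
    }
```

Merging this dictionary into `os.environ` before creating AWS clients keeps requests from reaching real AWS endpoints, provided the SDK in use honours `AWS_ENDPOINT_URL`.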

global_worker Setup (External Repository)

The global_worker is a Cloudflare Worker that handles search query routing, merchandising rules, and caching. It lives in a separate repository and must be set up manually for full e-commerce search functionality.

Why global_worker is Needed

When running with --profile ecom, the search flow is:

search_proxy (:9005) → global_worker (:9012) → fake_cell (:9001)
  • search_proxy receives incoming search API requests
  • global_worker applies merchandising rules and caching
  • global_worker proxies the final request to the cell

Without global_worker, search requests will fail at the search_proxy level.

Quick Setup

  1. Clone the repository (outside this repo):

    cd ~/dev
    git clone git@github.com:marqo-ai/global-worker.git
    cd global-worker
    npm install
  2. Create local configuration (wrangler.local.toml):

    name = "local-global-worker"
    main = "src/index.ts"
    compatibility_date = "2024-09-23"
    compatibility_flags = ["nodejs_compat"]

    [vars]
    ENV = "dev"
    FULL_ENV = "dev-local"
    CELL_URL = "http://localhost:9001"

    [dev]
    port = 9012
    local_protocol = "http"
  3. Start the worker:

    npx wrangler dev --config wrangler.local.toml --port 9012

Running Without global_worker

If you only need admin operations (not search), you can skip global_worker:

  • admin_server API calls will work
  • Controller operations will work
  • Only search requests through search_proxy will fail

EventBridge Integration

The hippodrome orchestrator includes a webhook endpoint for receiving EventBridge events and routing them to local services. This enables local testing of event-driven services like ecom_settings_exporter and merchandising_exporter.

Architecture

In production, DynamoDB Streams trigger EventBridge Pipes which publish events to EventBridge. Local services subscribe to EventBridge rules. For local development, the orchestrator provides a webhook that simulates this flow:

Production:
DynamoDB Table → DynamoDB Stream → EventBridge Pipe → EventBridge Bus → Service

Local Development:
AWS EventBridge → Webhook → Orchestrator (:9000) → Local Service (:9010/:9011)
                  │
                  └── POST /webhook/eventbridge

Webhook Endpoint

The orchestrator exposes a webhook at POST http://localhost:9000/webhook/eventbridge that accepts EventBridge events and routes them to local services.

Event Format:

{
  "source": "marqo.dynamodb",
  "detail-type": "EcomIndexSettings.MODIFY",
  "detail": {
    "eventName": "MODIFY",
    "tableName": "dev-main-EcomIndexSettingsTable",
    "keys": {
      "pk": {"S": "INDEX#abc123"},
      "sk": {"S": "SETTINGS"}
    },
    "newImage": { ... },
    "oldImage": { ... }
  }
}

Event Routing:

| Detail-Type Prefix | Target Service | Port | Endpoint |
| --- | --- | --- | --- |
| EcomIndexSettings.* | ecom_settings_exporter | 9010 | /events |
| Merchandising.* | merchandising_exporter | 9011 | /events |
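
The routing table amounts to a prefix match on the event's detail-type field. The following sketch shows one way to express that; `Route`, `ROUTES`, `route_event`, and `target_url` are hypothetical names, and the orchestrator's actual implementation may differ.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    service: str
    port: int
    endpoint: str


# Prefixes and targets copied from the Event Routing table.
ROUTES = {
    "EcomIndexSettings.": Route("ecom_settings_exporter", 9010, "/events"),
    "Merchandising.": Route("merchandising_exporter", 9011, "/events"),
}


def route_event(event: dict) -> Optional[Route]:
    """Return the route whose prefix matches the event's detail-type, if any."""
    detail_type = event.get("detail-type", "")
    for prefix, route in ROUTES.items():
        if detail_type.startswith(prefix):
            return route
    return None


def target_url(route: Route) -> str:
    """Build the local URL an event would be forwarded to."""
    return f"http://localhost:{route.port}{route.endpoint}"
```

An event with detail-type `Unknown.Event` matches no prefix, so `route_event` returns `None`, which corresponds to the "ignored" case.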

Manual Testing

Test the webhook endpoint directly with curl:

# Test routing to ecom_settings_exporter (port 9010)
curl -X POST http://localhost:9000/webhook/eventbridge \
  -H "Content-Type: application/json" \
  -d '{
    "source": "marqo.dynamodb",
    "detail-type": "EcomIndexSettings.MODIFY",
    "detail": {
      "eventName": "MODIFY",
      "tableName": "dev-main-EcomIndexSettingsTable",
      "keys": {"pk": {"S": "INDEX#test"}, "sk": {"S": "SETTINGS"}},
      "newImage": {"pk": {"S": "INDEX#test"}}
    }
  }'

# Test routing to merchandising_exporter (port 9011)
curl -X POST http://localhost:9000/webhook/eventbridge \
  -H "Content-Type: application/json" \
  -d '{
    "source": "marqo.dynamodb",
    "detail-type": "Merchandising.INSERT",
    "detail": {
      "eventName": "INSERT",
      "tableName": "dev-main-MerchandisingTable",
      "keys": {"pk": {"S": "RULE#123"}},
      "newImage": {"pk": {"S": "RULE#123"}}
    }
  }'
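
The same requests can be scripted from Python using only the standard library, which is convenient inside test helpers. This is a sketch: `make_event`, `build_request`, and `send_event` are hypothetical helpers, and the event fields simply mirror the curl payloads above.

```python
import json
import urllib.request

WEBHOOK_URL = "http://localhost:9000/webhook/eventbridge"


def make_event(entity: str, event_name: str, table_name: str, keys: dict, new_image: dict) -> dict:
    """Build an event shaped like the curl payloads, e.g. entity="Merchandising"."""
    return {
        "source": "marqo.dynamodb",
        "detail-type": f"{entity}.{event_name}",
        "detail": {
            "eventName": event_name,
            "tableName": table_name,
            "keys": keys,
            "newImage": new_image,
        },
    }


def build_request(event: dict, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Wrap the event in a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event: dict, url: str = WEBHOOK_URL, timeout: float = 5.0) -> dict:
    """POST the event and return the decoded JSON response (needs a running orchestrator)."""
    with urllib.request.urlopen(build_request(event, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the orchestrator running, `send_event(make_event("Merchandising", "INSERT", "dev-main-MerchandisingTable", {"pk": {"S": "RULE#123"}}, {"pk": {"S": "RULE#123"}}))` sends the second curl payload.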

Response format (success):

{
  "webhook_status": "forwarded",
  "target": "localhost:9010",
  "service_response": {"status": "success", "result": ...}
}

Response format (unrouted):

{
  "status": "ignored",
  "reason": "No route configured for detail-type: Unknown.Event"
}

Automatic Tunnel Setup

The --enable-events flag starts a tunnel to expose the local webhook for EventBridge events:

pants hd up --profile ecom --enable-events

This flag will:

  1. Start a tunnel (prefers cloudflared, falls back to ngrok) to expose the local webhook
  2. Create a temporary EventBridge rule that forwards events for your table prefix to the tunnel
  3. Print the public webhook URL and rule name at startup
  4. Clean up the rule and tunnel when the orchestrator stops

Requirements:

  • Install cloudflared (recommended, free, no account required)
  • Or install ngrok (requires free account)
  • AWS credentials configured with EventBridge and IAM permissions

What gets created:

  • An EventBridge rule named local-stack-webhook-{id} that filters events for your --table-prefix
  • An API destination and connection to forward events to your tunnel
  • An IAM role local-stack-eventbridge-api-dest-role (created once, reused)

All resources except the IAM role are automatically cleaned up on shutdown. The rule description indicates it's safe to delete manually if needed.
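
A rule that filters by table prefix could use EventBridge's prefix matching on the event detail. The sketch below is an assumption about what such a pattern might look like, not the pattern the orchestrator actually creates; `table_prefix_event_pattern` and the simplified local matcher `pattern_matches` are hypothetical.

```python
import json


def table_prefix_event_pattern(table_prefix: str) -> dict:
    """Build an event pattern matching events whose tableName starts with the prefix."""
    return {
        "source": ["marqo.dynamodb"],
        "detail": {"tableName": [{"prefix": table_prefix}]},
    }


def pattern_matches(pattern: dict, event: dict) -> bool:
    """Simplified local check covering only the two fields used above."""
    if event.get("source") not in pattern["source"]:
        return False
    table = event.get("detail", {}).get("tableName", "")
    return any(table.startswith(rule["prefix"]) for rule in pattern["detail"]["tableName"])


def pattern_json(table_prefix: str) -> str:
    """Serialize the pattern in the JSON string form that rule definitions expect."""
    return json.dumps(table_prefix_event_pattern(table_prefix))
```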

Validating EventBridge Infrastructure

After deploying the CDK infrastructure (ecom and controller stacks), validate that events are flowing correctly:

Option 1: Use the validation script

# Validate infrastructure for your table prefix
python components/hippodrome/scripts/validate_eventbridge.py --table-prefix dev-main-

# With verbose output
python components/hippodrome/scripts/validate_eventbridge.py --table-prefix dev-main- --verbose

The script checks:

  • Event buses exist (EcomEventBus, MerchandisingEventBus)
  • EventBridge Pipes exist and are RUNNING
  • DynamoDB tables have streaming enabled
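
Two of these checks can be approximated with the AWS SDK as shown below. This is a sketch, not the contents of validate_eventbridge.py: the function names are hypothetical, the resource names follow the {table-prefix} naming used in this document, and the clients are passed in so the code can be tried with stubs.

```python
PIPE_SUFFIXES = ("IndexSettingsEventPipe", "MerchandisingEventPipe")
TABLE_SUFFIXES = ("EcomIndexSettingsTable", "MerchandisingTable")


def pipes_running(pipes_client, table_prefix: str) -> dict[str, bool]:
    """Map each expected pipe name to whether its CurrentState is RUNNING."""
    status = {}
    for suffix in PIPE_SUFFIXES:
        name = f"{table_prefix}{suffix}"
        response = pipes_client.describe_pipe(Name=name)
        status[name] = response.get("CurrentState") == "RUNNING"
    return status


def streams_enabled(dynamodb_client, table_prefix: str) -> dict[str, bool]:
    """Map each expected table name to whether its DynamoDB stream is enabled."""
    status = {}
    for suffix in TABLE_SUFFIXES:
        name = f"{table_prefix}{suffix}"
        table = dynamodb_client.describe_table(TableName=name)["Table"]
        spec = table.get("StreamSpecification") or {}
        status[name] = bool(spec.get("StreamEnabled"))
    return status
```

With boto3 installed and credentials configured, the clients would be `boto3.client("pipes")` and `boto3.client("dynamodb")`.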

Option 2: Validate manually in AWS Console

  1. Check Event Buses:

    • Go to EventBridge Console
    • Navigate to "Event buses" → "Custom event buses"
    • Verify {table-prefix}EcomEventBus and {table-prefix}MerchandisingEventBus exist
  2. Check EventBridge Pipes:

    • Go to EventBridge Pipes
    • Verify {table-prefix}IndexSettingsEventPipe and {table-prefix}MerchandisingEventPipe exist
    • Confirm status is "Running"
  3. Check DynamoDB Streams:

    • Go to DynamoDB Console
    • Open {table-prefix}EcomIndexSettingsTable → "Exports and streams"
    • Confirm DynamoDB Stream is enabled (NEW_IMAGE or NEW_AND_OLD_IMAGES)
    • Repeat for {table-prefix}MerchandisingTable

Option 3: Test event flow end-to-end

  1. Start hippodrome with --enable-events
  2. Make a change to a DynamoDB table, for example with the AWS CLI:
    aws dynamodb put-item \
    --table-name dev-main-EcomIndexSettingsTable \
    --item '{"pk": {"S": "test-account"}, "sk": {"S": "INDEX#test"}}'
  3. Watch the hippodrome logs for incoming events
  4. Clean up:
    aws dynamodb delete-item \
    --table-name dev-main-EcomIndexSettingsTable \
    --key '{"pk": {"S": "test-account"}, "sk": {"S": "INDEX#test"}}'

Troubleshooting

See troubleshooting.md for common issues and solutions.

Common issues:

  • Port already in use: Kill the process using lsof -ti :PORT | xargs kill -9
  • Services stuck waiting: Use pants hd (handles --concurrent automatically)
  • Django module errors: The venv is set up automatically; check controller's .venv directory

CLI Reference

pants hd up [OPTIONS]

Options:

  • --profile [core|ecom|full] - Service profile to run (default: core)

    • core: fake_cell, controller, console
    • ecom: core + admin_server, search_proxy, e-commerce services
    • full: All available services
  • --cell [local|staging|prod] - Cell to connect to (default: local)

    • local: Uses fake_cell for local development
    • staging: Connects to deployed staging cell (skips fake_cell)
    • prod: Connects to production cell (WARNING: real data!)
  • --table-prefix PREFIX - DynamoDB table name prefix (default: dev-{git-branch}-)

  • --enable-events - Enable EventBridge webhook integration with automatic tunnel setup

  • --project-root PATH - Project root directory (auto-detected if not specified)

Requirements

  • Python 3.11+
  • Node.js and npm (for console)
  • Pants build system
  • AWS credentials (when using --cell staging or --cell prod)

Architecture

For detailed architecture, service configuration, and development guidelines, see AGENTS.md.