Deployment

Local Development

# Start ClickHouse
polo up
curl http://localhost:8123/ping # Should return "Ok."

# Apply schema
polo migrate

# Seed test data
polo seed

# Dev UI (from the repo root, in its own terminal)
cd components/ui && npm run dev

# Dev Worker (from the repo root, in a second terminal)
cd components/api && npx wrangler dev
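
When scripting the local setup, it can help to wait for ClickHouse to answer the ping before running migrations. The following is a minimal sketch, not part of the polo CLI: the function names, retry count, and delay are assumptions, and only the URL and the "Ok." response come from the steps above.

```python
import time
import urllib.request
from typing import Callable


def fetch_ping(url: str) -> str:
    """Return the body of a GET request to `url` (2 second timeout)."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return resp.read().decode()


def wait_for_clickhouse(
    url: str = "http://localhost:8123/ping",
    attempts: int = 30,          # hypothetical default
    delay_s: float = 1.0,        # hypothetical default
    fetch: Callable[[str], str] = fetch_ping,
) -> bool:
    """Poll the ping endpoint until it answers "Ok." or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url).strip() == "Ok.":
                return True
        except OSError:
            # Covers connection refused, timeouts, and urllib's URLError
            # while the container is still starting.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


if __name__ == "__main__":
    raise SystemExit(0 if wait_for_clickhouse() else 1)
```

Run as a script, it exits non-zero if ClickHouse never becomes ready, so it can gate `polo migrate` in a shell pipeline.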

ClickHouse

Options

| Option | Ops burden | Cost | Latency | Recommendation |
| --- | --- | --- | --- | --- |
| Self-hosted on EC2 | Medium | Covered by savings plan | ~50ms | Default choice |
| ClickHouse Cloud (serverless) | Zero | ~$50-150/mo | ~100ms | Not recommended (see below) |
| ClickHouse Cloud (dedicated) | Low | ~$200/mo | ~20ms | Not recommended (see below) |

Why self-hosted: EC2 usage is covered by our existing savings plan, so the instance is effectively free. More importantly, Polo processes potentially sensitive infrastructure metadata, including customer resource details. Keeping ClickHouse inside our VPC avoids shipping that data to a third party, which lowers both security and commercial risk. The collectors connect over the ClickHouse HTTP(S) interface and don't depend on the deployment model, so migrating later is straightforward if needed.

CDK Stack

The ClickHouseStack (in infra/polo/stacks/clickhouse_stack.py) provisions:

  • VPC with public + private subnets (1 NAT gateway)
  • EC2 instance (m6i.large, Amazon Linux 2023, 100 GB gp3 EBS)
  • Security group — ingress rules added by dependent stacks (Lambda SGs, Cloudflare Tunnel)
  • IAM role with SSM managed instance access (no SSH)
  • Secrets Manager: polo_api and polo_ingest passwords (auto-generated)
  • User data installs ClickHouse 24.3, configures TLS (self-signed), creates database users

Security (production)

  • Private subnet with no public IP
  • Self-signed TLS (CA + server cert generated at launch, SAN includes instance private IP)
  • HTTPS on port 8443 (plain HTTP on 8123 stays available for access from inside the VPC)
  • Dedicated polo_api user with read-only access to the polo database (for the Worker)
  • Dedicated polo_ingest user with insert access (for collector Lambdas)
  • Credentials stored in Secrets Manager
  • Accept connections only from Worker's IP range or via Cloudflare Tunnel

Post-Deploy Steps

After deploying the ClickHouse stack for the first time:

  1. Run schema migrations via SSM port forwarding:

    # Retrieves ingest credentials from Secrets Manager automatically
    ENV=staging AWS_PROFILE=polo polo migrate-remote <instance-id>
  2. Set Worker secrets (CLICKHOUSE_HOST and CLICKHOUSE_PASSWORD):

    wrangler secret put CLICKHOUSE_HOST --env staging # https://<private-ip>:8443
    wrangler secret put CLICKHOUSE_PASSWORD --env staging
  3. Configure collector Lambdas with ClickHouse env vars:

    • CLICKHOUSE_HOST: EC2 private IP
    • CLICKHOUSE_PORT: 8123 (HTTP within VPC)
    • CLICKHOUSE_USER: polo_ingest
    • CLICKHOUSE_PASSWORD: from Secrets Manager
    • CLICKHOUSE_SECURE: false (HTTP within VPC; TLS is for external access)
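
To illustrate how a collector might consume these variables, here is a minimal sketch. The variable names come from the list above; the `ClickHouseConfig` class, the `load_config` helper, and the fallback defaults are hypothetical and are not taken from the collector code.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class ClickHouseConfig:
    """Connection settings read from the Lambda environment (hypothetical)."""

    host: str
    port: int
    user: str
    password: str = field(repr=False)  # keep the secret out of logs
    secure: bool = False

    @property
    def base_url(self) -> str:
        scheme = "https" if self.secure else "http"
        return f"{scheme}://{self.host}:{self.port}"


def load_config(env: Mapping[str, str] = os.environ) -> ClickHouseConfig:
    """Build the config; host and password are required, the rest default."""
    secure_raw = env.get("CLICKHOUSE_SECURE", "false").strip().lower()
    return ClickHouseConfig(
        host=env["CLICKHOUSE_HOST"],
        port=int(env.get("CLICKHOUSE_PORT", "8123")),
        user=env.get("CLICKHOUSE_USER", "polo_ingest"),
        password=env["CLICKHOUSE_PASSWORD"],
        secure=secure_raw in ("1", "true", "yes"),
    )
```

With the values listed above, `load_config().base_url` resolves to `http://<private-ip>:8123`. A missing host or password raises `KeyError` at cold start rather than failing later on the first insert.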

Schema Migrations

# Local (Docker)
polo migrate

# Remote (via SSM port forwarding — fetches credentials from Secrets Manager)
ENV=staging AWS_PROFILE=polo polo migrate-remote <instance-id>
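
For debugging, the SSM tunnel can also be opened by hand. The sketch below only builds the AWS CLI command line; it does not describe the internals of `polo migrate-remote`. The helper name and the choice of port 8123 on both ends are assumptions. `AWS-StartPortForwardingSession` is the standard SSM document for port forwarding, and the AWS CLI needs the Session Manager plugin installed to run the resulting command.

```python
import json
import shlex
from typing import List, Optional


def ssm_port_forward_argv(
    instance_id: str,
    remote_port: int = 8123,   # assumption: ClickHouse HTTP port on the instance
    local_port: int = 8123,    # assumption: same port on the workstation
    profile: Optional[str] = None,
) -> List[str]:
    """Return the argv for an `aws ssm start-session` port-forwarding call."""
    params = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    argv = [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(params),
    ]
    if profile:
        argv += ["--profile", profile]
    return argv


if __name__ == "__main__":
    # Placeholder instance ID; substitute the real one before running.
    print(shlex.join(ssm_port_forward_argv("i-0123456789abcdef0", profile="polo")))
```

Once the session is open, `curl http://localhost:8123/ping` from the workstation should reach the remote instance.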

Cloudflare Worker

The Worker serves both the API and the SPA static assets.

# wrangler.toml
name = "polo-api"
main = "src/index.ts"
compatibility_date = "2024-12-01"
assets = { directory = "../ui/dist" }

[vars]
CLICKHOUSE_HOST = ""
CLICKHOUSE_USER = "default"
CLICKHOUSE_DATABASE = "polo"
CF_ACCESS_TEAM_DOMAIN = "marqodev"
CF_ACCESS_AUD = ""

[env.staging]
name = "staging-polo-api"
workers_dev = false
routes = [{ pattern = "polo.dev-marqo.org", custom_domain = true }]

Secrets (set via wrangler secret put):

  • CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD

Staging Deployment

# 1. Build the SPA
polo build ui

# 2. Set secrets (one-time, values from Secrets Manager polo/staging/*)
cd components/api
npx wrangler secret put CLICKHOUSE_HOST --env staging
npx wrangler secret put CLICKHOUSE_PASSWORD --env staging

# 3. Deploy
polo deploy worker --env staging

# 4. Verify
curl https://polo.dev-marqo.org/api/v1/health

The CLICKHOUSE_HOST secret must be the ClickHouse HTTPS endpoint (e.g. https://<private-ip>:8443). CLICKHOUSE_USER defaults to polo_api via wrangler.toml vars.

Routing

  • /api/v1/* -> ClickHouse query proxy
  • /* -> SPA static assets (index.html for client-side routing)
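
The two rules above amount to a small decision function. The Worker itself is TypeScript (src/index.ts); the sketch below restates the logic in Python purely as an illustration, with a hypothetical `route` helper and a hypothetical set of built asset paths.

```python
from typing import AbstractSet, Tuple


def route(path: str, assets: AbstractSet[str]) -> Tuple[str, str]:
    """Return (handler, target) for a request path.

    handler is "api" for the ClickHouse query proxy or "asset" for a
    static file. Unknown non-API paths fall back to index.html so the
    SPA's client-side router can handle them.
    """
    if path == "/api/v1" or path.startswith("/api/v1/"):
        return ("api", path)
    if path in assets:
        return ("asset", path)
    return ("asset", "/index.html")


# Hypothetical contents of the built UI directory.
ASSETS = frozenset({"/index.html", "/assets/app.js"})

print(route("/api/v1/health", ASSETS))   # ('api', '/api/v1/health')
print(route("/assets/app.js", ASSETS))   # ('asset', '/assets/app.js')
print(route("/dashboards/42", ASSETS))   # ('asset', '/index.html')
```

Checking the API prefix first keeps a stray static file from ever shadowing an API route.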

Cloudflare Access

Set up a Cloudflare Access policy for the Worker's hostname and restrict it to Marqo team emails. The Worker validates the Access JWT on every request (src/auth.ts).

Collector Lambda Functions

Collectors are packaged by Pants and deployed via CDK (infra/polo/stacks/collectors_stack.py).

Build and Deploy

# Build all collector Lambda zips
pants package components/collectors/...:lambda

# Deploy ClickHouse + collectors to staging
ENV=staging AWS_ACCOUNT_ID=992382409372 LAMBDA_CODE_DIR=dist \
cdk deploy --all --app "python infra/polo/app.py"

# Deploy only ClickHouse
ENV=staging AWS_ACCOUNT_ID=992382409372 \
cdk deploy staging-ClickHouse --app "python infra/polo/app.py"

Cross-Account IAM

Deploy {env}-PoloReadRole to each target account:

ENV=staging AWS_ACCOUNT_ID=468036072962 \
cdk deploy --app "python infra/polo/destination.py"

Collector Lambdas assume this role for read-only access to EC2, EBS, Cost Explorer, CloudWatch, and Tags APIs.
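
A minimal sketch of the role assumption, under stated assumptions: the role lives in the standard `aws` partition with no IAM path, and the helper names and session name are hypothetical. `assume_role` is the standard STS call, and boto3 is imported lazily so that the ARN helper works without it.

```python
from typing import Any, Dict, Optional


def polo_read_role_arn(env: str, account_id: str) -> str:
    """ARN of {env}-PoloReadRole in a target account (no IAM path assumed)."""
    return f"arn:aws:iam::{account_id}:role/{env}-PoloReadRole"


def assume_polo_read_role(
    env: str,
    account_id: str,
    sts_client: Optional[Any] = None,
) -> Dict[str, str]:
    """Assume the read role and return kwargs usable by boto3.client(...)."""
    if sts_client is None:
        import boto3  # lazy import; available in the Lambda Python runtime

        sts_client = boto3.client("sts")
    resp = sts_client.assume_role(
        RoleArn=polo_read_role_arn(env, account_id),
        RoleSessionName=f"polo-collector-{env}",  # hypothetical session name
    )
    creds = resp["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dictionary can be splatted into a client for the target account, for example `boto3.client("ec2", **assume_polo_read_role("staging", "468036072962"))`.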

Scheduling

EventBridge rules trigger each Lambda on its schedule. See collection.md for the schedule table.

Infrastructure as Code

Infrastructure is managed with AWS CDK (Python):

infra/polo/ # Polo v2 infrastructure
├── app.py # CDK app: ClickHouse + collectors
├── destination.py # CDK app: PoloReadRole in target accounts
└── stacks/
    ├── clickhouse_stack.py # VPC + EC2 ClickHouse + Secrets Manager
    ├── collectors_stack.py # Collector Lambdas + EventBridge schedules
    └── destination_stack.py # Cross-account PoloReadRole IAM

infra/legacy/ # Polo v1 (DynamoDB, Cognito, legacy sync)
├── app.py # CDK app entry point
├── config/ # Environment-aware configuration
├── lambda/ # Auth Lambdas (OAuth, Cognito hooks)
├── scripts/ # Deployment scripts (prices, backfill)
└── stacks/
    ├── polo_stack.py # Main orchestrator
    ├── api_stack.py # Cloudflare Worker + CloudFront
    ├── service_stack.py # Lambda + EventBridge orchestration
    ├── database_stack.py # DynamoDB tables
    ├── cognito_stack.py # Cognito user pools
    ├── eventbridge_stack.py # Event scheduling
    ├── sync_stack.py # Data sync
    ├── ci_stack.py # CI/CD pipeline
    ├── route53_stack.py # DNS records
    └── destination_stack.py # Event destinations

Legacy CDK Deployment Compatibility

The monorepo restructure moved existing code from root to components/polo-legacy/. To keep the existing CDK deployment CI working:

  • Root tasks.py — shim that adds components/polo-legacy/src to sys.path and delegates invoke deploy/destroy/synth to CDK with updated paths (infra/legacy/app.py)
  • Root requirements.dev.txt — shim that delegates to components/polo-legacy/requirements.dev.txt
  • CDK entry points (infra/legacy/app.py, ci.py, destination.py) — sys.path updated from ./infra to ./infra/legacy
  • CDK stacks: STATIC_DIR updated from src/static to components/polo-legacy/src/static
  • Deploy/synth/destroy tasks use tailwind + zip directly (SAM build skipped since CDK is the deployer)

These shims should be removed once the CDK stacks are fully migrated or the legacy Lambda deployment is replaced.

CI/CD

GitHub Actions Workflows

deploy.yaml — CDK deployment triggered on PR/push to main:

  1. Install CDK + dependencies (pip install -r requirements.dev.txt)
  2. invoke deploy --env=<branch> (creates branch environment)
  3. invoke destroy --env=<branch> (tears down on PR close)

deploy-destination.yaml — Destination stack for cross-account sync roles

ci.yml — Polo v2 validation (3 parallel jobs):

  1. Pants lint + test — ruff, tailor, BUILD files, Python unit tests (excludes schema)
  2. TypeScript tests — API vitest, UI vitest, UI build
  3. Schema tests — ClickHouse service container, migrations, schema verification