Deployment
Local Development
# Start ClickHouse
polo up
curl http://localhost:8123/ping # Should return "Ok."
# Apply schema
polo migrate
# Seed test data
polo seed
# Dev UI
cd components/ui && npm run dev
# Dev Worker
cd components/api && npx wrangler dev
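If you script the local setup, it can help to wait for the ping endpoint before running `polo migrate`. The following is a minimal sketch, not part of the `polo` CLI: the names `http_get` and `wait_for_clickhouse`, the retry count, and the delay are assumptions made for illustration. Only the URL and the expected "Ok." body come from the steps above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_get(url: str) -> str:
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return resp.read().decode("utf-8")


def wait_for_clickhouse(
    url: str = "http://localhost:8123/ping",
    attempts: int = 30,          # assumed retry budget, tune as needed
    delay_s: float = 1.0,        # assumed pause between attempts
    fetch: Callable[[str], str] = http_get,
) -> bool:
    """Return True once the ping endpoint answers "Ok.", False if it never does."""
    for attempt in range(attempts):
        try:
            if fetch(url).strip() == "Ok.":
                return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet; retry after the delay
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

A wrapper script could call `wait_for_clickhouse()` right after `polo up` and exit with an error if it returns `False`. The `fetch` parameter exists only so the loop can be exercised without a running server.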
ClickHouse
Options
| Option | Ops burden | Cost | Latency | Recommendation |
|---|---|---|---|---|
| Self-hosted on EC2 | Medium | Covered by savings plan | ~50ms | Default choice |
| ClickHouse Cloud (serverless) | Zero | ~$50-150/mo | ~100ms | Not recommended (see below) |
| ClickHouse Cloud (dedicated) | Low | ~$200/mo | ~20ms | Not recommended (see below) |
Why self-hosted: EC2 is covered by our existing savings plan, making it effectively free. More importantly, Polo processes potentially sensitive infrastructure metadata, including customer resource details. Keeping ClickHouse within our VPC avoids shipping that data to a third party, which is better for both security and commercial risk. The collectors talk to ClickHouse over its HTTP(S) interface and only need an endpoint and credentials, so they don't care which deployment model is used and migrating later is straightforward if needed.
CDK Stack
The ClickHouseStack (in infra/polo/stacks/clickhouse_stack.py) provisions:
- VPC with public + private subnets (1 NAT gateway)
- EC2 instance (m6i.large, Amazon Linux 2023, 100 GB gp3 EBS)
- Security group — ingress rules added by dependent stacks (Lambda SGs, Cloudflare Tunnel)
- IAM role with SSM managed instance access (no SSH)
- Secrets Manager: polo_api and polo_ingest passwords (auto-generated)
- User data installs ClickHouse 24.3, configures TLS (self-signed), creates database users
Security (production)
- Private subnet with no public IP
- Self-signed TLS (CA + server cert generated at launch, SAN includes instance private IP)
- HTTPS on port 8443 (in addition to HTTP 8123 for local access)
- Dedicated polo_api user with read-only access to the polo database (for the Worker)
- Dedicated polo_ingest user with insert access (for collector Lambdas)
- Credentials stored in Secrets Manager
- Accept connections only from the Worker's IP range or via Cloudflare Tunnel
Post-Deploy Steps
After deploying the ClickHouse stack for the first time:
- Run schema migrations via SSM port forwarding:
  # Retrieves ingest credentials from Secrets Manager automatically
  ENV=staging AWS_PROFILE=polo polo migrate-remote <instance-id>
- Set Worker secrets (CLICKHOUSE_HOST and CLICKHOUSE_PASSWORD):
  wrangler secret put CLICKHOUSE_HOST --env staging # https://<private-ip>:8443
  wrangler secret put CLICKHOUSE_PASSWORD --env staging
- Configure collector Lambdas with ClickHouse env vars:
  - CLICKHOUSE_HOST: EC2 private IP
  - CLICKHOUSE_PORT: 8123 (HTTP within VPC)
  - CLICKHOUSE_USER: polo_ingest
  - CLICKHOUSE_PASSWORD: from Secrets Manager
  - CLICKHOUSE_SECURE: false (HTTP within VPC; TLS is for external access)
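To make the collector variables concrete, here is a minimal sketch of how a collector might read them. It is an illustration, not the collectors' actual code: `ClickHouseSettings` and `load_settings` are hypothetical names, and the fallback values (8123, polo_ingest, false) simply mirror the values listed above.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class ClickHouseSettings:
    host: str
    port: int
    user: str
    password: str = field(repr=False)  # keep the password out of logs and reprs
    secure: bool = False

    @property
    def base_url(self) -> str:
        """HTTP inside the VPC, HTTPS when CLICKHOUSE_SECURE is true."""
        scheme = "https" if self.secure else "http"
        return f"{scheme}://{self.host}:{self.port}"


def load_settings(env: Mapping[str, str] = os.environ) -> ClickHouseSettings:
    """Build settings from the CLICKHOUSE_* variables; host and password are required."""
    secure_raw = env.get("CLICKHOUSE_SECURE", "false").strip().lower()
    return ClickHouseSettings(
        host=env["CLICKHOUSE_HOST"],
        port=int(env.get("CLICKHOUSE_PORT", "8123")),
        user=env.get("CLICKHOUSE_USER", "polo_ingest"),
        password=env["CLICKHOUSE_PASSWORD"],
        secure=secure_raw in {"1", "true", "yes"},
    )
```

Failing fast with a `KeyError` when CLICKHOUSE_HOST or CLICKHOUSE_PASSWORD is missing makes a misconfigured Lambda visible on its first invocation instead of at the first insert.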
Schema Migrations
# Local (Docker)
polo migrate
# Remote (via SSM port forwarding — fetches credentials from Secrets Manager)
ENV=staging AWS_PROFILE=polo polo migrate-remote <instance-id>
Cloudflare Worker
The Worker serves both the API and the SPA static assets.
# wrangler.toml
name = "polo-api"
main = "src/index.ts"
compatibility_date = "2024-12-01"
assets = { directory = "../ui/dist" }
[vars]
CLICKHOUSE_HOST = ""
CLICKHOUSE_USER = "default"
CLICKHOUSE_DATABASE = "polo"
CF_ACCESS_TEAM_DOMAIN = "marqodev"
CF_ACCESS_AUD = ""
[env.staging]
name = "staging-polo-api"
workers_dev = false
routes = [{ pattern = "polo.dev-marqo.org", custom_domain = true }]
Secrets (set via wrangler secret put):
CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD
Staging Deployment
# 1. Build the SPA
polo build ui
# 2. Set secrets (one-time, values from Secrets Manager polo/staging/*)
cd components/api
npx wrangler secret put CLICKHOUSE_HOST --env staging
npx wrangler secret put CLICKHOUSE_PASSWORD --env staging
# 3. Deploy
polo deploy worker --env staging
# 4. Verify
curl https://polo.dev-marqo.org/api/v1/health
The CLICKHOUSE_HOST secret must be the ClickHouse HTTPS endpoint
(e.g. https://<private-ip>:8443). The CLICKHOUSE_USER defaults to
polo_api via wrangler.toml vars.
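Because a wrong CLICKHOUSE_HOST value only surfaces at query time, a small pre-deploy check can catch the common mistakes (missing scheme, HTTP port instead of the HTTPS listener). This is a sketch under the assumption that the Worker always uses port 8443 as described above; `check_worker_clickhouse_host` is a hypothetical helper, not part of the `polo` CLI or wrangler.

```python
from urllib.parse import urlsplit


def check_worker_clickhouse_host(value: str) -> list[str]:
    """Return a list of problems with a CLICKHOUSE_HOST value; empty means it looks fine."""
    problems: list[str] = []
    parts = urlsplit(value.strip())
    if parts.scheme != "https":
        problems.append("scheme must be https")
    if not parts.hostname:
        problems.append("missing host")
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        problems.append("port is not a valid number")
    else:
        if port != 8443:
            problems.append("expected port 8443 (the HTTPS listener)")
    return problems
```

For example, `check_worker_clickhouse_host("http://10.0.1.5:8123")` reports both the scheme and the port, which is the mix-up most likely to happen when copying the collector settings.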
Routing
- /api/v1/* -> ClickHouse query proxy
- /* -> SPA static assets (index.html for client-side routing)
Cloudflare Access
Set up an Access policy for the Worker's hostname. Restrict to Marqo team emails. The Worker validates the Access JWT on every request (src/auth.ts).
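The claim checks involved in validating an Access token can be sketched as follows. This is not the code in src/auth.ts (which is TypeScript and not shown here), and it deliberately covers claims only: a real validator must also verify the token signature against the team's public keys, which this sketch omits. The function names are hypothetical. The issuer format `https://<team>.cloudflareaccess.com` follows Cloudflare's documented convention, with the team name and audience corresponding to CF_ACCESS_TEAM_DOMAIN and CF_ACCESS_AUD.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a compact JWT WITHOUT verifying the signature."""
    try:
        _header, payload_b64, _signature = token.split(".")
    except ValueError:
        raise ValueError("expected three dot-separated JWT segments") from None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def access_claims_ok(
    claims: dict,
    team_domain: str,
    expected_aud: str,
    now: Optional[float] = None,
) -> bool:
    """Check issuer, audience, and expiry. Signature verification is out of scope here."""
    now = time.time() if now is None else now
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    exp = claims.get("exp")
    return (
        claims.get("iss") == f"https://{team_domain}.cloudflareaccess.com"
        and expected_aud in audiences
        and isinstance(exp, (int, float))
        and exp > now
    )
```

Checking the audience matters because every application in the same Access team shares an issuer; without it, a token issued for another internal app would pass.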
Collector Lambda Functions
Collectors are packaged by Pants and deployed via CDK (infra/polo/stacks/collectors_stack.py).
Build and Deploy
# Build all collector Lambda zips
pants package components/collectors/...:lambda
# Deploy ClickHouse + collectors to staging
ENV=staging AWS_ACCOUNT_ID=992382409372 LAMBDA_CODE_DIR=dist \
cdk deploy --all --app "python infra/polo/app.py"
# Deploy only ClickHouse
ENV=staging AWS_ACCOUNT_ID=992382409372 \
cdk deploy staging-ClickHouse --app "python infra/polo/app.py"
Cross-Account IAM
Deploy {env}-PoloReadRole to each target account:
ENV=staging AWS_ACCOUNT_ID=468036072962 \
cdk deploy --app "python infra/polo/destination.py"
Collector Lambdas assume this role for read-only access to EC2, EBS, Cost Explorer, CloudWatch, and Tags APIs.
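As an illustration of the assume-role step, the sketch below builds the role ARN from the environment and account ID and exchanges it for temporary credentials via STS. It assumes the role is named exactly `{env}-PoloReadRole` with no IAM path and lives in the standard `aws` partition. The helper names and the session name `polo-collector` are hypothetical, and the real collectors may structure this differently.

```python
def read_role_arn(env: str, account_id: str) -> str:
    """ARN of the {env}-PoloReadRole in a target account (assumes no IAM path)."""
    return f"arn:aws:iam::{account_id}:role/{env}-PoloReadRole"


def assume_read_role(env: str, account_id: str, session_name: str = "polo-collector"):
    """Return a boto3 Session backed by temporary credentials for the read role."""
    import boto3  # imported lazily so read_role_arn works without boto3 installed

    creds = boto3.client("sts").assume_role(
        RoleArn=read_role_arn(env, account_id),
        RoleSessionName=session_name,
    )["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```

A collector would then create its service clients from the returned session, for example `assume_read_role("staging", account_id).client("ec2")`, so that every API call runs with the target account's read-only permissions.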
Scheduling
EventBridge rules trigger each Lambda on its schedule. See collection.md for the schedule table.
Infrastructure as Code
Infrastructure is managed with AWS CDK (Python):
infra/polo/ # Polo v2 infrastructure
├── app.py # CDK app: ClickHouse + collectors
├── destination.py # CDK app: PoloReadRole in target accounts
└── stacks/
├── clickhouse_stack.py # VPC + EC2 ClickHouse + Secrets Manager
├── collectors_stack.py # Collector Lambdas + EventBridge schedules
└── destination_stack.py # Cross-account PoloReadRole IAM
infra/legacy/ # Polo v1 (DynamoDB, Cognito, legacy sync)
├── app.py # CDK app entry point
├── config/ # Environment-aware configuration
├── lambda/ # Auth Lambdas (OAuth, Cognito hooks)
├── scripts/ # Deployment scripts (prices, backfill)
└── stacks/
├── polo_stack.py # Main orchestrator
├── api_stack.py # Cloudflare Worker + CloudFront
├── service_stack.py # Lambda + EventBridge orchestration
├── database_stack.py # DynamoDB tables
├── cognito_stack.py # Cognito user pools
├── eventbridge_stack.py # Event scheduling
├── sync_stack.py # Data sync
├── ci_stack.py # CI/CD pipeline
├── route53_stack.py # DNS records
└── destination_stack.py # Event destinations
Legacy CDK Deployment Compatibility
The monorepo restructure moved existing code from root to components/polo-legacy/. To keep the existing CDK deployment CI working:
- Root tasks.py: shim that adds components/polo-legacy/src to sys.path and delegates invoke deploy/destroy/synth to CDK with updated paths (infra/legacy/app.py)
- Root requirements.dev.txt: shim that delegates to components/polo-legacy/requirements.dev.txt
- CDK entry points (infra/legacy/app.py, ci.py, destination.py): sys.path updated from ./infra to ./infra/legacy
- CDK stacks: STATIC_DIR updated from src/static to components/polo-legacy/src/static
- Deploy/synth/destroy tasks use tailwind + zip directly (SAM build skipped since CDK is the deployer)
These shims should be removed once the CDK stacks are fully migrated or the legacy Lambda deployment is replaced.
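The sys.path part of the tasks.py shim can be sketched as below. This is an assumption about its shape, not the actual file: `add_legacy_src_to_path` is a hypothetical name, and the real shim also delegates the invoke tasks, which is not shown.

```python
import sys
from pathlib import Path


def add_legacy_src_to_path(repo_root: Path) -> Path:
    """Put components/polo-legacy/src at the front of sys.path (idempotent)."""
    legacy_src = (repo_root / "components" / "polo-legacy" / "src").resolve()
    if str(legacy_src) not in sys.path:
        # Front of the list, so legacy modules resolve as they did at the repo root.
        sys.path.insert(0, str(legacy_src))
    return legacy_src
```

In a root-level tasks.py this would be called once at import time, for example `add_legacy_src_to_path(Path(__file__).parent)`, before importing any legacy modules.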
CI/CD
GitHub Actions Workflows
deploy.yaml — CDK deployment triggered on PR/push to main:
- Install CDK + dependencies (pip install -r requirements.dev.txt)
- invoke deploy --env=<branch> (creates branch environment)
- invoke destroy --env=<branch> (tears down on PR close)
deploy-destination.yaml — Destination stack for cross-account sync roles
ci.yml — Polo v2 validation (3 parallel jobs):
- Pants lint + test — ruff, tailor, BUILD files, Python unit tests (excludes schema)
- TypeScript tests — API vitest, UI vitest, UI build
- Schema tests — ClickHouse service container, migrations, schema verification