Polo Legacy — Overview & Architecture
Polo v1 is a serverless infrastructure observability and cost analytics platform for Marqo Cloud, built as a Python/HTMX application running on AWS Lambda + API Gateway.
Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Python 3.12 on AWS Lambda (256 MB, 30s timeout) |
| Web | AWS Lambda Powertools APIGatewayRestResolver (prod), Flask (dev) |
| Rendering | htmxido (Python DSL for HTML generation) + HTMX |
| Styling | Tailwind CSS 3.4 with custom Marqo palette |
| JS | jQuery 3.7, DataTables, Highcharts, GoJS (diagrams), Hyperscript |
| Data Store | DynamoDB (ResourceTable + ReportTable) |
| Sync | Python collectors via Lambda, multi-account AWS API calls |
| Deploy | AWS SAM + CDK, Tailwind CLI for CSS |
| Auth | External auth proxy (/auth/logout redirect) |
High-Level Architecture
                   ┌───────────────────────────────┐
                   │     EventBridge Schedules     │
                   └───┬───┬───────────────┬───┬───┘
                       │   │               │   │
           ┌───────────┘   │               │   └───────────┐
           ▼               ▼               ▼               ▼
      lambda_sync    lambda_report    lambda_cop     lambda_prune
      (collect AWS   (generate cost   (Slack         (purge deleted
       resources)     & action         alerts)        resources)
                      reports)
           │               │
           ▼               ▼
      ┌──────────────────────────┐
      │         DynamoDB         │
      │  - ResourceTable         │
      │  - ReportTable           │
      └──────────┬───────────────┘
                 │
                 ▼
      lambda_function (API Gateway)
                 │
                 ├── /data/*        → resource browser
                 ├── /stats/*       → analytics (mostly stubs)
                 ├── /actions/*     → cleanup recommendations
                 ├── /dashboards/*  → cost reports & series
                 ├── /budget/*      → budget tracking
                 ├── /diagram/*     → infrastructure visualization
                 └── /query/*       → freeform data queries
Lambda Functions
| Function | Trigger | Purpose |
|---|---|---|
| lambda_function | API Gateway (HTTP) | Main web UI: routes, templates, data serving |
| lambda_static | API Gateway (/static) | Serves CSS, JS, images with MIME detection |
| lambda_sync | EventBridge schedule | Collects resources from 16 AWS accounts |
| lambda_report | EventBridge schedule | Generates cost, budget, and action reports |
| lambda_cop | EventBridge schedule | Detects infrastructure issues, posts to Slack |
| lambda_prune | EventBridge schedule | Deletes resources whose deleted_at is more than 1 day old |
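
The prune rule in the last row amounts to a cutoff comparison. The following is a minimal sketch, assuming `deleted_at` is stored as an ISO 8601 UTC timestamp string on each item; the item layout and the `is_prunable` helper are hypothetical and not taken from the Polo code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: resources carry an optional ISO 8601 "deleted_at" string.
RETENTION = timedelta(days=1)


def is_prunable(item: dict, now: datetime) -> bool:
    """Return True if the item was marked deleted more than RETENTION ago."""
    deleted_at = item.get("deleted_at")
    if not deleted_at:
        return False  # never marked deleted, so never pruned
    return now - datetime.fromisoformat(deleted_at) > RETENTION


now = datetime(2025, 6, 21, 12, 0, tzinfo=timezone.utc)
items = [
    {"path": "A#1/I#old", "deleted_at": "2025-06-19T12:00:00+00:00"},
    {"path": "A#1/I#recent", "deleted_at": "2025-06-21T06:00:00+00:00"},
    {"path": "A#1/I#live"},
]
to_prune = [i["path"] for i in items if is_prunable(i, now)]
print(to_prune)
```

Passing `now` in explicitly keeps the predicate deterministic, so the one-day window can be checked without waiting on a real clock.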
AWS Accounts (16)
| Account ID | Name | Environment |
|---|---|---|
| 023568249301 | controller | prod |
| 651774330118 | prod | prod |
| 905418443936 | marqtune_prod | prod |
| 014498650341 | models | prod |
| 468036072962 | staging | dev |
| 339712831429 | preprod | dev |
| 707042731317 | core | dev |
| 424082663841 | open_source | dev |
| 010928202142 | preprod_controller | dev |
| 975050354766 | marqtune | dev |
| 504304931539 | s2ui | dev |
| 992382409372 | polo | dev |
| 311141537729 | escalator | dev |
| 311141526841 | coach | dev |
| 780949682512 | commercial | dev |
| 940994029740 | ml | dev |
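
A sync pass over the accounts above could fan out with a thread pool, which the design notes name as the parallelism mechanism. The sketch below is illustrative only: `collect_account` and `sync_all` are hypothetical names, and the collector returns a stub instead of assuming a role and calling AWS APIs, so that it runs without credentials.

```python
from concurrent.futures import ThreadPoolExecutor

# Three of the accounts from the table above; a full map would list all 16.
ACCOUNTS = {
    "023568249301": "controller",
    "651774330118": "prod",
    "468036072962": "staging",
}


def collect_account(account_id: str) -> dict:
    """Hypothetical per-account collector.

    A real collector would assume a role in the target account and call
    AWS APIs; this stub only returns an empty resource list.
    """
    return {"account_id": account_id, "name": ACCOUNTS[account_id], "resources": []}


def sync_all(account_ids, max_workers: int = 8) -> list:
    """Run one collector per account in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(collect_account, account_ids))


results = sync_all(list(ACCOUNTS))
print([r["name"] for r in results])
```

Threads are a reasonable fit here because the work is dominated by network waits on API calls rather than CPU time, and `pool.map` returns results in input order regardless of which account finishes first.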
Key Design Decisions
- Server-side rendering with HTMX: no client-side framework. All HTML is generated in Python via htmxido, with HTMX handling partial page updates. This keeps the frontend extremely lightweight but tightly couples the UI to the Python backend.
- DynamoDB as primary store: single-table design with `type` (PK) and `path` (SK). Hierarchical paths encode resource relationships (e.g., `A#{aws_id}/V#{vpc_id}/S#{subnet_id}/I#{instance_id}`). Simple, but it limits ad-hoc querying, hence the `/query` page.
- Multi-account sync via role assumption: a single Lambda assumes roles into 16 target AWS accounts, fetching resources in parallel with `ThreadPoolExecutor`.
- Local dev parity: `LocalDataService` reads/writes a `local_data.json` file, mirroring the DynamoDB `DataService` interface. Devs run against cached data without AWS credentials.
- Report pre-computation: cost breakdowns and action recommendations are pre-computed by `lambda_report` and stored in ReportTable, avoiding expensive on-the-fly calculations.
- Cost tracking baseline: a "golden date" (June 21, 2025) serves as the cost baseline. All budget comparisons reference this date. Monthly target: $79,167 (~$2,639/day).
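
The hierarchical path scheme from the DynamoDB decision above can be illustrated with a small builder and parser. This is a sketch under assumptions: the segment prefixes and their order are inferred from the single example in the text, the helper names `build_path` and `parse_path` are hypothetical, and the resource IDs are made up.

```python
# Assumption: segment prefixes follow the example in the text
# (A = account, V = VPC, S = subnet, I = instance), outermost first.
SEGMENT_ORDER = ("A", "V", "S", "I")


def build_path(**ids: str) -> str:
    """Join the given ids into a hierarchical sort key, outermost first."""
    parts = []
    for prefix in SEGMENT_ORDER:
        if prefix not in ids:
            break  # stop at the first missing level
        parts.append(f"{prefix}#{ids[prefix]}")
    return "/".join(parts)


def parse_path(path: str) -> dict:
    """Split a sort key back into {prefix: id} pairs."""
    return dict(segment.split("#", 1) for segment in path.split("/"))


instance = build_path(A="651774330118", V="vpc-0abc", S="subnet-0def", I="i-0123")
vpc = build_path(A="651774330118", V="vpc-0abc")

print(instance)
print(parse_path(instance)["S"])
# Everything under a VPC shares the VPC's path as a prefix, which is what a
# begins_with condition on the sort key would match.
print(instance.startswith(vpc + "/"))
```

Prefix matching answers "everything under this account or VPC" cheaply, but a question such as "all instances of a given size across accounts" has no matching key prefix, which is the ad-hoc querying limit the decision mentions.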