Polo Legacy — Overview & Architecture

Polo v1 is a serverless infrastructure observability and cost analytics platform for Marqo Cloud, built as a Python/HTMX application running on AWS Lambda + API Gateway.

Tech Stack

| Layer | Technology |
| --- | --- |
| Runtime | Python 3.12 on AWS Lambda (256 MB, 30s timeout) |
| Web | AWS Lambda Powertools APIGatewayRestResolver (prod), Flask (dev) |
| Rendering | htmxido (Python DSL for HTML generation) + HTMX |
| Styling | Tailwind CSS 3.4 with custom Marqo palette |
| JS | jQuery 3.7, DataTables, Highcharts, GoJS (diagrams), Hyperscript |
| Data Store | DynamoDB (ResourceTable + ReportTable) |
| Sync | Python collectors via Lambda, multi-account AWS API calls |
| Deploy | AWS SAM + CDK, Tailwind CLI for CSS |
| Auth | External auth proxy (/auth/logout redirect) |
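
To make the Rendering row concrete, here is a minimal sketch of a server-rendered fragment that carries HTMX attributes. It uses plain Python strings rather than htmxido, whose API is not shown in this document; the function name `resource_row` and the `/data/detail/...` URL are hypothetical.

```python
from html import escape


def resource_row(resource_id: str, name: str) -> str:
    """Render one table row whose button swaps in a detail fragment via HTMX."""
    rid = escape(resource_id, quote=True)
    return (
        f'<tr id="row-{rid}">'
        f"<td>{escape(name)}</td>"
        # hx-get fetches a fragment from the server (hypothetical URL);
        # hx-target and hx-swap tell HTMX which element to replace and how.
        f'<td><button hx-get="/data/detail/{rid}" '
        f'hx-target="#row-{rid}" hx-swap="outerHTML">Details</button></td>'
        "</tr>"
    )


print(resource_row("i-123", "web <1>"))
```

The server returns only the HTML for the affected row, so no client-side templating is needed.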

High-Level Architecture

                     ┌───────────────────────────────┐
                     │     EventBridge Schedules     │
                     └───┬───┬───────────────┬───┬───┘
                         │   │               │   │
          ┌──────────────┘   │               │   └──────────────┐
          ▼                  ▼               ▼                  ▼
     lambda_sync       lambda_report    lambda_cop        lambda_prune
    (collect AWS      (generate cost      (Slack         (purge deleted
     resources)          & action         alerts)          resources)
          │              reports)
          │                  │
          ▼                  ▼
      ┌──────────────────────────┐
      │ DynamoDB                 │
      │ - ResourceTable          │
      │ - ReportTable            │
      └────────────┬─────────────┘
                   │
     lambda_function (API Gateway)
     │
     ├── /data/*       → resource browser
     ├── /stats/*      → analytics (mostly stubs)
     ├── /actions/*    → cleanup recommendations
     ├── /dashboards/* → cost reports & series
     ├── /budget/*     → budget tracking
     ├── /diagram/*    → infrastructure visualization
     └── /query/*      → freeform data queries

Lambda Functions

| Function | Trigger | Purpose |
| --- | --- | --- |
| lambda_function | API Gateway (HTTP) | Main web UI: routes, templates, data serving |
| lambda_static | API Gateway (/static) | Serves CSS, JS, images with MIME detection |
| lambda_sync | EventBridge schedule | Collects resources from 16 AWS accounts |
| lambda_report | EventBridge schedule | Generates cost, budget, and action reports |
| lambda_cop | EventBridge schedule | Detects infrastructure issues, posts to Slack |
| lambda_prune | EventBridge schedule | Deletes resources marked deleted_at > 1 day |
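
The prune rule in the last row can be sketched as a filter over stored items. This is an illustration under assumptions, not the actual lambda_prune code: it assumes `deleted_at` is an ISO-8601 string with a UTC offset and that live resources have no `deleted_at` field; the name `select_for_prune` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=1)  # "deleted_at > 1 day" from the table above


def select_for_prune(items, now=None):
    """Return the items whose deleted_at timestamp is older than RETENTION."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    stale = []
    for item in items:
        deleted_at = item.get("deleted_at")
        if deleted_at is None:
            continue  # assumed: a missing deleted_at means the resource is live
        if datetime.fromisoformat(deleted_at) < cutoff:
            stale.append(item)
    return stale
```

Passing `now` explicitly keeps the function deterministic, which makes the cutoff logic easy to check without touching DynamoDB.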

AWS Accounts (16)

| Account ID | Name | Environment |
| --- | --- | --- |
| 023568249301 | controller | prod |
| 651774330118 | prod | prod |
| 905418443936 | marqtune_prod | prod |
| 014498650341 | models | prod |
| 468036072962 | staging | dev |
| 339712831429 | preprod | dev |
| 707042731317 | core | dev |
| 424082663841 | open_source | dev |
| 010928202142 | preprod_controller | dev |
| 975050354766 | marqtune | dev |
| 504304931539 | s2ui | dev |
| 992382409372 | polo | dev |
| 311141537729 | escalator | dev |
| 311141526841 | coach | dev |
| 780949682512 | commercial | dev |
| 940994029740 | ml | dev |
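
Collecting from every account above in a single Lambda invocation is a fan-out problem. The sketch below shows one way to structure it with ThreadPoolExecutor, which the design decisions name as the concurrency mechanism. `collect_account` is a hypothetical stand-in for the role assumption and AWS API calls, which are not shown; `sync_all`, `max_workers=8`, and the per-account error handling are likewise assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# A few of the account IDs from the table above.
ACCOUNTS = {
    "023568249301": "controller",
    "651774330118": "prod",
    "468036072962": "staging",
}


def collect_account(account_id: str) -> list[dict]:
    """Hypothetical stand-in for assume-role plus resource collection."""
    return [{"account": account_id, "type": "placeholder"}]


def sync_all(accounts, collect=collect_account, max_workers=8):
    """Run one collector per account in parallel; one failure does not stop the rest."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(collect, acct): acct for acct in accounts}
        for future in as_completed(futures):
            acct = futures[future]
            try:
                results[acct] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[acct] = exc
    return results, errors
```

Threads suit this workload because the collectors spend most of their time waiting on network calls. Returning errors per account lets a caller store partial results and report which accounts failed.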

Key Design Decisions

  1. Server-side rendering with HTMX — no client-side framework. All HTML is generated in Python via htmxido, with HTMX handling partial page updates. This keeps the frontend extremely lightweight but tightly couples UI to the Python backend.

  2. DynamoDB as primary store — single-table design with type (PK) and path (SK). Hierarchical paths encode resource relationships (e.g., A#{aws_id}/V#{vpc_id}/S#{subnet_id}/I#{instance_id}). Simple but limits ad-hoc querying — hence the /query page.

  3. Multi-account sync via role assumption — a single Lambda assumes roles into 16 target AWS accounts, fetching resources in parallel with ThreadPoolExecutor.

  4. Local dev parity: LocalDataService reads/writes a local_data.json file, mirroring the DynamoDB DataService interface. Devs run against cached data without AWS credentials.

  5. Report pre-computation — cost breakdowns and action recommendations are pre-computed by lambda_report and stored in ReportTable, avoiding expensive on-the-fly calculations.

  6. Cost tracking baseline — a "golden date" (June 21, 2025) serves as the cost baseline. All budget comparisons reference this date. Monthly target: $79,167 (~$2,639/day).
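
The hierarchical path from decision 2 can be illustrated with a small builder and parser. The prefix letters come from the example path in that decision; the helper names are hypothetical, and using the trailing-slash prefix with DynamoDB's `begins_with` key condition to fetch descendants is an assumption about how such keys would typically be queried, not a description of Polo's code.

```python
# Prefix letters taken from the example path in decision 2.
PREFIXES = {"account": "A", "vpc": "V", "subnet": "S", "instance": "I"}


def build_path(*segments: tuple[str, str]) -> str:
    """Join (kind, id) pairs into a hierarchical sort key."""
    return "/".join(f"{PREFIXES[kind]}#{ident}" for kind, ident in segments)


def parse_path(path: str) -> list[tuple[str, str]]:
    """Split a sort key back into (prefix, id) pairs."""
    return [tuple(part.split("#", 1)) for part in path.split("/")]


def children_prefix(parent_path: str) -> str:
    """Prefix shared by every descendant of parent_path."""
    return parent_path + "/"


path = build_path(
    ("account", "023568249301"),
    ("vpc", "vpc-1"),
    ("subnet", "subnet-1"),
    ("instance", "i-1"),
)
print(path)
```

Because the parent's key is a prefix of each child's key, "everything under this VPC" becomes a prefix match, while questions that cut across the hierarchy (for example, all instances of one size) do not map onto the key.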
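
The daily figure in decision 6 is consistent with dividing the monthly target by a 30-day month, as this short check shows. The `budget_variance` helper is a hypothetical illustration of comparing a day's spend against that target; it is not taken from Polo's code.

```python
MONTHLY_TARGET = 79_167  # USD, from decision 6

# Assumes a 30-day month, which reproduces the ~$2,639/day figure.
daily_target = MONTHLY_TARGET / 30


def budget_variance(actual_daily_cost: float) -> float:
    """Fractional over- or under-spend relative to the daily target."""
    return (actual_daily_cost - daily_target) / daily_target


print(round(daily_target))  # 2639
```

A positive variance means the day ran over target; a negative one means it ran under.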