Testing Strategy

Working Methodology

Work in phases. Each phase must be fully tested and verified before proceeding. If any test fails:

1. Read the error message carefully.
2. Identify the root cause (do not guess; inspect logs, query state, print debug output).
3. Fix the root cause (not the symptom).
4. Re-run ALL tests in the current phase to verify the fix didn't break anything.
5. Only then proceed to the next phase.

Never skip tests. Never leave a failing test behind. Never comment out a test to make the suite pass. If a test is wrong, fix the test and document why.

Use the AWS CLI (`aws`) to inspect real resources when debugging AWS integration issues.
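
The gating rule above (run a phase's whole suite, stop at the first failure, never start the next phase on red) can be sketched as a small runner. This is a minimal illustration, not the project's tooling: `run_phases` and the phase names are hypothetical, and the commands passed in would depend on the actual repo layout.

```python
import subprocess
from typing import List, Sequence, Tuple

# Hypothetical shape: (phase name, command that runs ALL tests of that phase).
Phase = Tuple[str, Sequence[str]]


def run_phases(phases: Sequence[Phase]) -> List[str]:
    """Run each phase's full test command in order.

    Returns the names of the phases that passed. Raises RuntimeError at the
    first failing phase, so later phases are never started.
    """
    passed: List[str] = []
    for name, command in phases:
        result = subprocess.run(command)
        if result.returncode != 0:
            raise RuntimeError(
                f"phase {name!r} failed with exit code {result.returncode}; "
                "fix the root cause and re-run the whole phase"
            )
        passed.append(name)
    return passed
```

A caller might pass something like `("unit", [sys.executable, "-m", "pytest", "components/collectors"])` followed by `("integration", [sys.executable, "-m", "pytest", "tests/integration"])`; those exact invocations are assumptions.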

Test Tiers

Tier	What	Requires	Speed
Unit	Mocked AWS, pure logic	Nothing	Fast
Integration	Real ClickHouse (CH) with test data	ClickHouse running in Docker	Medium
Performance	100K+ rows, latency benchmarks	ClickHouse running in Docker	Medium
E2E	Real AWS resources, full pipeline	AWS credentials + ClickHouse	Slow
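
One way to apply the "Requires" column is to work out which tiers can run before invoking the suite. The sketch below is an assumption-laden illustration: `clickhouse_reachable` and `runnable_tiers` are hypothetical helpers, port 8123 is ClickHouse's default HTTP port (the project may use another), and credentials are detected only through environment variables, which misses other sources such as SSO or instance profiles.

```python
import socket
from typing import List, Mapping


def clickhouse_reachable(host: str = "localhost", port: int = 8123,
                         timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the ClickHouse HTTP port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def runnable_tiers(env: Mapping[str, str], ch_up: bool) -> List[str]:
    """Map the environment onto the tier table: unit always, the rest gated."""
    tiers = ["unit"]  # mocked AWS and pure logic need nothing
    if ch_up:
        tiers += ["integration", "performance"]
        # Assumption: credentials come from a profile or a static key pair.
        has_creds = bool(
            env.get("AWS_PROFILE")
            or (env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"))
        )
        if has_creds:
            tiers.append("e2e")  # needs both AWS credentials and ClickHouse
    return tiers
```

Typical use would be `runnable_tiers(os.environ, clickhouse_reachable())`, with the result deciding which test directories to pass to the runner.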

Phase Checkpoints

Phase	Focus	Status	Checkpoint
0	Project bootstrap	Done	ClickHouse running, responds to ping
1	Schema	Done	All tables exist, correct columns, inserts work, MV triggers, dictionary resolves
2	Common library	Done	ARN parser, tag resolver, hierarchy resolver, CH client all pass
3	Collectors	Done	All unit tests (mocked) + integration tests (real CH) green
4	Integration	Done	Full pipeline round-trips, hierarchy rollups, MV correctness
5	Performance	Done	Query latency targets met, batch insert performance verified
6	API	Done	Query generation, routing, auth all pass
7	UI	Done	Component tests pass, `npm run build` succeeds
8	E2E	Done	Real AWS resources detected, hierarchy correct, tags propagate, cleanup verified
9	Actions	Not started	Safety tests, dry-run, suggestions, savings, full action e2e pipeline
10	Multi-account	Not started	Account discovery, cross-account collection, CUR v2, collector_runs
11	Governance	Not started	Budget rules, anomalies, Slack notifications, weekly digest
12	Multi-platform	Not started	GitHub/CF collectors, pricing, daily snapshots, allocation, system status
13	Forecasting	Not started	Forecasts, changelog diffs, allocated costs
14	Deployment	Not started	Worker deploys, SPA loads, all API endpoints respond, auth works

Test Files

Implemented

Unit tests (`components/collectors/`)

Each collector has a test file alongside its handler. Common library tests:

components/collectors/common/test_arn.py
components/collectors/common/test_tag_resolver.py
components/collectors/common/test_hierarchy_resolver.py
components/collectors/common/test_normaliser.py
components/collectors/common/test_models.py

Integration tests (`tests/integration/`)

test_ingestion_pipeline.py — Full pipeline round-trips
test_hierarchy_rollup.py — Hierarchy building and cost rollup
test_materialised_views.py — MV correctness

Performance tests (`tests/performance/`)

test_query_latency.py — Query performance benchmarks
test_ingest_throughput.py — Batch insert performance

E2E tests (`tests/e2e/`)

test_ec2_lifecycle.py — EC2 instance lifecycle detection
test_ebs_hierarchy.py — Instance + volume + snapshot hierarchy chain
test_cleanup_verification.py — No test resources remain after suite

Schema tests (`components/schema/`)

test_schema.py — Schema validation

UI e2e tests (`components/ui/e2e/`)

Playwright tests run against the Vite preview server with mocked API responses (no real backend required).

smoke.spec.ts — 12 smoke tests: app load, navigation to all 5 pages, delta drill-down + back, hierarchy node click + cost trend, theme cycling
screenshots.spec.ts — 14 screenshot tests (7 pages/states x 2 themes): dashboard, delta, delta drill-down, costs, resources, hierarchy, hierarchy with trend
fixtures.ts — Mock API data and shared mockApi() route handler for all endpoints
screenshots/ — 14 captured PNGs for visual reference

Run with: cd components/ui && npm run test:e2e

Planned (for phases 9-14)

components/collectors/action_executor/test_safety.py — Protected tags, production restrictions, minimum age, batch limits
components/collectors/action_executor/test_executor.py — Dry-run makes no API calls, real execution correct
tests/integration/test_delta_decomposition.py — "Why are costs different?" at all three levels
tests/e2e/test_actions.py — Orphaned volume -> suggestion -> preview -> execute -> verify

Performance Targets

Query	Target
Delta decomposition (100K events, 500 resources)	< 500ms
Resource drill-down within a node	< 300ms
Cost-by-customer	< 500ms
Fast-path (denormalised column GROUP BY)	< 200ms
Snapshot lookup	< 50ms
10K batch insert	< 2s
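
A benchmark could check these targets along the following lines. This is a sketch only: the table does not say whether the targets are means or percentiles, so the 95th percentile used here is an assumption, and `p95_latency_ms`, `assert_within_target`, and the keys of `TARGETS_MS` are hypothetical names. `run_query` stands in for whatever callable executes the real query or insert.

```python
import statistics
import time
from typing import Callable

# Targets from the table above, in milliseconds. Key names are illustrative.
TARGETS_MS = {
    "delta_decomposition": 500,
    "resource_drilldown": 300,
    "cost_by_customer": 500,
    "fast_path_group_by": 200,
    "snapshot_lookup": 50,
    "batch_insert_10k": 2000,
}


def p95_latency_ms(run_query: Callable[[], object], repeats: int = 20,
                   warmup: int = 3) -> float:
    """Time run_query repeatedly and return an estimated p95 in milliseconds."""
    for _ in range(warmup):  # warm caches before measuring
        run_query()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples, n=20)[-1]


def assert_within_target(name: str, run_query: Callable[[], object],
                         repeats: int = 20, warmup: int = 3) -> float:
    """Fail if the measured p95 is not below the target for this query."""
    observed = p95_latency_ms(run_query, repeats=repeats, warmup=warmup)
    target = TARGETS_MS[name]
    assert observed < target, f"{name}: p95 {observed:.1f}ms >= target {target}ms"
    return observed
```

Using a percentile rather than a single run keeps one slow outlier from hiding behind a fast average, at the cost of a longer benchmark.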

E2E Resource Rules

All test resources created in AWS must:

Be tagged with polo:test=true AND Name=polo-e2e-test-<timestamp>
Use the cheapest possible options: t3.nano instances (stopped immediately), 1 GiB gp3 volumes
Be cleaned up in a finally block even on test failure
Be covered by a final verification test (test_no_test_resources_remain_after_cleanup)
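
The tagging and cleanup rules above could be wrapped in a context manager so that every test gets them for free. The sketch below assumes `ec2` is a boto3 EC2 client (or a test double with the same `create_volume` / `delete_volume` methods); `e2e_tags` and `temporary_volume` are hypothetical helper names, and using Unix epoch seconds for `<timestamp>` is an assumption.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


def e2e_tags(timestamp: int) -> List[Dict[str, str]]:
    """Build the two mandatory tags for an e2e test resource."""
    return [
        {"Key": "polo:test", "Value": "true"},
        {"Key": "Name", "Value": f"polo-e2e-test-{timestamp}"},
    ]


@contextmanager
def temporary_volume(ec2, availability_zone: str) -> Iterator[str]:
    """Create a 1 GiB gp3 volume tagged for e2e and always delete it."""
    volume_id = None
    try:
        response = ec2.create_volume(
            AvailabilityZone=availability_zone,
            Size=1,  # GiB, the smallest gp3 size
            VolumeType="gp3",
            TagSpecifications=[
                {"ResourceType": "volume", "Tags": e2e_tags(int(time.time()))}
            ],
        )
        volume_id = response["VolumeId"]
        yield volume_id
    finally:
        # Runs on success and on test failure alike.
        if volume_id is not None:
            ec2.delete_volume(VolumeId=volume_id)
```

Tagging at creation time through `TagSpecifications` avoids a window where an untagged resource exists and would be invisible to the leak check. A real test would probably also wait for the volume to become available before deleting it; that step is omitted here.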

Manual verification commands

# Verify AWS credentials
aws sts get-caller-identity

# Check for leaked test resources
aws ec2 describe-instances --filters "Name=tag:polo:test,Values=true" \
    --query 'Reservations[].Instances[].{ID:InstanceId,State:State.Name}'

aws ec2 describe-volumes --filters "Name=tag:polo:test,Values=true" \
    --query 'Volumes[].{ID:VolumeId,State:State}'

aws ec2 describe-snapshots --owner-ids self --filters "Name=tag:polo:test,Values=true" \
    --query 'Snapshots[].{ID:SnapshotId}'
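
The same three checks can be automated inside the final verification test. This is a hedged sketch rather than the project's actual test: `find_leaked_resources` and `assert_no_leaks` are hypothetical names, `ec2` is assumed to be a boto3 EC2 client, pagination is ignored, and instances in the `terminated` state are skipped because they stay visible in `describe_instances` for a while after termination.

```python
from typing import Dict, List

# Same filter as the CLI commands above.
TEST_FILTER = [{"Name": "tag:polo:test", "Values": ["true"]}]


def find_leaked_resources(ec2) -> Dict[str, List[str]]:
    """Return IDs of tagged test resources that still exist."""
    instances = [
        inst["InstanceId"]
        for reservation in ec2.describe_instances(Filters=TEST_FILTER)["Reservations"]
        for inst in reservation["Instances"]
        if inst["State"]["Name"] != "terminated"  # already gone, just still listed
    ]
    volumes = [
        vol["VolumeId"]
        for vol in ec2.describe_volumes(Filters=TEST_FILTER)["Volumes"]
    ]
    snapshots = [
        snap["SnapshotId"]
        for snap in ec2.describe_snapshots(
            OwnerIds=["self"], Filters=TEST_FILTER
        )["Snapshots"]
    ]
    return {"instances": instances, "volumes": volumes, "snapshots": snapshots}


def assert_no_leaks(ec2) -> None:
    """Fail with the leaked IDs in the message so cleanup is easy."""
    leaked = {kind: ids for kind, ids in find_leaked_resources(ec2).items() if ids}
    assert not leaked, f"leaked e2e resources: {leaked}"
```

Putting the leaked IDs in the assertion message means a failing run already names what to delete by hand.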
