Legacy → v2 Gap Analysis
What the legacy system delivers that v2 either doesn't mention or marks as "planned".
Working in legacy, only "planned" in v2
| Legacy Feature | Legacy Status | v2 Status | Risk |
|---|---|---|---|
| Actions/cleanup recommendations (8 detectors: prune clusters, volumes, IPs, NATs; stop notebooks; dev indexes; unroled instances; orphaned instances) | Working | Planned (Phase 9, features/actions.md) | High — this is how the team actually saves money today |
| Budget tracking (golden-date baseline, run-rate vs $79,167/mo target, account selector, usage-type breakdown) | Working page | Planned (features/budget-rules.md) | High — active cost governance depends on this |
| Slack alerting (Cloud Cop) (persistent dev indexes, no-role instances, orphaned instances → Slack webhook) | Working Lambda | Planned as notifications system (features/anomalies.md) | Medium — automated issue surfacing disappears in the gap |
| Interactive infrastructure diagram (GoJS canvas, filter by resource type, detail level, cost overlay) | Working page | Planned as /topology in ui.md | Medium — useful for understanding relationships |
| Cost series / trend charts (Highcharts line charts over time) | Working page | Planned via useCostTrend hook in ui.md | Medium: in v2 this exists as an API endpoint but has no UI |
| SavingsPlan tracking | Collected & displayed | Planned collector (savings_plans) | Medium — needed to compute net cost accurately |
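
The run-rate comparison in the budget tracking row can be sketched as follows. Only the $79,167/mo target comes from the table. The linear extrapolation from month-to-date spend and the function names `projected_month_spend` and `budget_variance` are assumptions for illustration, not a description of how legacy computes its run rate.

```python
import calendar
from datetime import date

# Monthly target quoted in the budget tracking row.
MONTHLY_TARGET_USD = 79_167.0


def projected_month_spend(spend_to_date: float, today: date) -> float:
    """Extrapolate month-to-date spend linearly to the end of the month.

    Assumption: spend accrues evenly per day; legacy may weight days differently.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def budget_variance(spend_to_date: float, today: date,
                    target: float = MONTHLY_TARGET_USD) -> float:
    """Return projected spend minus target; positive means over budget."""
    return projected_month_spend(spend_to_date, today) - target


if __name__ == "__main__":
    # Hypothetical figures: $30,000 spent by June 10 of a 30-day month.
    print(budget_variance(30_000.0, date(2025, 6, 10)))
```

A v2 budget rule would need both inputs (spend to date and the target) before it could reproduce this view, which is why the page and its data source are listed together.
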
Present in legacy, absent or not mentioned in v2 docs
| Legacy Feature | Notes |
|---|---|
| Freeform query builder (/query: SQL textarea, execute, show results) | Not mentioned anywhere in v2. Power-user escape hatch for ad-hoc investigation. |
| Income statement view (Revenue, Gross Cost, Total Savings, Net Cost, Net Profit as a P&L card) | v2 has cost breakdowns but no revenue/profit framing. Legacy presents costs as a business metric, not just infrastructure. |
| Team breakdown cards (DP, CP, Sales & Marketing, Marqtune, OpenSource, AppliedScience with customer/dev split per team) | v2 hierarchy handles generic nodes but doesn't document team-level cost attribution as a first-class view. |
| H/D/M/Y cost display toggle (all costs stored $/hr, rendered as hourly/daily/monthly/yearly via CSS toggle) | Not mentioned. Small but universally used in legacy — people think in monthly cost, not hourly. |
| Pricing reference table (/data/prices: instance type → $/hr lookup) | v2 has no equivalent. Legacy uses INSTANCE_PRICES dict for enrichment and display. |
| Jira ticket links from action pages (each cleanup action links to a tracking ticket) | v2 actions doc has no ticketing integration. Legacy connects recommendations to accountability. |
| Resource CloudWatch metrics (/data/metrics/<key>: per-resource metric graphs) | Not mentioned in v2. Legacy fetches and displays CloudWatch data per resource. |
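
The H/D/M/Y toggle amounts to multiplying a stored $/hr figure by a fixed number of hours. A minimal sketch, assuming a 365-day year and a month of 730 hours (8760 / 12); the doc does not state which multipliers legacy uses, and `HOURS_PER_UNIT` and `display_cost` are hypothetical names.

```python
# Hours represented by each display unit.
# Assumption: 365-day year and month = 8760 / 12 hours; legacy values may differ.
HOURS_PER_UNIT = {
    "H": 1.0,
    "D": 24.0,
    "M": 730.0,
    "Y": 8760.0,
}


def display_cost(cost_per_hour: float, unit: str) -> float:
    """Convert a stored $/hr cost to the requested display unit."""
    if unit not in HOURS_PER_UNIT:
        raise ValueError(f"unknown unit {unit!r}; expected one of H, D, M, Y")
    return cost_per_hour * HOURS_PER_UNIT[unit]


if __name__ == "__main__":
    # Hypothetical resource costing $0.50/hr, shown in each unit.
    for unit in "HDMY":
        print(unit, display_cost(0.5, unit))
```

Because the conversion is pure presentation, v2 could keep storing a single unit and apply the multiplier in the UI layer, as legacy does.
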
Resource types collected in legacy but missing from v2 collector inventory
| Resource Type | Legacy Code | v2 Equivalent |
|---|---|---|
| Load Balancers (Classic/ALB/NLB/Gateway) | LoadBalancerService → type L | Not even planned — absent from all collector lists |
| VPCs | VpcService → type V | Not collected — config_network only covers NATs |
| Subnets | SubnetService → type S | Not collected |
| Cognito Users | UserService → type US | Not collected, not planned |
| Marqo Customer Accounts | AccountService → type AC | Not collected — hierarchy_nodes fills a similar role but doesn't pull from the same source |
| Marqo Indexes | IndexService → type IX | Not collected — hierarchy derives some of this from tags |
| Clusters (K8s) | Derived from instance grouping → type CL | Not collected — hierarchy derives some of this |
| EBS Snapshots | Disabled in legacy but modeled → type SN | Planned in features/actions.md (delete_snapshot) but no collector |
| S3 Buckets | BucketService → type BU | Planned (config_sagemaker list, properties by type) but not implemented |
| SageMaker Notebooks | NotebookService → type NB | Planned (config_sagemaker) but not implemented |
| "Other" catch-all | OtherService → type OT | No equivalent — v2 is strictly typed |
Business logic / enrichment with no v2 equivalent
| Logic | Detail |
|---|---|
| Cloud version detection (v1: tagged legacy, v2: role-based, null: non-cloud) | Used to distinguish Marqo Cloud infrastructure from bare AWS. No equivalent classification in v2. |
| Role inference from instance names/tags (25 roles: inference, vespa-content, control, bastion, metrics, etc.) | Legacy derives functional role per instance. v2 has account_role (prod/dev/staging) but not per-resource functional roles. |
| sum_cost rollup on resources (each resource carries cost of all descendants) | Legacy pre-computes this during enrichment. v2 uses the cost_rollup_daily table instead. This is a different approach, but legacy's per-resource sum_cost was directly available for sorting and display. |
| Resource lifecycle states (running/stopped/terminated/available/deleted + alive property) | Legacy tracks granular state transitions. v2 has lifecycle events via CloudTrail but doesn't surface a unified "current state" in the same way. resource_snapshots serves this role, but its state vocabulary isn't documented. |
| Golden date as hardcoded baseline (June 21, 2025 = lowest-cost day) | v2's budget-rules feature uses dynamic thresholds, but the "compare everything to one known-good day" pattern isn't replicated. May be intentional. |
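
The sum_cost rollup described above can be illustrated with a small recursive pass over a resource tree. This is a sketch, not the legacy enrichment code: the `Resource` class, its fields, and the choice to include a resource's own cost in its `sum_cost` are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    key: str
    cost: float  # the resource's own cost in $/hr
    children: list["Resource"] = field(default_factory=list)
    sum_cost: float = 0.0  # filled in by rollup()


def rollup(resource: Resource) -> float:
    """Set sum_cost on the resource and every descendant; return the subtree total.

    Assumption: sum_cost includes the resource's own cost plus all descendants.
    """
    resource.sum_cost = resource.cost + sum(rollup(c) for c in resource.children)
    return resource.sum_cost


if __name__ == "__main__":
    # Hypothetical tree: a cluster with two instances, one of which has a volume.
    volume = Resource("vol-1", 0.125)
    cluster = Resource("cl-1", 0.0, [
        Resource("i-1", 0.5, [volume]),
        Resource("i-2", 0.25),
    ])
    rollup(cluster)
    print(cluster.sum_cost)
```

With the total stored on each resource, any list can be sorted by subtree cost without a join, which is the convenience the cost_rollup_daily approach would need to match.
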
UX patterns with no v2 equivalent
| Pattern | Detail |
|---|---|
| Lazy-loaded subsections via HTMX | Legacy loads each data section on-demand. v2 React SPA loads differently, but the lesson is: the data page has 16+ subsections and they must not all load at once. |
| DataTables on every resource list (sort, filter, paginate, search) | Legacy gets this free from jQuery DataTables. v2 uses @tanstack/react-table — equivalent capability exists but needs to be wired to every resource list. |
| Resource detail → tags → children → metrics as expandable sections | Legacy has a deep drill-down per resource. v2's planned /resources/$arn needs to replicate this depth. |
| Account-priority dropdown (Total, prod, staging, open_source, controller, preprod, then alpha) | Opinionated ordering so the most important accounts are always first. Easy to overlook. |
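
The account-priority ordering can be expressed as a sort key that ranks the named accounts first. A minimal sketch: the priority list comes from the table above, while reading "then alpha" as alphabetical order for the remaining accounts is an assumption, and `order_accounts` is a hypothetical helper name.

```python
# Priority order taken from the account-priority dropdown row.
ACCOUNT_PRIORITY = ["Total", "prod", "staging", "open_source", "controller", "preprod"]


def order_accounts(names: list[str]) -> list[str]:
    """Return names with priority accounts first, then the rest alphabetically.

    Assumption: "then alpha" means case-insensitive alphabetical order.
    """
    rank = {name: i for i, name in enumerate(ACCOUNT_PRIORITY)}
    unranked = len(ACCOUNT_PRIORITY)
    return sorted(names, key=lambda n: (rank.get(n, unranked), n.lower()))


if __name__ == "__main__":
    # Hypothetical account list in arbitrary order.
    print(order_accounts(["zeta", "staging", "analytics", "Total", "prod"]))
```

The same key function ports directly to a comparator in the v2 React code, so the ordering can live in one shared constant rather than in each dropdown.
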
Summary
The highest-risk gaps are the three features people use daily that work in legacy and are only planned in v2: actions/cleanup, budget tracking, and Slack alerting.
The biggest data coverage gap is Load Balancers — collected in legacy, not even planned in v2.