Skip to main content

Forecasting, Changelog & Daily Snapshots

Status: PLANNED — This feature is not yet implemented. The resource_snapshots_daily and cost_forecasts tables, and the daily_snapshot, forecaster, and changelog collectors do not exist yet. This document is the design specification for future development.

Daily Resource Snapshots

A nightly job copies resource_snapshots into resource_snapshots_daily with the current date. This enables historical comparisons ("what did customer X have on February 15th?"), changelog diffs, and forecasting.

CREATE TABLE polo.resource_snapshots_daily
(
snapshot_date Date,
resource_arn String,
resource_type LowCardinality(String),
resource_id String,
resource_name String,
aws_account_id LowCardinality(String),
aws_region LowCardinality(String),
state LowCardinality(String),
first_seen DateTime64(3),
last_seen DateTime64(3),
marqo_customer LowCardinality(String),
marqo_cluster String,
marqo_index String,
marqo_env LowCardinality(String),
account_role LowCardinality(String),
properties Map(String, String),
tags Map(String, String),
cost_daily_usd Float64,
cost_monthly_usd Float64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(snapshot_date)
ORDER BY (snapshot_date, resource_arn)
TTL snapshot_date + INTERVAL 1 YEAR;

Changelog — "What Changed?"

Diffing two consecutive daily snapshots produces a changelog. This feeds both the UI changelog page and the weekly digest.

Change types

TypeDetection
createdIn today's snapshot but not yesterday's
terminatedIn yesterday's snapshot but not today's
state_changedSame ARN, different state value
cost_changedSame ARN, daily cost changed by >20%
config_changedSame ARN, properties map differs

SQL example

-- Resources created today
SELECT t.resource_arn, t.resource_type, t.resource_name, t.cost_daily_usd
FROM polo.resource_snapshots_daily t
LEFT ANTI JOIN polo.resource_snapshots_daily y
ON t.resource_arn = y.resource_arn AND y.snapshot_date = today() - 1
WHERE t.snapshot_date = today();

-- Resources terminated
SELECT y.resource_arn, y.resource_type, y.resource_name, y.cost_daily_usd
FROM polo.resource_snapshots_daily y
LEFT ANTI JOIN polo.resource_snapshots_daily t
ON y.resource_arn = t.resource_arn AND t.snapshot_date = today()
WHERE y.snapshot_date = today() - 1;

Cost Forecasting

Simple day-of-week-aware projection using the previous 4 weeks. Uses the same day-of-week from prior weeks as the baseline, which handles weekend dips and weekday patterns better than straight linear extrapolation.

-- Projected remaining cost this month for a node
SELECT
avg_daily * dateDiff('day', today(), toLastDayOfMonth(today())) + mtd_actual
AS projected_month_total
FROM (
SELECT avg(cost_usd) AS avg_daily
FROM polo.cost_rollup_daily
WHERE day >= today() - 28
AND toDayOfWeek(day) = toDayOfWeek(today())
AND node_id = {nodeId:String}
) AS proj,
(
SELECT sum(cost_usd) AS mtd_actual
FROM polo.cost_rollup_daily
WHERE day >= toStartOfMonth(today())
AND node_id = {nodeId:String}
) AS mtd;

Forecast accuracy tracking

CREATE TABLE polo.cost_forecasts
(
forecast_date Date, -- when the forecast was made
target_date Date, -- what date/month being forecast
node_id String,
hierarchy LowCardinality(String),
forecast_usd Float64,
actual_usd Float64 DEFAULT 0, -- filled in after target_date passes
method LowCardinality(String) -- 'dow_4week', 'linear', etc.
)
ENGINE = MergeTree()
ORDER BY (forecast_date, node_id, target_date);

After a target date passes, a job fills in actual_usd so the UI can show forecast accuracy over time, building trust in the projections.

UI

  • Forecast page (/costs/forecast): Budget burn-down chart showing actual spend to date + projected remaining, against a budget line. Warning when projected end-of-month exceeds budget.
  • Changelog page (/changelog): Daily diff with filters by change type. Shows created/terminated/changed resources with cost impact.