
Anomaly Detection & Notifications

Status: PLANNED — This feature is not yet implemented. The anomalies and notifications tables, the anomaly_detector and weekly_digest collectors, and the /api/v1/anomalies endpoint do not exist yet. This document is the design specification for future development.

Anomaly Detection

A daily job computes a 7-day rolling baseline (mean and standard deviation, σ) of each hierarchy node's daily cost, then flags any node whose cost today exceeds the baseline mean by more than 2σ.
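A minimal Python sketch of the baseline check described above. The spec does not say whether today's cost is part of the 7-day window, which standard deviation estimator is used, or how a perfectly flat baseline is handled. This sketch assumes the window covers the previous 7 days (today excluded), uses the population standard deviation, and skips nodes whose baseline has zero spread. The function names are hypothetical.

```python
from statistics import mean, pstdev


def deviation_sigma(history, today):
    """Return how many standard deviations `today` sits above the baseline mean.

    `history` holds the previous 7 daily costs (assumption: today excluded).
    Returns None when the baseline has zero spread, since the ratio is undefined.
    """
    if len(history) != 7:
        raise ValueError("expected exactly 7 days of history")
    mu = mean(history)
    sigma = pstdev(history)  # assumption: population stddev; the spec does not specify
    if sigma == 0:
        return None
    return (today - mu) / sigma


def is_anomalous(history, today, threshold=2.0):
    """Flag the node when today's cost is more than `threshold` sigma above the mean."""
    deviation = deviation_sigma(history, today)
    return deviation is not None and deviation > threshold


history = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 100.0]
print(is_anomalous(history, 110.0))  # True: 110 is far above a baseline of ~100
print(is_anomalous(history, 101.0))  # False: within normal variation
```

The returned deviation corresponds to the deviation_sigma column, and the mean and standard deviation to expected_value and stddev, in the polo.anomalies schema.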

Detection runs at customer and cluster granularity (not per-resource — too noisy). Per-resource drill-down is available on demand through the delta decomposition view.

Severity thresholds

Deviation    Severity
2σ - 3σ      info
3σ - 4σ      warning
4σ+          critical
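The thresholds could be mapped to a severity label roughly as follows. The table leaves the boundaries ambiguous (3σ appears in two ranges), so this sketch assumes each boundary value belongs to the higher band and that exactly 2σ is not flagged, matching the "exceeds 2σ" wording. `classify_severity` is a hypothetical name.

```python
def classify_severity(deviation_sigma):
    """Map a deviation (in sigma) to a severity label, or None if below threshold.

    Assumption: boundary values go to the higher band (3.0 -> warning, 4.0 -> critical).
    """
    if deviation_sigma >= 4.0:
        return "critical"
    if deviation_sigma >= 3.0:
        return "warning"
    if deviation_sigma > 2.0:
        return "info"
    return None


for value in (1.9, 2.5, 3.0, 4.2):
    print(value, classify_severity(value))
```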

Schema: polo.anomalies

CREATE TABLE polo.anomalies
(
    anomaly_id       UUID DEFAULT generateUUIDv4(),
    detected_at      DateTime64(3),
    resolved_at      Nullable(DateTime64(3)),
    node_id          String,
    node_type        LowCardinality(String),
    hierarchy        LowCardinality(String),
    metric           LowCardinality(String), -- 'daily_cost_usd'
    expected_value   Float64,
    actual_value     Float64,
    stddev           Float64,
    deviation_sigma  Float64,
    severity         LowCardinality(String),
    message          String,
    notified         UInt8 DEFAULT 0,
    _version         UInt64
)
ENGINE = ReplacingMergeTree(_version)
ORDER BY (anomaly_id);

Notification System

All alerting funnels through a single notification system: anomalies, rule violations, account coverage gaps, collector failures, and the weekly digest.

Schema: polo.notifications

CREATE TABLE polo.notifications
(
    notification_id  UUID DEFAULT generateUUIDv4(),
    created_at       DateTime64(3) DEFAULT now64(3),
    channel          LowCardinality(String), -- 'slack'
    source_type      LowCardinality(String), -- 'anomaly', 'rule_violation', 'account_coverage', 'collector_failure'
    source_id        String,
    recipient        String,
    message          String,
    delivered        UInt8 DEFAULT 0,
    delivered_at     Nullable(DateTime64(3)),
    error            String DEFAULT ''
)
ENGINE = MergeTree()
ORDER BY (created_at, notification_id);

Deduplication

Don't re-notify for the same (source_type, source_id) within a 24-hour cooldown.
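One way the cooldown check might look in Python. This sketch assumes the caller has already loaded the most recent notification time per (source_type, source_id), for example from created_at in polo.notifications. The `should_notify` function and the `last_sent` mapping are hypothetical.

```python
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(hours=24)


def should_notify(source_type, source_id, now, last_sent):
    """Return True when no notification for this source was sent within the cooldown.

    `last_sent` maps (source_type, source_id) to the datetime of the latest notification.
    """
    previous = last_sent.get((source_type, source_id))
    return previous is None or now - previous >= COOLDOWN


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_sent = {("anomaly", "node-1"): now - timedelta(hours=3)}

print(should_notify("anomaly", "node-1", now, last_sent))  # False: sent 3 hours ago
print(should_notify("anomaly", "node-2", now, last_sent))  # True: never notified
```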

Delivery

The Slack webhook URL is stored in AWS Secrets Manager. A Lambda reads undelivered notifications (delivered = 0) and sends them.
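A rough sketch of the delivery loop, under several assumptions: rows arrive as dictionaries with the notification_id, message, and delivered columns; the webhook URL has already been fetched from Secrets Manager (that step is omitted here); and a failed send is recorded in the error column rather than aborting the batch. `post_to_slack` and `deliver_pending` are hypothetical names. The payload is the JSON "text" body that Slack incoming webhooks accept.

```python
import json
import urllib.request


def post_to_slack(webhook_url, text):
    """POST a plain-text message to a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    # urlopen raises on non-2xx responses, which the caller records as an error.
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def deliver_pending(rows, send):
    """Try to send each undelivered row.

    Returns a list of (notification_id, delivered, error) tuples that the caller
    can write back to the notifications table.
    """
    results = []
    for row in rows:
        if row["delivered"]:
            continue  # already sent; nothing to do
        try:
            send(row["message"])
            results.append((row["notification_id"], 1, ""))
        except Exception as exc:  # keep going so one failure does not block the rest
            results.append((row["notification_id"], 0, str(exc)))
    return results
```

In a Lambda handler, `send` could be bound to the real webhook, for example `lambda text: post_to_slack(webhook_url, text)`. Passing it in as a parameter keeps the loop testable without network access.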

Weekly Digest

A weekly job posts to Slack with:

  • Cost delta vs previous week (total + top 5 movers)
  • Counts of resources created and terminated
  • Open rule violations by severity
  • Actions taken + total savings this week
  • Accounts without Polo access
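The first digest item could be computed along these lines. This is a sketch that assumes "top movers" means the nodes with the largest absolute week-over-week cost change, and that per-node weekly totals are already available as dictionaries. Both function names are hypothetical.

```python
def total_delta(this_week, last_week):
    """Total cost change versus the previous week."""
    return sum(this_week.values()) - sum(last_week.values())


def top_movers(this_week, last_week, limit=5):
    """Rank nodes by absolute week-over-week cost change, largest first.

    Nodes missing from one week are treated as costing 0 in that week.
    Ties are broken by node name so the output is deterministic.
    """
    nodes = set(this_week) | set(last_week)
    deltas = {n: this_week.get(n, 0.0) - last_week.get(n, 0.0) for n in nodes}
    return sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))[:limit]


this_week = {"a": 120.0, "b": 50.0, "c": 10.0}
last_week = {"a": 100.0, "b": 80.0, "d": 5.0}

print(total_delta(this_week, last_week))
for node, delta in top_movers(this_week, last_week):
    print(f"{node}: {delta:+.2f}")
```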