Budget Rules & Zero-Based Budgeting
Status: PLANNED — This feature is not yet implemented. The
budget_rules and rule_violations tables, the rule_evaluator collector, and the /api/v1/rules/* endpoints do not exist yet. This document is the design specification for future development.
Every dollar of spend needs a declared justification. The rules engine periodically compares actual cost and resource state against that declared intent and records a violation wherever the two diverge.
Rule Types
| Type | What it checks | Example |
|---|---|---|
| budget | Cost metric ≤ threshold | "Testing accounts total < $500/month" |
| existence | Resource count/state within bounds | "No running instances in staging outside business hours" |
| config | Resource property matches expectations | "No io1/io2 volumes in staging" |
| lifecycle | Resource age within bounds | "No SageMaker notebook running > 8 hours" |
| ratio | Cost proportion within bounds | "NAT cost < 15% of cluster total" |
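To make the five rule types concrete, here is a minimal Python sketch that runs one check per type against a small in-memory snapshot. Everything in it is an assumption for illustration: the field names, the sample resources, treating the staging environment as the budget scope, treating all rows as one cluster, and assuming the existence check runs outside business hours. The real evaluator would presumably query collected data instead.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical snapshot of collected resources; all field names are assumptions.
RESOURCES: List[Row] = [
    {"type": "ec2:instance", "env": "staging", "state": "running",
     "age_hours": 3.0, "monthly_cost": 120.0},
    {"type": "ebs:volume", "env": "staging", "volume_type": "gp3",
     "age_hours": 200.0, "monthly_cost": 40.0},
    {"type": "sagemaker:notebook", "env": "dev", "state": "InService",
     "age_hours": 11.5, "monthly_cost": 90.0},
    {"type": "nat:gateway", "env": "prod", "state": "available",
     "age_hours": 900.0, "monthly_cost": 50.0},
]


def budget_ok(rows: List[Row]) -> bool:
    # budget: cost metric stays at or below the threshold ($500/month here)
    return sum(r["monthly_cost"] for r in rows if r["env"] == "staging") <= 500.0


def existence_ok(rows: List[Row]) -> bool:
    # existence: no running instances in staging (assumes it is off-hours now)
    return not any(r["type"] == "ec2:instance" and r["env"] == "staging"
                   and r["state"] == "running" for r in rows)


def config_ok(rows: List[Row]) -> bool:
    # config: no io1/io2 volumes in staging
    return not any(r["env"] == "staging"
                   and r.get("volume_type") in {"io1", "io2"} for r in rows)


def lifecycle_ok(rows: List[Row]) -> bool:
    # lifecycle: no SageMaker notebook older than 8 hours
    return not any(r["type"] == "sagemaker:notebook"
                   and r["age_hours"] > 8 for r in rows)


def ratio_ok(rows: List[Row]) -> bool:
    # ratio: NAT cost below 15% of the total (all rows stand in for one cluster)
    total = sum(r["monthly_cost"] for r in rows)
    nat = sum(r["monthly_cost"] for r in rows if r["type"] == "nat:gateway")
    return total == 0 or nat / total < 0.15


CHECKS: Dict[str, Callable[[List[Row]], bool]] = {
    "budget": budget_ok,
    "existence": existence_ok,
    "config": config_ok,
    "lifecycle": lifecycle_ok,
    "ratio": ratio_ok,
}


def evaluate_all(rows: List[Row]) -> Dict[str, bool]:
    """Return {rule_type: True if compliant, False if violated}."""
    return {name: check(rows) for name, check in CHECKS.items()}
```

With the sample snapshot, the existence, lifecycle, and ratio checks fail (a running staging instance, an 11.5-hour notebook, and NAT at 50 of 300 total cost), while budget and config pass.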
Default Rules (seeded at migration)
- Staging off-hours — No running instances in testing accounts outside business hours (Melbourne TZ)
- SageMaker notebook limit — No notebook InService for > 8 hours
- Untagged resource cost — No untagged resource costing > $1/day
- Testing account budget — Testing accounts total < $500/month
- NAT cost ratio — NAT gateway cost < 15% of cluster total
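As an illustration of how these defaults might be expressed as seed rows, the Python sketch below encodes each one as a dictionary and validates it. The rule IDs, condition keys, scope filter keys (other than resource_type), severities, and the mapping of each default to a rule type are all assumptions; the specification does not define them. Condition and filter values are kept as strings because the schema stores them as Map(String, String).

```python
from typing import Any, Dict, List

ALLOWED_TYPES = {"budget", "existence", "config", "lifecycle", "ratio"}
ALLOWED_SEVERITIES = {"info", "warning", "critical"}

# Hypothetical seed data: IDs, condition keys, and severities are placeholders.
SEED_RULES: List[Dict[str, Any]] = [
    {"rule_id": "default-staging-off-hours", "rule_name": "Staging off-hours",
     "rule_type": "existence",
     "scope_filters": {"resource_type": "ec2:instance", "state": "running"},
     "condition": {"max_count": "0", "window": "outside_business_hours",
                   "timezone": "Australia/Melbourne"},
     "severity": "warning"},
    {"rule_id": "default-notebook-limit", "rule_name": "SageMaker notebook limit",
     "rule_type": "lifecycle",
     "scope_filters": {"resource_type": "sagemaker:notebook", "state": "InService"},
     "condition": {"max_age_hours": "8"},
     "severity": "warning"},
    {"rule_id": "default-untagged-cost", "rule_name": "Untagged resource cost",
     "rule_type": "budget",
     "scope_filters": {"tagged": "false"},
     "condition": {"metric": "daily_cost", "max_usd": "1"},
     "severity": "info"},
    {"rule_id": "default-testing-budget", "rule_name": "Testing account budget",
     "rule_type": "budget",
     "scope_filters": {"account_kind": "testing"},
     "condition": {"metric": "monthly_cost", "max_usd": "500"},
     "severity": "critical"},
    {"rule_id": "default-nat-ratio", "rule_name": "NAT cost ratio",
     "rule_type": "ratio",
     "scope_filters": {},
     "condition": {"numerator": "nat_gateway_cost",
                   "denominator": "cluster_total_cost", "max_ratio": "0.15"},
     "severity": "warning"},
]


def validate_rule(rule: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the row looks insertable."""
    problems = []
    if rule.get("rule_type") not in ALLOWED_TYPES:
        problems.append("unknown rule_type")
    if rule.get("severity") not in ALLOWED_SEVERITIES:
        problems.append("unknown severity")
    for field in ("scope_filters", "condition"):
        mapping = rule.get(field, {})
        # Map(String, String) columns accept only string keys and values.
        if not all(isinstance(k, str) and isinstance(v, str)
                   for k, v in mapping.items()):
            problems.append(f"{field} must map strings to strings")
    return problems
```

Validating seed rows before insertion is a cheap guard against, for example, a numeric threshold slipping into a string-valued map.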
Schema
budget_rules
CREATE TABLE polo.budget_rules
(
rule_id String,
rule_name String,
rule_type LowCardinality(String),
scope_hierarchy LowCardinality(String), -- 'marqo_logical', 'aws_account', '*'
scope_node_id String, -- 'customer:acme', 'account:222', '*'
scope_filters Map(String, String), -- {'resource_type': 'ec2:instance', 'marqo_env': 'staging'}
condition Map(String, String), -- type-specific condition parameters
severity LowCardinality(String), -- 'info', 'warning', 'critical'
notification_channel String DEFAULT '', -- 'slack', 'email', ''
enabled UInt8 DEFAULT 1,
created_by String,
created_at DateTime64(3),
updated_at DateTime64(3),
_version UInt64
)
ENGINE = ReplacingMergeTree(_version) ORDER BY (rule_id);
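Because the table uses ReplacingMergeTree(_version), an update is written as a new row with the same rule_id and a higher _version, and older rows are only dropped when parts merge in the background. Readers therefore need to deduplicate at query time, for example with ClickHouse's FINAL modifier. The Python sketch below mimics that "highest version wins" behavior in memory; the function name and sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, Any]


def latest_by_version(rows: Iterable[Row], key_fields: Sequence[str]) -> List[Row]:
    """Keep one row per sorting key: the one with the highest _version.

    On a version tie the later row wins, matching insertion order.
    """
    latest: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        current = latest.get(key)
        if current is None or row["_version"] >= current["_version"]:
            latest[key] = row
    return list(latest.values())


def enabled_rules(rows: Iterable[Row]) -> List[Row]:
    """Deduplicate first, then filter, so a disabled rule stays disabled."""
    return [r for r in latest_by_version(rows, ["rule_id"]) if r["enabled"] == 1]


# Hypothetical history: rule "r1" was created enabled, then disabled.
HISTORY: List[Row] = [
    {"rule_id": "r1", "enabled": 1, "_version": 1},
    {"rule_id": "r2", "enabled": 1, "_version": 1},
    {"rule_id": "r1", "enabled": 0, "_version": 2},
]
```

The order of operations matters: filtering on enabled before deduplicating would resurrect the stale version 1 row of r1.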
rule_violations
CREATE TABLE polo.rule_violations
(
violation_id UUID DEFAULT generateUUIDv4(),
rule_id String,
rule_name String,
rule_type LowCardinality(String),
severity LowCardinality(String),
detected_at DateTime64(3),
resolved_at Nullable(DateTime64(3)),
resource_arn String DEFAULT '',
node_id String DEFAULT '',
actual_value String,
threshold_value String,
message String,
notified UInt8 DEFAULT 0,
notified_at Nullable(DateTime64(3)),
_version UInt64
)
ENGINE = ReplacingMergeTree(_version) ORDER BY (rule_id, detected_at, violation_id);
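Under this schema, resolving a violation could mean inserting a copy of the row with the same rule_id, detected_at, and violation_id, a higher _version, and resolved_at set. The Python sketch below shows one possible reconciliation step for a single rule. The Violation class, the reconcile function, and the choice to key open violations by resource_arn are assumptions, not part of the specification.

```python
import uuid
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Violation:
    violation_id: str
    rule_id: str
    resource_arn: str
    detected_at: datetime
    resolved_at: Optional[datetime]
    version: int  # stands in for the _version column


def reconcile(open_rows: Iterable[Violation], breaching_arns: Set[str],
              rule_id: str, now: datetime) -> List[Violation]:
    """Return the rows to insert after one evaluation of a rule.

    New breaches become open violations; open violations whose resource
    no longer breaches are re-emitted as resolved with a bumped version.
    """
    open_by_arn = {v.resource_arn: v for v in open_rows}
    to_insert: List[Violation] = []

    # Breaching now, no open violation yet: create one.
    for arn in sorted(breaching_arns - set(open_by_arn)):
        to_insert.append(Violation(str(uuid.uuid4()), rule_id, arn, now, None, 1))

    # Open violation, no longer breaching: resolve it.
    for arn in sorted(set(open_by_arn) - breaching_arns):
        old = open_by_arn[arn]
        to_insert.append(replace(old, resolved_at=now, version=old.version + 1))

    return to_insert
```

Resources that are still breaching and already have an open violation produce no new row, which keeps repeated evaluations from flooding the table.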
Rule Evaluator
A scheduled collector that reads all enabled rules, executes the appropriate query, compares results against thresholds, creates/resolves violations, and sends notifications.
- Budget/ratio rules: evaluated hourly
- Existence/config/lifecycle rules: evaluated every 15 minutes
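The two cadences above can be sketched as a small scheduling helper in Python. The function name, the shape of the rule dictionaries, and the last_run bookkeeping are hypothetical; only the intervals come from this document.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, List

# Evaluation intervals per rule type, as stated in the specification.
INTERVALS: Dict[str, timedelta] = {
    "budget": timedelta(hours=1),
    "ratio": timedelta(hours=1),
    "existence": timedelta(minutes=15),
    "config": timedelta(minutes=15),
    "lifecycle": timedelta(minutes=15),
}


def due_rule_ids(rules: Iterable[Dict[str, Any]],
                 last_run: Dict[str, datetime],
                 now: datetime) -> List[str]:
    """Return IDs of enabled rules whose interval has elapsed.

    A rule that has never run (no entry in last_run) is always due.
    """
    due: List[str] = []
    for rule in rules:
        if not rule["enabled"]:
            continue
        previous = last_run.get(rule["rule_id"])
        if previous is None or now - previous >= INTERVALS[rule["rule_type"]]:
            due.append(rule["rule_id"])
    return due
```

Running the collector every 15 minutes and letting this helper skip budget and ratio rules until an hour has passed is one way to serve both cadences from a single schedule.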
UI
- Policies page: Rule list with rule builder form (not raw SQL)
- Violations feed: Filtered by severity/scope/rule type
- Compliance score: Per hierarchy node (e.g. "customer acme: 94% compliant, 3 open violations")
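The specification does not define how the compliance score is calculated. One possible definition, shown in the Python sketch below, is the share of rules applying to a node that currently have no open violation. Under that assumption, a node with 50 applicable rules, 3 of which each have one open violation, would score 94%. The function name and inputs are hypothetical.

```python
from typing import Iterable, Tuple


def compliance_score(rule_ids: Iterable[str],
                     open_violation_rule_ids: Iterable[str]) -> Tuple[float, int]:
    """Return (percent of rules without open violations, open violation count).

    rule_ids: rules that apply to the hierarchy node.
    open_violation_rule_ids: one entry per open violation, naming its rule.
    Assumption: a rule counts as non-compliant if it has at least one
    open violation, no matter how many.
    """
    rules = set(rule_ids)
    relevant = [r for r in open_violation_rule_ids if r in rules]
    if not rules:
        # No applicable rules: treat the node as fully compliant.
        return 100.0, 0
    violating = set(relevant)
    percent = 100.0 * (len(rules) - len(violating)) / len(rules)
    return round(percent, 1), len(relevant)
```

Counting rules instead of violations keeps a single noisy rule from dominating the score, at the cost of hiding how many resources that rule affects.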