Aerarium: Cost Tracking Module

Date: 2026-02-16
Status: Plan
Author: Antony

Accurate, persistent token and USD cost tracking for all centurio and legatus API calls, with Telegram reporting commands and budget alerts.

Context

Legio currently tracks only cumulative input_tokens per centurio session (SessionTokenTracker in session.py), solely to decide when to reset a session at 150k tokens. It ignores output tokens (priced at 5× the input rate), cache tokens, and USD cost entirely. There is no cross-session accounting, no per-centurio cost visibility, no daily/monthly totals, and no budget alerting.

The SDK's ResultMessage provides two relevant fields:

usage: dict — raw token breakdown (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens)
total_cost_usd: float | None — populated by Claude Code CLI only, always None in programmatic SDK use. We cannot rely on it, but always store it when present as a cross-check reference.

API cost vs. subscription cost

All cost calculations in Aerarium are API-equivalent cost — what the usage would cost at published per-token rates. This is the only number we can compute from token counts.

Billing model	What Aerarium shows	Real cost
API (pay-per-token)	Actual cost, accurate to the cent	= Aerarium number
Subscription (Pro/Team/Max)	API-equivalent cost	Flat monthly fee ($20-200), unrelated to tokens

For subscription users, Aerarium numbers are still valuable:

Relative cost between centuriones (who is expensive vs. cheap)
Trend tracking (usage growing or shrinking over time)
Capacity planning (would API billing be cheaper than subscription?)
Budget alerts as consumption caps (regardless of billing model)

The billing_mode config setting controls /cost output language:

api mode: 💰 Cost today: $12.34
subscription mode: 💰 API-equivalent today: $12.34

The wording stays honest about what the number means: it never claims subscription users "spent" money they didn't.

Design Decisions: Hard-Coded vs. Config vs. Prompted

Concern	Decision	Rationale
Cost formula	Hard-coded	Pure arithmetic: `Σ(tokens × rate) / 1M`. Deterministic, no judgment needed.
Usage field names	Hard-coded	API schema is stable (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`). Infrastructure code.
Token prices	`legio.toml` with hard-coded defaults	Prices change when new models ship. No Anthropic pricing API exists. Config is the right middle ground — easy to update, version-controlled, defaults protect against missing config.
Model → price mapping	`legio.toml` `[pricing]` table	Caesar may use different models for different centuriones later. Config maps model identifiers to rate tiers.
Budget thresholds	`legio.toml`	`daily_budget_usd`, `monthly_budget_usd` — Caesar sets limits.
Alert delivery	Hard-coded	Telegram message to Caesar. Simple conditional, no LLM.
`/cost` command	Hard-coded handler	Like `/status` and `/history` — structured data, not LLM generation.
Report formatting	Hard-coded templates	Token counts and USD amounts are numbers, not prose.
Persistence	SQLite (new table in praetorium.db)	Survives restarts. Same pattern as existing nuntii table.
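The cost formula in the table above can be sketched as follows. This is a minimal illustration, not the final implementation: it takes a PricingTier directly rather than a model name, and treats missing or None usage fields as zero, which is an assumption about how the SDK reports absent counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingTier:
    """USD per million tokens for one model tier."""

    input_per_mtok: float
    output_per_mtok: float
    cache_write_per_mtok: float
    cache_read_per_mtok: float


def calculate_cost(tier: PricingTier, usage: dict) -> float:
    """Sum tokens x rate over the four usage fields, then divide by 1M."""
    total = (
        (usage.get("input_tokens") or 0) * tier.input_per_mtok
        + (usage.get("output_tokens") or 0) * tier.output_per_mtok
        + (usage.get("cache_creation_input_tokens") or 0) * tier.cache_write_per_mtok
        + (usage.get("cache_read_input_tokens") or 0) * tier.cache_read_per_mtok
    )
    return total / 1_000_000


# Sonnet rates from this plan; token counts are illustrative.
SONNET = PricingTier(3.0, 15.0, 3.75, 0.30)
cost = calculate_cost(SONNET, {"input_tokens": 820_000, "output_tokens": 64_000})
print(f"${cost:.2f}")
```

With these numbers the input side contributes 820,000 × 3 / 1M = $2.46 and the output side 64,000 × 15 / 1M = $0.96.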

Why NOT prompted?

Cost reporting is accounting, not conversation. Routing /cost through Legatus would:

Add 2-5s latency (API round-trip) to an instant database query
Burn tokens to report token burns (ironic)
Produce non-deterministic output for deterministic data

Hard-coded handlers return results in milliseconds at zero API cost — same pattern as /status, /edicta, /history.

Tracking price changes

Anthropic has no pricing API. Prices change at model launches (a few times per year). The strategy has three layers:

Defaults in code. PricingTier dataclass ships with current prices for known models. Works out of the box, zero config.
Override in legio.toml. [pricing.sonnet], [pricing.haiku] etc. Caesar updates when Anthropic changes prices. TOML beats defaults.
Unknown model fallback. If a centurio uses a model not in the pricing table, log a warning and use a configurable fallback model (default: Sonnet — the most commonly used tier).

Current prices (2026-02-15, from Anthropic official pricing):

Model	Input	Output	Cache Write (5m)	Cache Read
Sonnet 4.5 / 4	$3/MTok	$15/MTok	$3.75/MTok	$0.30/MTok
Haiku 4.5	$1/MTok	$5/MTok	$1.25/MTok	$0.10/MTok
Opus 4.5 / 4.6	$5/MTok	$25/MTok	$6.25/MTok	$0.50/MTok
Opus 4.1 / 4	$15/MTok	$75/MTok	$18.75/MTok	$1.50/MTok

Plus: web search at $10/1000 searches (tracked separately if needed).
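A sketch of the three-layer lookup described in this section, under stated assumptions: the names DEFAULT_TIERS and resolve_tier are hypothetical, and the defaults simply mirror the price table above (with "opus" standing for the Opus 4.5 / 4.6 rates).

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("aerarium")


@dataclass(frozen=True)
class PricingTier:
    input_per_mtok: float
    output_per_mtok: float
    cache_write_per_mtok: float
    cache_read_per_mtok: float


# Layer 1: defaults shipped in code.
DEFAULT_TIERS = {
    "sonnet": PricingTier(3.0, 15.0, 3.75, 0.30),
    "haiku": PricingTier(1.0, 5.0, 1.25, 0.10),
    "opus": PricingTier(5.0, 25.0, 6.25, 0.50),
}


def resolve_tier(
    model: str,
    overrides: dict[str, PricingTier],
    fallback: str = "sonnet",
) -> PricingTier:
    """Return the tier for `model`, applying overrides and the fallback."""
    # Layer 2: entries from legio.toml win over the built-in defaults.
    tiers = {**DEFAULT_TIERS, **overrides}
    tier = tiers.get(model)
    if tier is None:
        # Layer 3: unknown model, so warn and price it as the fallback tier.
        logger.warning("Unknown model %r, pricing as %r", model, fallback)
        tier = tiers[fallback]
    return tier


overrides = {"haiku": PricingTier(0.8, 4.0, 1.0, 0.08)}  # hypothetical new price
print(resolve_tier("haiku", overrides).output_per_mtok)
print(resolve_tier("some-future-model", overrides).output_per_mtok)
```

If the configured fallback is itself missing from the table, this sketch raises KeyError; validating that at config load time would be the more defensive choice.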

Domain Vocabulary

Following the Roman naming convention:

Aerarium — the treasury; the cost-tracking module. (The Aerarium Saturni was Rome's state treasury in the Temple of Saturn.)
Stipendium / Stipendia — a single cost record. (Originally: a soldier's pay or military allowance.)

Architecture

New module: `legio/aerarium.py`

PricingTier (dataclass, frozen)
├── input_per_mtok: float
├── output_per_mtok: float
├── cache_write_per_mtok: float
├── cache_read_per_mtok: float

PricingConfig (dataclass, frozen)
├── tiers: dict[str, PricingTier]  # model name → tier
├── fallback: str                   # fallback model name
├── billing_mode: str               # "api" or "subscription"
├── daily_budget_usd: float | None
├── monthly_budget_usd: float | None

Stipendium (dataclass)
├── id: str                         # UUID
├── timestamp: datetime
├── sender: str                     # centurio name or "legatus"
├── model: str                      # e.g. "sonnet"
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float                 # our calculation (API-equivalent)
├── sdk_cost_usd: float | None      # from ResultMessage.total_cost_usd (reference)
├── session_id: str | None

CostSummary (dataclass)
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float
├── request_count: int

BudgetStatus (dataclass)
├── daily_spent: float
├── daily_limit: float | None
├── monthly_spent: float
├── monthly_limit: float | None
├── daily_exceeded: bool
├── monthly_exceeded: bool

Aerarium (class)
├── __init__(db_path: Path, pricing: PricingConfig)
├── async open() → None             # create table if needed
├── async close() → None
├── async record_stipendium(sender, model, usage, sdk_cost_usd?) → Stipendium
├── async get_summary(sender?, since?) → CostSummary
├── async get_breakdown(since?) → dict[str, CostSummary]
├── async get_budget_status() → BudgetStatus
├── calculate_cost(model, usage) → float        # pure function
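To make the mapping from SDK usage keys to Stipendium fields concrete, here is a minimal sketch. The helper build_stipendium is hypothetical, and two choices are assumptions rather than decisions recorded in this plan: timestamps are taken in UTC, and missing or None token counts become zero.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Stipendium:
    id: str
    timestamp: datetime
    sender: str
    model: str
    input_tokens: int
    output_tokens: int
    cache_write_tokens: int
    cache_read_tokens: int
    cost_usd: float
    sdk_cost_usd: float | None = None
    session_id: str | None = None


def build_stipendium(
    sender: str,
    model: str,
    usage: dict,
    cost_usd: float,
    sdk_cost_usd: float | None = None,
    session_id: str | None = None,
) -> Stipendium:
    """Translate the SDK usage dict into one cost record."""
    return Stipendium(
        id=str(uuid.uuid4()),
        timestamp=datetime.now(timezone.utc),  # assumption: store UTC
        sender=sender,
        model=model,
        input_tokens=usage.get("input_tokens") or 0,
        output_tokens=usage.get("output_tokens") or 0,
        # The API's "cache_creation" is what this plan calls a cache write.
        cache_write_tokens=usage.get("cache_creation_input_tokens") or 0,
        cache_read_tokens=usage.get("cache_read_input_tokens") or 0,
        cost_usd=cost_usd,
        sdk_cost_usd=sdk_cost_usd,
        session_id=session_id,
    )


record = build_stipendium(
    "vorenus",
    "sonnet",
    {"input_tokens": 1200, "output_tokens": 300, "cache_creation_input_tokens": 50},
    cost_usd=0.0082875,
)
print(record.cache_write_tokens, record.cache_read_tokens)
```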

SQLite schema (new table in praetorium.db)

```sql
CREATE TABLE IF NOT EXISTS stipendia (
    id TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    sender TEXT NOT NULL,
    model TEXT NOT NULL,
    input_tokens INTEGER NOT NULL DEFAULT 0,
    output_tokens INTEGER NOT NULL DEFAULT 0,
    cache_write_tokens INTEGER NOT NULL DEFAULT 0,
    cache_read_tokens INTEGER NOT NULL DEFAULT 0,
    cost_usd REAL NOT NULL DEFAULT 0.0,
    sdk_cost_usd REAL,
    session_id TEXT
);
CREATE INDEX IF NOT EXISTS idx_stipendia_timestamp ON stipendia(timestamp);
CREATE INDEX IF NOT EXISTS idx_stipendia_sender ON stipendia(sender);
```
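A rough sketch of how get_breakdown could query this table. It uses the synchronous stdlib sqlite3 module and an in-memory database purely for illustration (the planned API is async), and it assumes every timestamp is stored as an ISO-8601 string in one consistent timezone, so that plain string comparison orders rows correctly.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS stipendia (
    id TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    sender TEXT NOT NULL,
    model TEXT NOT NULL,
    input_tokens INTEGER NOT NULL DEFAULT 0,
    output_tokens INTEGER NOT NULL DEFAULT 0,
    cache_write_tokens INTEGER NOT NULL DEFAULT 0,
    cache_read_tokens INTEGER NOT NULL DEFAULT 0,
    cost_usd REAL NOT NULL DEFAULT 0.0,
    sdk_cost_usd REAL,
    session_id TEXT
);
CREATE INDEX IF NOT EXISTS idx_stipendia_timestamp ON stipendia(timestamp);
CREATE INDEX IF NOT EXISTS idx_stipendia_sender ON stipendia(sender);
"""


def get_breakdown(conn: sqlite3.Connection, since: str) -> dict[str, dict]:
    """Per-sender totals for records at or after `since`, costliest first."""
    rows = conn.execute(
        """
        SELECT sender, SUM(input_tokens), SUM(output_tokens),
               SUM(cost_usd), COUNT(*)
        FROM stipendia
        WHERE timestamp >= ?
        GROUP BY sender
        ORDER BY SUM(cost_usd) DESC
        """,
        (since,),
    ).fetchall()
    return {
        sender: {
            "input_tokens": tokens_in,
            "output_tokens": tokens_out,
            "cost_usd": cost,
            "request_count": count,
        }
        for sender, tokens_in, tokens_out, cost, count in rows
    }


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO stipendia (id, timestamp, sender, model,"
    " input_tokens, output_tokens, cost_usd) VALUES (?, ?, ?, ?, ?, ?, ?)",
    [  # illustrative rows, costs computed at the rates in this plan
        ("s1", "2026-02-15T22:00:00+00:00", "vorenus", "sonnet", 500, 50, 0.00225),
        ("s2", "2026-02-16T09:00:00+00:00", "vorenus", "sonnet", 1000, 200, 0.006),
        ("s3", "2026-02-16T10:30:00+00:00", "vorenus", "sonnet", 2000, 100, 0.0075),
        ("s4", "2026-02-16T11:00:00+00:00", "pullo", "haiku", 2000, 100, 0.0025),
    ],
)
today = get_breakdown(conn, "2026-02-16T00:00:00+00:00")
print(today)
```

The index on timestamp keeps the `since` filter cheap as the table grows; the same query with an added `sender = ?` condition would serve /cost <name>.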

Config addition in `legio.toml`

```toml
[pricing]
billing_mode = "api"      # "api" (real cost) or "subscription" (API-equivalent)
fallback = "sonnet"
daily_budget_usd = 50.0
monthly_budget_usd = 500.0

[pricing.sonnet]
input_per_mtok = 3.0
output_per_mtok = 15.0
cache_write_per_mtok = 3.75
cache_read_per_mtok = 0.30

[pricing.haiku]
input_per_mtok = 1.0
output_per_mtok = 5.0
cache_write_per_mtok = 1.25
cache_read_per_mtok = 0.10

# Opus 4.5 / 4.6 rates; Opus 4.1 / 4 is priced higher (see the price table).
[pricing.opus]
input_per_mtok = 5.0
output_per_mtok = 25.0
cache_write_per_mtok = 6.25
cache_read_per_mtok = 0.50
```

Integration Points

1. `session.py` — `collect_response()` (the choke point)

Every SDK response flows through collect_response(). When it sees a ResultMessage, it currently calls tracker.update(msg). We add:

```python
# Inside collect_response(), while iterating over SDK messages:
if isinstance(msg, ResultMessage):
    if tracker is not None:
        tracker.update(msg)  # existing: drives the 150k-token session reset
    if aerarium is not None:
        await aerarium.record_stipendium(
            sender=sender,
            model=model,
            usage=msg.usage,
            sdk_cost_usd=msg.total_cost_usd,  # store when present
        )
```

The aerarium, sender, and model params are threaded down from CenturioSessionManager.dispatch(). This is the only production code path change in session.py.

2. `praetorium.py` — schema extension

Add the stipendia table creation to the existing _SCHEMA string. No new database file — reuse praetorium.db. Consistent with the single-database pattern already established.

3. `config.py` — pricing config loading

Extend load_config() to parse [pricing] from TOML. Add PricingConfig to LegioConfig. Default prices hard-coded in the dataclass, so the system works without any TOML pricing section.
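One way the [pricing] parsing might look, as a sketch. The function name load_pricing is hypothetical, and the input is assumed to be the dict that a TOML parser such as the stdlib tomllib (Python 3.11+) returns for legio.toml. Since scalar settings and per-model sub-tables share the [pricing] namespace, the sketch treats every dict-valued key as a tier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PricingTier:
    input_per_mtok: float
    output_per_mtok: float
    cache_write_per_mtok: float
    cache_read_per_mtok: float


# Defaults hard-coded so the system works with no [pricing] section at all.
DEFAULT_TIERS = {"sonnet": PricingTier(3.0, 15.0, 3.75, 0.30)}


@dataclass(frozen=True)
class PricingConfig:
    tiers: dict[str, PricingTier]
    fallback: str = "sonnet"
    billing_mode: str = "api"
    daily_budget_usd: float | None = None
    monthly_budget_usd: float | None = None


def load_pricing(raw: dict) -> PricingConfig:
    """Build PricingConfig from parsed TOML; missing keys fall back to defaults."""
    section = raw.get("pricing", {})
    tiers = dict(DEFAULT_TIERS)
    for key, value in section.items():
        if isinstance(value, dict):  # sub-table such as [pricing.haiku]
            tiers[key] = PricingTier(**value)
    return PricingConfig(
        tiers=tiers,
        fallback=section.get("fallback", "sonnet"),
        billing_mode=section.get("billing_mode", "api"),
        daily_budget_usd=section.get("daily_budget_usd"),
        monthly_budget_usd=section.get("monthly_budget_usd"),
    )


# Shape of the parsed TOML for a partial config (illustrative values).
parsed = {
    "pricing": {
        "billing_mode": "subscription",
        "daily_budget_usd": 50.0,
        "haiku": {
            "input_per_mtok": 1.0,
            "output_per_mtok": 5.0,
            "cache_write_per_mtok": 1.25,
            "cache_read_per_mtok": 0.10,
        },
    }
}
config = load_pricing(parsed)
print(config.billing_mode, sorted(config.tiers))
```

A sub-table that omits one of the four rates raises TypeError here; whether to reject such a config or merge it with the default tier is left open.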

4. `telegram/commands.py` — new commands

Command	Output
`/cost`	Today + month + all-time totals with token breakdown
`/cost <name>`	Same, filtered to one centurio
`/budget`	Daily/monthly spend vs. thresholds

Format adapts to billing_mode:

API mode (billing_mode = "api"):

💰 Cost Report
━━━━━━━━━━━━━━━━━━
Today:      $12.34  (820k in / 64k out)
This month: $187.50 (12.5M in / 980k out)
All time:   $412.80 (27.5M in / 2.1M out)

Subscription mode (billing_mode = "subscription"):

💰 Usage Report (API-equivalent)
━━━━━━━━━━━━━━━━━━
Today:      ~$12.34  (820k in / 64k out)
This month: ~$187.50 (12.5M in / 980k out)
All time:   ~$412.80 (27.5M in / 2.1M out)
ℹ️ Actual cost: $200/mo subscription

Plain-text with emoji headers, no LLM. Same pattern as /edicta and /history.
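The two report layouts can be produced by one template, sketched below. The helper names fmt_tokens and format_report are hypothetical and the rounding rules for token counts are an assumption. The subscription footer line is left out because this plan does not yet say where the subscription price would come from.

```python
def fmt_tokens(n: int) -> str:
    """Compact token count: 820000 -> '820k', 12500000 -> '12.5M'."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.0f}k"
    return str(n)


def format_report(billing_mode: str, rows: list[tuple[str, float, int, int]]) -> str:
    """rows holds (label, cost_usd, input_tokens, output_tokens) per period."""
    subscription = billing_mode == "subscription"
    title = "💰 Usage Report (API-equivalent)" if subscription else "💰 Cost Report"
    prefix = "~$" if subscription else "$"
    amounts = [f"{prefix}{cost:.2f}" for _, cost, _, _ in rows]
    width = max(len(a) for a in amounts)  # align the token column
    lines = [title, "━" * 18]
    for (label, _, tokens_in, tokens_out), amount in zip(rows, amounts):
        lines.append(
            f"{label + ':':<11} {amount:<{width}} "
            f"({fmt_tokens(tokens_in)} in / {fmt_tokens(tokens_out)} out)"
        )
    return "\n".join(lines)


rows = [
    ("Today", 12.34, 820_000, 64_000),
    ("This month", 187.50, 12_500_000, 980_000),
    ("All time", 412.80, 27_500_000, 2_100_000),
]
print(format_report("api", rows))
print(format_report("subscription", rows))
```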

5. Budget alerts

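After recording a stipendium, check whether daily spend has crossed the threshold. If so, send a one-time Telegram alert. Use a simple in-memory flag (_daily_alert_sent: date | None) that resets when the date changes. Because the flag lives only in memory, a restart after the alert has fired can send it a second time that day.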

⚠️ Daily budget alert
Spent $51.20 of $50.00 limit today.
Top spenders: vorenus ($23.40), pullo ($18.90)
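A minimal sketch of the one-time alert logic, assuming a small helper class (DailyBudgetAlert and format_alert are hypothetical names). Storing the date of the last alert, rather than a boolean, makes the reset implicit: a new day simply no longer matches. Treating "spent equals limit" as crossing the threshold is an assumption.

```python
from __future__ import annotations

from datetime import date


class DailyBudgetAlert:
    """Decides whether a daily budget alert should fire, at most once per day."""

    def __init__(self, daily_limit: float | None) -> None:
        self.daily_limit = daily_limit
        self._daily_alert_sent: date | None = None

    def should_alert(self, daily_spent: float, today: date) -> bool:
        if self.daily_limit is None or daily_spent < self.daily_limit:
            return False  # no budget configured, or still under it
        if self._daily_alert_sent == today:
            return False  # already alerted today
        self._daily_alert_sent = today
        return True


def format_alert(spent: float, limit: float, top: list[tuple[str, float]]) -> str:
    spenders = ", ".join(f"{name} (${cost:.2f})" for name, cost in top)
    return (
        "⚠️ Daily budget alert\n"
        f"Spent ${spent:.2f} of ${limit:.2f} limit today.\n"
        f"Top spenders: {spenders}"
    )


alert = DailyBudgetAlert(daily_limit=50.0)
day = date(2026, 2, 16)
if alert.should_alert(51.20, day):
    print(format_alert(51.20, 50.0, [("vorenus", 23.40), ("pullo", 18.90)]))
```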

File Change Summary

File	Change	Est. LOC
`legio/aerarium.py`	New. Data models, cost calc, SQLite, budget.	~200
`legio/config.py`	Add `PricingConfig`, parse `[pricing]`.	+40
`legio/praetorium.py`	Add `stipendia` table to `_SCHEMA`.	+5
`legio/session.py`	Thread `aerarium` into `collect_response()` and `dispatch()`.	~15 changed
`legio/telegram/commands.py`	Add `/cost` and `/budget` handlers.	+60
`legio/telegram/bot.py`	Register new command handlers.	+5
`legio/errors.py`	Add `AerariumError(LegioError)`.	+3
`legio/__main__.py`	Initialize Aerarium at startup, pass to bot.	+10
`legio.toml`	Add `[pricing]` section with rates.	+15 lines
`tests/test_aerarium.py`	New. Cost calc, persistence, budget, edge cases.	~250
`tests/test_config.py`	Pricing config loading and defaults.	+30
`tests/test_session.py`	Update mocks to thread aerarium.	~20 changed
`tests/test_commands.py`	`/cost` and `/budget` handlers.	+60

What does NOT change

SessionTokenTracker — still needed for session-reset decisions. Different purpose (context window management vs. cost accounting).
Legatus — no routing changes. Cost commands are CommandHandlers.
Centurio prompts — no cost awareness in agents. Pure infrastructure.

Verification

ruff check . && ruff format --check . — clean
pytest --cov --cov-fail-under=100 — 100% including new module
python scripts/check_file_length.py — all files ≤350 LOC
Manual: dispatch centurio → /cost shows non-zero USD
Manual: /cost vorenus shows only that centurio's spend
Manual: set daily_budget_usd = 0.01 → dispatch → budget alert fires
Manual: restart bot → /cost still shows historical data (persistence)
