Aerarium: Cost Tracking Module
Date: 2026-02-16 Status: Plan Author: Antony
Accurate, persistent token and USD cost tracking for all centurio and legatus API calls, with Telegram reporting commands and budget alerts.
Context
Legio currently tracks only cumulative input_tokens per centurio session (SessionTokenTracker in session.py), solely to decide when to reset a session at 150k tokens. It ignores output tokens (5× more expensive than input), cache tokens, and USD cost entirely. There is no cross-session accounting, no per-centurio cost visibility, no daily/monthly totals, and no budget alerting.
The SDK's ResultMessage provides two relevant fields:
- usage (dict): raw token breakdown (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens)
- total_cost_usd (float | None): populated by the Claude Code CLI only, always None in programmatic SDK use. We cannot rely on it, but always store it when present as a cross-check reference.
API cost vs. subscription cost
All cost calculations in Aerarium are API-equivalent cost — what the usage would cost at published per-token rates. This is the only number we can compute from token counts.
| Billing model | What Aerarium shows | Real cost |
|---|---|---|
| API (pay-per-token) | Actual cost, accurate to the cent | = Aerarium number |
| Subscription (Pro/Team/Max) | API-equivalent cost | Flat monthly fee ($20-200), unrelated to tokens |
For subscription users, Aerarium numbers are still valuable:
- Relative cost between centuriones (who is expensive vs. cheap)
- Trend tracking (usage growing or shrinking over time)
- Capacity planning (would API billing be cheaper than subscription?)
- Budget alerts as consumption caps (regardless of billing model)
The billing_mode config setting controls /cost output language:
- api mode: 💰 Cost today: $12.34
- subscription mode: 💰 API-equivalent today: $12.34
Honest about what the number means. Never claims subscription users "spent" money they didn't.
Design Decisions: Hard-Coded vs. Config vs. Prompted
| Concern | Decision | Rationale |
|---|---|---|
| Cost formula | Hard-coded | Pure arithmetic: Σ(tokens × rate) / 1M. Deterministic, no judgment needed. |
| Usage field names | Hard-coded | API schema is stable (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens). Infrastructure code. |
| Token prices | legio.toml with hard-coded defaults | Prices change when new models ship. No Anthropic pricing API exists. Config is the right middle ground — easy to update, version-controlled, defaults protect against missing config. |
| Model → price mapping | legio.toml [pricing] table | Caesar may use different models for different centuriones later. Config maps model identifiers to rate tiers. |
| Budget thresholds | legio.toml | daily_budget_usd, monthly_budget_usd — Caesar sets limits. |
| Alert delivery | Hard-coded | Telegram message to Caesar. Simple conditional, no LLM. |
| /cost command | Hard-coded handler | Like /status and /history: structured data, not LLM generation. |
| Report formatting | Hard-coded templates | Token counts and USD amounts are numbers, not prose. |
| Persistence | SQLite (new table in praetorium.db) | Survives restarts. Same pattern as existing nuntii table. |
Why NOT prompted?
Cost reporting is accounting, not conversation. Routing /cost through Legatus would:
- Add 2-5s latency (API round-trip) to an instant database query
- Burn tokens to report token burns (ironic)
- Produce non-deterministic output for deterministic data
Hard-coded handlers return results in milliseconds at zero API cost — same pattern as /status, /edicta, /history.
Tracking price changes
Anthropic has no pricing API. Prices change at model launches (few times per year). Three-layer strategy:
- Defaults in code. PricingTier dataclass ships with current prices for known models. Works out of the box, zero config.
- Override in legio.toml. [pricing.sonnet], [pricing.haiku] etc. Caesar updates when Anthropic changes prices. TOML beats defaults.
- Unknown model fallback. If a centurio uses a model not in the pricing table, log a warning and use a configurable fallback model (default: Sonnet, the most commonly used tier).
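The three layers can be sketched as follows. This is a minimal illustration, not the final module: build_tiers, resolve_tier, DEFAULT_TIERS, and the logger name are hypothetical, and the sketch assumes the parsed [pricing] table arrives as a plain dict in which sub-tables are tiers and scalars are settings.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("legio.aerarium")  # logger name is an assumption


@dataclass(frozen=True)
class PricingTier:
    """USD per million tokens for one model tier."""

    input_per_mtok: float
    output_per_mtok: float
    cache_write_per_mtok: float
    cache_read_per_mtok: float


# Layer 1: defaults shipped in code (rates as listed in this plan).
DEFAULT_TIERS = {
    "sonnet": PricingTier(3.0, 15.0, 3.75, 0.30),
    "haiku": PricingTier(1.0, 5.0, 1.25, 0.10),
    "opus": PricingTier(5.0, 25.0, 6.25, 0.50),
}


def build_tiers(pricing_table: dict) -> dict[str, PricingTier]:
    """Layer 2: a [pricing.<model>] sub-table replaces the built-in tier."""
    tiers = dict(DEFAULT_TIERS)
    for name, value in pricing_table.items():
        if isinstance(value, dict):  # scalars such as "fallback" are settings
            tiers[name] = PricingTier(**value)  # expects all four rate keys
    return tiers


def resolve_tier(
    tiers: dict[str, PricingTier], model: str, fallback: str = "sonnet"
) -> PricingTier:
    """Layer 3: unknown models log a warning and use the fallback tier."""
    tier = tiers.get(model)
    if tier is None:
        logger.warning("No pricing for model %r, using fallback %r", model, fallback)
        tier = tiers[fallback]
    return tier


# Hypothetical override: only haiku is redefined, the other tiers keep defaults.
tiers = build_tiers(
    {
        "fallback": "sonnet",
        "haiku": {
            "input_per_mtok": 0.8,
            "output_per_mtok": 4.0,
            "cache_write_per_mtok": 1.0,
            "cache_read_per_mtok": 0.08,
        },
    }
)
unknown = resolve_tier(tiers, "mystery-model")  # falls back to the sonnet tier
```

In this sketch an override must supply all four rates. Merging a partial override into the default tier is a possible refinement that the plan does not specify.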
Current prices (2026-02-15, from Anthropic official pricing):
| Model | Input | Output | Cache Write (5m) | Cache Read |
|---|---|---|---|---|
| Sonnet 4.5 / 4 | $3/MTok | $15/MTok | $3.75/MTok | $0.30/MTok |
| Haiku 4.5 | $1/MTok | $5/MTok | $1.25/MTok | $0.10/MTok |
| Opus 4.5 / 4.6 | $5/MTok | $25/MTok | $6.25/MTok | $0.50/MTok |
| Opus 4.1 / 4 | $15/MTok | $75/MTok | $18.75/MTok | $1.50/MTok |
Plus: web search at $10/1000 searches (tracked separately if needed).
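As a worked example of the cost formula Σ(tokens × rate) / 1M, the sketch below applies the Sonnet row to a made-up usage dict. The names SONNET_RATES and calculate_cost are illustrative (the planned method also takes a model name), and the token counts are invented for the arithmetic.

```python
# USD per million tokens, keyed by the usage field each rate applies to.
SONNET_RATES = {
    "input_tokens": 3.00,
    "output_tokens": 15.00,
    "cache_creation_input_tokens": 3.75,
    "cache_read_input_tokens": 0.30,
}


def calculate_cost(usage: dict, rates: dict[str, float]) -> float:
    """Sum tokens x rate over all four usage fields, then scale to USD."""
    # Missing or None fields count as zero tokens.
    total = sum((usage.get(field) or 0) * rate for field, rate in rates.items())
    return total / 1_000_000


# Invented usage: 100k input, 20k output, 40k cache write, 500k cache read.
usage = {
    "input_tokens": 100_000,
    "output_tokens": 20_000,
    "cache_creation_input_tokens": 40_000,
    "cache_read_input_tokens": 500_000,
}
cost = calculate_cost(usage, SONNET_RATES)
# 0.30 (input) + 0.30 (output) + 0.15 (cache write) + 0.15 (cache read)
print(f"${cost:.2f}")  # prints $0.90
```

Note that 20k output tokens cost as much as 100k input tokens here, which is why tracking input_tokens alone understates spend.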
Domain Vocabulary
Following the Roman naming convention:
- Aerarium — the treasury; the cost-tracking module. (The Aerarium Saturni was Rome's state treasury in the Temple of Saturn.)
- Stipendium / Stipendia — a single cost record. (Originally: a soldier's pay or military allowance.)
Architecture
New module: legio/aerarium.py
PricingTier (dataclass, frozen)
├── input_per_mtok: float
├── output_per_mtok: float
├── cache_write_per_mtok: float
├── cache_read_per_mtok: float
PricingConfig (dataclass, frozen)
├── tiers: dict[str, PricingTier] # model name → tier
├── fallback: str # fallback model name
├── billing_mode: str # "api" or "subscription"
├── daily_budget_usd: float | None
├── monthly_budget_usd: float | None
Stipendium (dataclass)
├── id: str # UUID
├── timestamp: datetime
├── sender: str # centurio name or "legatus"
├── model: str # e.g. "sonnet"
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float # our calculation (API-equivalent)
├── sdk_cost_usd: float | None # from ResultMessage.total_cost_usd (reference)
├── session_id: str | None
CostSummary (dataclass)
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float
├── request_count: int
BudgetStatus (dataclass)
├── daily_spent: float
├── daily_limit: float | None
├── monthly_spent: float
├── monthly_limit: float | None
├── daily_exceeded: bool
├── monthly_exceeded: bool
Aerarium (class)
├── __init__(db_path: Path, pricing: PricingConfig)
├── async open() → None # create table if needed
├── async close() → None
├── async record_stipendium(sender, model, usage, sdk_cost_usd?) → Stipendium
├── async get_summary(sender?, since?) → CostSummary
├── async get_breakdown(since?) → dict[str, CostSummary]
├── async get_budget_status() → BudgetStatus
├── calculate_cost(model, usage_dict) → float # pure function
SQLite schema (new table in praetorium.db)
CREATE TABLE IF NOT EXISTS stipendia (
id TEXT PRIMARY KEY,
timestamp TEXT NOT NULL,
sender TEXT NOT NULL,
model TEXT NOT NULL,
input_tokens INTEGER NOT NULL DEFAULT 0,
output_tokens INTEGER NOT NULL DEFAULT 0,
cache_write_tokens INTEGER NOT NULL DEFAULT 0,
cache_read_tokens INTEGER NOT NULL DEFAULT 0,
cost_usd REAL NOT NULL DEFAULT 0.0,
sdk_cost_usd REAL,
session_id TEXT
);
CREATE INDEX IF NOT EXISTS idx_stipendia_timestamp ON stipendia(timestamp);
CREATE INDEX IF NOT EXISTS idx_stipendia_sender ON stipendia(sender);
Config addition in legio.toml
[pricing]
billing_mode = "api" # "api" (real cost) or "subscription" (API-equivalent)
fallback = "sonnet"
daily_budget_usd = 50.0
monthly_budget_usd = 500.0
[pricing.sonnet]
input_per_mtok = 3.0
output_per_mtok = 15.0
cache_write_per_mtok = 3.75
cache_read_per_mtok = 0.30
[pricing.haiku]
input_per_mtok = 1.0
output_per_mtok = 5.0
cache_write_per_mtok = 1.25
cache_read_per_mtok = 0.10
[pricing.opus]
input_per_mtok = 5.0
output_per_mtok = 25.0
cache_write_per_mtok = 6.25
cache_read_per_mtok = 0.50
Integration Points
1. session.py — collect_response() (the choke point)
Every SDK response flows through collect_response(). When it sees a ResultMessage, it currently calls tracker.update(msg). We add:
if isinstance(msg, ResultMessage):
    if tracker is not None:
        tracker.update(msg)
    if aerarium is not None:
        await aerarium.record_stipendium(
            sender=sender,
            model=model,
            usage=msg.usage,
            sdk_cost_usd=msg.total_cost_usd,  # store when present
        )
The aerarium, sender, and model params are threaded down from CenturioSessionManager.dispatch(). This is the only production code path change in session.py.
2. praetorium.py — schema extension
Add the stipendia table creation to the existing _SCHEMA string. No new database file — reuse praetorium.db. Consistent with the single-database pattern already established.
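With the table in place, the reporting side reduces to plain aggregate queries. The sketch below is an illustration only: it uses the synchronous stdlib sqlite3 module against an in-memory database rather than the async interface planned for Aerarium, trims the table to the columns the query needs, and assumes timestamps are stored as ISO 8601 strings with a uniform UTC offset so that string comparison matches time order. The rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed copy of the stipendia table: only the columns this query touches.
conn.execute(
    "CREATE TABLE stipendia ("
    "id TEXT PRIMARY KEY, timestamp TEXT NOT NULL, sender TEXT NOT NULL, "
    "input_tokens INTEGER NOT NULL DEFAULT 0, "
    "output_tokens INTEGER NOT NULL DEFAULT 0, "
    "cost_usd REAL NOT NULL DEFAULT 0.0)"
)
rows = [
    ("a1", "2026-02-15T23:50:00+00:00", "vorenus", 1000, 200, 0.5),
    ("a2", "2026-02-16T08:00:00+00:00", "vorenus", 4000, 800, 2.0),
    ("a3", "2026-02-16T09:30:00+00:00", "pullo", 2000, 100, 1.0),
]
conn.executemany("INSERT INTO stipendia VALUES (?, ?, ?, ?, ?, ?)", rows)


def get_breakdown(conn: sqlite3.Connection, since: str) -> dict[str, dict]:
    """Per-sender totals for records at or after `since`, costliest first."""
    cur = conn.execute(
        "SELECT sender, SUM(input_tokens), SUM(output_tokens), "
        "SUM(cost_usd), COUNT(*) "
        "FROM stipendia WHERE timestamp >= ? "
        "GROUP BY sender ORDER BY SUM(cost_usd) DESC",
        (since,),
    )
    return {
        sender: {
            "input_tokens": tokens_in,
            "output_tokens": tokens_out,
            "cost_usd": cost,
            "request_count": count,
        }
        for sender, tokens_in, tokens_out, cost, count in cur
    }


# Everything since midnight UTC on 2026-02-16; row a1 falls outside the window.
today = get_breakdown(conn, "2026-02-16T00:00:00+00:00")
```

The timestamp index in the schema is what keeps the WHERE clause cheap as the table grows.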
3. config.py — pricing config loading
Extend load_config() to parse [pricing] from TOML. Add PricingConfig to LegioConfig. Default prices hard-coded in the dataclass, so the system works without any TOML pricing section.
4. telegram/commands.py — new commands
| Command | Output |
|---|---|
| /cost | Today + month + all-time totals with token breakdown |
| /cost <name> | Same, filtered to one centurio |
| /budget | Daily/monthly spend vs. thresholds |
Format adapts to billing_mode:
API mode (billing_mode = "api"):
💰 Cost Report
━━━━━━━━━━━━━━━━━━
Today: $12.34 (820k in / 64k out)
This month: $187.50 (12.5M in / 980k out)
All time: $412.80 (27.5M in / 2.1M out)
Subscription mode (billing_mode = "subscription"):
💰 Usage Report (API-equivalent)
━━━━━━━━━━━━━━━━━━
Today: ~$12.34 (820k in / 64k out)
This month: ~$187.50 (12.5M in / 980k out)
All time: ~$412.80 (27.5M in / 2.1M out)
ℹ️ Actual cost: $200/mo subscription
Plain-text with emoji headers, no LLM. Same pattern as /edicta and /history.
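A template along these lines would produce both variants. The helper names format_tokens and format_cost_report are hypothetical, the rounding rules (whole thousands, one decimal for millions) are inferred from the samples, and the subscription footer is omitted because the plan does not say where the subscription price comes from.

```python
def format_tokens(n: int) -> str:
    """Compact token count: 820000 -> '820k', 12500000 -> '12.5M'."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n // 1_000}k"  # floor to whole thousands
    return str(n)


def format_cost_report(periods: list[tuple[str, float, int, int]], billing_mode: str) -> str:
    """Render (label, cost_usd, input_tokens, output_tokens) rows as plain text."""
    subscription = billing_mode == "subscription"
    title = "💰 Usage Report (API-equivalent)" if subscription else "💰 Cost Report"
    approx = "~" if subscription else ""  # mark API-equivalent amounts
    lines = [title, "━" * 18]
    for label, cost, tokens_in, tokens_out in periods:
        lines.append(
            f"{label}: {approx}${cost:.2f} "
            f"({format_tokens(tokens_in)} in / {format_tokens(tokens_out)} out)"
        )
    return "\n".join(lines)


periods = [
    ("Today", 12.34, 820_000, 64_000),
    ("This month", 187.50, 12_500_000, 980_000),
]
api_report = format_cost_report(periods, "api")
subscription_report = format_cost_report(periods, "subscription")
```

Keeping the formatter a pure function of its inputs makes the /cost handler straightforward to cover in tests/test_commands.py.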
5. Budget alerts
After recording a stipendium, check if daily spend crossed the threshold. If so, send a one-time Telegram alert. Use a simple in-memory flag (_daily_alert_sent: date | None) reset when the date changes.
⚠️ Daily budget alert
Spent $51.20 of $50.00 limit today.
Top spenders: vorenus ($23.40), pullo ($18.90)
File Change Summary
| File | Change | Est. LOC |
|---|---|---|
| legio/aerarium.py | New. Data models, cost calc, SQLite, budget. | ~200 |
| legio/config.py | Add PricingConfig, parse [pricing]. | +40 |
| legio/praetorium.py | Add stipendia table to _SCHEMA. | +5 |
| legio/session.py | Thread aerarium into collect_response() and dispatch(). | ~15 changed |
| legio/telegram/commands.py | Add /cost and /budget handlers. | +60 |
| legio/telegram/bot.py | Register new command handlers. | +5 |
| legio/errors.py | Add AerariumError(LegioError). | +3 |
| legio/__main__.py | Initialize Aerarium at startup, pass to bot. | +10 |
| legio.toml | Add [pricing] section with rates. | +15 lines |
| tests/test_aerarium.py | New. Cost calc, persistence, budget, edge cases. | ~250 |
| tests/test_config.py | Pricing config loading and defaults. | +30 |
| tests/test_session.py | Update mocks to thread aerarium. | ~20 changed |
| tests/test_commands.py | /cost and /budget handlers. | +60 |
What does NOT change
- SessionTokenTracker: still needed for session-reset decisions. Different purpose (context window management vs. cost accounting).
- Legatus: no routing changes. Cost commands are CommandHandlers.
- Centurio prompts: no cost awareness in agents. Pure infrastructure.
Verification
- Lint: ruff check . && ruff format --check . is clean
- Coverage: pytest --cov --cov-fail-under=100 passes at 100%, including the new module
- File length: python scripts/check_file_length.py reports all files ≤350 LOC
- Manual: dispatch centurio → /cost shows non-zero USD
- Manual: /cost vorenus shows only that centurio's spend
- Manual: set daily_budget_usd = 0.01 → dispatch → budget alert fires
- Manual: restart bot → /cost still shows historical data (persistence)