Aerarium: Cost Tracking Module

Date: 2026-02-16
Status: Plan
Author: Antony

Accurate, persistent token and USD cost tracking for all centurio and legatus API calls, with Telegram reporting commands and budget alerts.


Context

Legio currently tracks only cumulative input_tokens per centurio session (SessionTokenTracker in session.py), solely to decide when to reset a session at 150k tokens. It ignores output tokens (priced at 5× the input rate), cache tokens, and USD cost entirely. There is no cross-session accounting, no per-centurio cost visibility, no daily or monthly totals, and no budget alerting.

The SDK's ResultMessage provides two relevant fields:

  • usage: dict — raw token breakdown (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens)
  • total_cost_usd: float | None — populated by Claude Code CLI only, always None in programmatic SDK use. We cannot rely on it, but always store it when present as a cross-check reference.
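A minimal sketch of reading those four counters defensively. It assumes the usage dict may omit keys or carry None values, which the plan does not state; TokenCounts and parse_usage are hypothetical names for illustration and are not part of the SDK.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class TokenCounts:
    """Hypothetical container for the four counters in ResultMessage.usage."""

    input_tokens: int = 0
    output_tokens: int = 0
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0


def parse_usage(usage: Mapping[str, Any] | None) -> TokenCounts:
    """Map raw usage keys to TokenCounts, treating missing or None values as 0."""
    raw = usage or {}

    def count(key: str) -> int:
        return int(raw.get(key) or 0)

    return TokenCounts(
        input_tokens=count("input_tokens"),
        output_tokens=count("output_tokens"),
        # The API names the cache counters differently from the Stipendium fields.
        cache_write_tokens=count("cache_creation_input_tokens"),
        cache_read_tokens=count("cache_read_input_tokens"),
    )
```

Normalizing once at the boundary keeps the cost arithmetic and the database layer free of None checks.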

API cost vs. subscription cost

All cost calculations in Aerarium are API-equivalent cost — what the usage would cost at published per-token rates. This is the only number we can compute from token counts.

| Billing model | What Aerarium shows | Real cost |
|---|---|---|
| API (pay-per-token) | Actual cost, accurate to the cent | = Aerarium number |
| Subscription (Pro/Team/Max) | API-equivalent cost | Flat monthly fee ($20-200), unrelated to tokens |

For subscription users, Aerarium numbers are still valuable:

  • Relative cost between centuriones (who is expensive vs. cheap)
  • Trend tracking (usage growing or shrinking over time)
  • Capacity planning (would API billing be cheaper than subscription?)
  • Budget alerts as consumption caps (regardless of billing model)

The billing_mode config setting controls /cost output language:

  • api mode: 💰 Cost today: $12.34
  • subscription mode: 💰 API-equivalent today: $12.34

The wording stays honest about what the number means: it never claims that subscription users "spent" money they did not.


Design Decisions: Hard-Coded vs. Config vs. Prompted

| Concern | Decision | Rationale |
|---|---|---|
| Cost formula | Hard-coded | Pure arithmetic: Σ(tokens × rate) / 1M. Deterministic, no judgment needed. |
| Usage field names | Hard-coded | API schema is stable (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens). Infrastructure code. |
| Token prices | legio.toml with hard-coded defaults | Prices change when new models ship. No Anthropic pricing API exists. Config is the right middle ground: easy to update, version-controlled, and the defaults protect against missing config. |
| Model → price mapping | legio.toml [pricing] table | Caesar may use different models for different centuriones later. Config maps model identifiers to rate tiers. |
| Budget thresholds | legio.toml | daily_budget_usd, monthly_budget_usd: Caesar sets the limits. |
| Alert delivery | Hard-coded | Telegram message to Caesar. Simple conditional, no LLM. |
| /cost command | Hard-coded handler | Like /status and /history: structured data, not LLM generation. |
| Report formatting | Hard-coded templates | Token counts and USD amounts are numbers, not prose. |
| Persistence | SQLite (new table in praetorium.db) | Survives restarts. Same pattern as existing nuntii table. |

Why NOT prompted?

Cost reporting is accounting, not conversation. Routing /cost through Legatus would:

  • Add 2-5s latency (API round-trip) to an instant database query
  • Burn tokens to report token burns (ironic)
  • Produce non-deterministic output for deterministic data

Hard-coded handlers return results in milliseconds at zero API cost — same pattern as /status, /edicta, /history.

Tracking price changes

Anthropic has no pricing API. Prices change at model launches (a few times per year). Aerarium handles this with a three-layer strategy:

  1. Defaults in code. PricingTier dataclass ships with current prices for known models. Works out of the box, zero config.
  2. Override in legio.toml. [pricing.sonnet], [pricing.haiku], etc. Caesar updates these when Anthropic changes prices. Values in TOML take precedence over the defaults.
  3. Unknown model fallback. If a centurio uses a model not in the pricing table, log a warning and use a configurable fallback model (default: Sonnet — the most commonly used tier).
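The three layers can be sketched as follows. This is an illustration under assumptions, not the planned implementation: DEFAULT_TIERS, build_tiers, and resolve_tier are hypothetical names, and the logger name is made up. The rates are the ones listed in this plan.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

logger = logging.getLogger("legio.aerarium")  # hypothetical logger name


@dataclass(frozen=True)
class PricingTier:
    input_per_mtok: float
    output_per_mtok: float
    cache_write_per_mtok: float
    cache_read_per_mtok: float


# Layer 1: defaults shipped in code.
DEFAULT_TIERS: dict[str, PricingTier] = {
    "sonnet": PricingTier(3.0, 15.0, 3.75, 0.30),
    "haiku": PricingTier(1.0, 5.0, 1.25, 0.10),
    "opus": PricingTier(5.0, 25.0, 6.25, 0.50),
}


def build_tiers(overrides: dict[str, PricingTier]) -> dict[str, PricingTier]:
    """Layer 2: tiers parsed from legio.toml replace same-named defaults."""
    return {**DEFAULT_TIERS, **overrides}


def resolve_tier(
    tiers: dict[str, PricingTier], model: str, fallback: str = "sonnet"
) -> PricingTier:
    """Layer 3: unknown models log a warning and use the fallback tier."""
    tier = tiers.get(model)
    if tier is None:
        logger.warning("No pricing for model %r; using fallback %r", model, fallback)
        tier = tiers[fallback]
    return tier
```

Because the override is a whole-tier replacement, a partial [pricing.sonnet] table would need separate handling; the sketch leaves that decision open.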

Current prices (2026-02-15, from Anthropic official pricing):

| Model | Input | Output | Cache Write (5m) | Cache Read |
|---|---|---|---|---|
| Sonnet 4.5 / 4 | $3/MTok | $15/MTok | $3.75/MTok | $0.30/MTok |
| Haiku 4.5 | $1/MTok | $5/MTok | $1.25/MTok | $0.10/MTok |
| Opus 4.5 / 4.6 | $5/MTok | $25/MTok | $6.25/MTok | $0.50/MTok |
| Opus 4.1 / 4 | $15/MTok | $75/MTok | $18.75/MTok | $1.50/MTok |

In addition, web search is billed at $10 per 1,000 searches (tracked separately if needed).
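As a worked instance of the cost formula Σ(tokens × rate) / 1M, the sketch below prices one hypothetical usage record at the Sonnet rates listed above. The function name api_equivalent_cost and the token counts are made up for illustration; the planned calculate_cost(model, usage_dict) takes a model name rather than a rate table.

```python
from __future__ import annotations

# Sonnet rates in USD per million tokens, keyed by the API usage field names.
SONNET_RATES = {
    "input_tokens": 3.0,
    "output_tokens": 15.0,
    "cache_creation_input_tokens": 3.75,
    "cache_read_input_tokens": 0.30,
}


def api_equivalent_cost(rates: dict[str, float], usage: dict[str, int]) -> float:
    """Return sum(tokens * rate) / 1_000_000 over the four token classes."""
    total = sum(usage.get(field, 0) * rate for field, rate in rates.items())
    return total / 1_000_000


# Illustrative usage record (numbers are invented for the example).
usage = {
    "input_tokens": 820_000,  # 820k * $3.00    = $2.46
    "output_tokens": 64_000,  # 64k * $15.00    = $0.96
    "cache_creation_input_tokens": 200_000,  # 200k * $3.75 = $0.75
    "cache_read_input_tokens": 1_000_000,  # 1M * $0.30     = $0.30
}
cost = api_equivalent_cost(SONNET_RATES, usage)
print(f"${cost:.2f}")  # prints $4.47
```

Note how the 64k output tokens contribute more than a third of what the 820k input tokens do, which is why tracking input tokens alone understates cost.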


Domain Vocabulary

Following the Roman naming convention:

  • Aerarium — the treasury; the cost-tracking module. (The Aerarium Saturni was Rome's state treasury in the Temple of Saturn.)
  • Stipendium (plural: stipendia): a single cost record. (Originally: a soldier's pay or military allowance.)

Architecture

New module: legio/aerarium.py

PricingTier (dataclass, frozen)
├── input_per_mtok: float
├── output_per_mtok: float
├── cache_write_per_mtok: float
├── cache_read_per_mtok: float

PricingConfig (dataclass, frozen)
├── tiers: dict[str, PricingTier]  # model name → tier
├── fallback: str                   # fallback model name
├── billing_mode: str               # "api" or "subscription"
├── daily_budget_usd: float | None
├── monthly_budget_usd: float | None

Stipendium (dataclass)
├── id: str                         # UUID
├── timestamp: datetime
├── sender: str                     # centurio name or "legatus"
├── model: str                      # e.g. "sonnet"
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float                 # our calculation (API-equivalent)
├── sdk_cost_usd: float | None      # from ResultMessage.total_cost_usd (reference)
├── session_id: str | None

CostSummary (dataclass)
├── input_tokens: int
├── output_tokens: int
├── cache_write_tokens: int
├── cache_read_tokens: int
├── cost_usd: float
├── request_count: int

BudgetStatus (dataclass)
├── daily_spent: float
├── daily_limit: float | None
├── monthly_spent: float
├── monthly_limit: float | None
├── daily_exceeded: bool
├── monthly_exceeded: bool

Aerarium (class)
├── __init__(db_path: Path, pricing: PricingConfig)
├── async open() → None             # create table if needed
├── async close() → None
├── async record_stipendium(sender, model, usage, sdk_cost_usd=None, session_id=None) → Stipendium
├── async get_summary(sender?, since?) → CostSummary
├── async get_breakdown(since?) → dict[str, CostSummary]
├── async get_budget_status() → BudgetStatus
├── calculate_cost(model, usage_dict) → float   # pure function
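To make the BudgetStatus fields concrete, here is a small sketch of how the two exceeded flags and the "today" and "this month" window starts might be derived. It assumes a limit of None means "no limit" and that exceeding means strictly greater than the limit; neither rule is spelled out in the plan. period_starts and budget_status are hypothetical helper names.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BudgetStatus:
    daily_spent: float
    daily_limit: float | None
    monthly_spent: float
    monthly_limit: float | None
    daily_exceeded: bool
    monthly_exceeded: bool


def period_starts(now: datetime) -> tuple[datetime, datetime]:
    """Return (start of today, start of this month) in now's timezone."""
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return day_start, day_start.replace(day=1)


def _exceeded(spent: float, limit: float | None) -> bool:
    # Assumption: no configured limit can never be exceeded.
    return limit is not None and spent > limit


def budget_status(
    daily_spent: float,
    monthly_spent: float,
    daily_limit: float | None,
    monthly_limit: float | None,
) -> BudgetStatus:
    return BudgetStatus(
        daily_spent=daily_spent,
        daily_limit=daily_limit,
        monthly_spent=monthly_spent,
        monthly_limit=monthly_limit,
        daily_exceeded=_exceeded(daily_spent, daily_limit),
        monthly_exceeded=_exceeded(monthly_spent, monthly_limit),
    )


now = datetime(2026, 2, 16, 14, 30, tzinfo=timezone.utc)
day_start, month_start = period_starts(now)
status = budget_status(51.20, 187.50, daily_limit=50.0, monthly_limit=None)
```

The two datetimes are the natural values to pass as the since argument of get_summary for the daily and monthly totals.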

SQLite schema (new table in praetorium.db)

```sql
CREATE TABLE IF NOT EXISTS stipendia (
    id TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    sender TEXT NOT NULL,
    model TEXT NOT NULL,
    input_tokens INTEGER NOT NULL DEFAULT 0,
    output_tokens INTEGER NOT NULL DEFAULT 0,
    cache_write_tokens INTEGER NOT NULL DEFAULT 0,
    cache_read_tokens INTEGER NOT NULL DEFAULT 0,
    cost_usd REAL NOT NULL DEFAULT 0.0,
    sdk_cost_usd REAL,
    session_id TEXT
);
CREATE INDEX IF NOT EXISTS idx_stipendia_timestamp ON stipendia(timestamp);
CREATE INDEX IF NOT EXISTS idx_stipendia_sender ON stipendia(sender);
```
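The sketch below shows how a per-centurio breakdown could be queried from this table. It uses the synchronous standard-library sqlite3 module and an in-memory database purely to stay self-contained (the plan's Aerarium methods are async), trims the schema to the columns the query needs, and assumes timestamps are stored as UTC ISO-8601 strings so that string comparison matches time order. The helpers record and breakdown are hypothetical.

```python
from __future__ import annotations

import sqlite3
import uuid
from datetime import datetime, timezone

# Trimmed copy of the stipendia schema: only the columns used below.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stipendia (
    id TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    sender TEXT NOT NULL,
    model TEXT NOT NULL,
    cost_usd REAL NOT NULL DEFAULT 0.0
);
"""


def record(conn: sqlite3.Connection, sender: str, model: str,
           cost_usd: float, when: datetime) -> None:
    """Insert one cost record with a fresh UUID as primary key."""
    conn.execute(
        "INSERT INTO stipendia (id, timestamp, sender, model, cost_usd)"
        " VALUES (?, ?, ?, ?, ?)",
        (str(uuid.uuid4()), when.isoformat(timespec="seconds"), sender, model, cost_usd),
    )


def breakdown(conn: sqlite3.Connection, since: datetime) -> dict[str, tuple[float, int]]:
    """Return {sender: (total cost, request count)}, most expensive first."""
    rows = conn.execute(
        "SELECT sender, SUM(cost_usd), COUNT(*) FROM stipendia"
        " WHERE timestamp >= ? GROUP BY sender ORDER BY 2 DESC",
        (since.isoformat(timespec="seconds"),),
    )
    return {sender: (cost, count) for sender, cost, count in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
utc = timezone.utc
record(conn, "vorenus", "sonnet", 23.40, datetime(2026, 2, 16, 9, 0, tzinfo=utc))
record(conn, "pullo", "sonnet", 18.90, datetime(2026, 2, 16, 10, 0, tzinfo=utc))
record(conn, "vorenus", "sonnet", 5.00, datetime(2026, 2, 15, 22, 0, tzinfo=utc))
today = breakdown(conn, since=datetime(2026, 2, 16, tzinfo=utc))
```

Here today contains only the two records from 2026-02-16, with vorenus listed first. The same query shape would also yield the "Top spenders" line of a budget alert.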

Config addition in legio.toml

```toml
[pricing]
billing_mode = "api"      # "api" (real cost) or "subscription" (API-equivalent)
fallback = "sonnet"
daily_budget_usd = 50.0
monthly_budget_usd = 500.0

[pricing.sonnet]
input_per_mtok = 3.0
output_per_mtok = 15.0
cache_write_per_mtok = 3.75
cache_read_per_mtok = 0.30

[pricing.haiku]
input_per_mtok = 1.0
output_per_mtok = 5.0
cache_write_per_mtok = 1.25
cache_read_per_mtok = 0.10

[pricing.opus]
input_per_mtok = 5.0
output_per_mtok = 25.0
cache_write_per_mtok = 6.25
cache_read_per_mtok = 0.50
```

Integration Points

1. session.py: collect_response() (the choke point)

Every SDK response flows through collect_response(). When it sees a ResultMessage, it currently calls tracker.update(msg). We add:

```python
if isinstance(msg, ResultMessage):
    # Existing behaviour: feed the session-reset tracker.
    if tracker is not None:
        tracker.update(msg)
    # New: persist one stipendium per ResultMessage.
    if aerarium is not None:
        await aerarium.record_stipendium(
            sender=sender,
            model=model,
            usage=msg.usage,
            sdk_cost_usd=msg.total_cost_usd,  # store when present
        )
```

The aerarium, sender, and model params are threaded down from CenturioSessionManager.dispatch(). This is the only production code path change in session.py.

2. praetorium.py — schema extension

Add the stipendia table creation to the existing _SCHEMA string. No new database file — reuse praetorium.db. Consistent with the single-database pattern already established.

3. config.py — pricing config loading

Extend load_config() to parse [pricing] from TOML. Add PricingConfig to LegioConfig. Default prices are hard-coded in the dataclass, so the system works without any TOML pricing section.

4. telegram/commands.py — new commands

| Command | Output |
|---|---|
| /cost | Today + month + all-time totals with token breakdown |
| /cost <name> | Same, filtered to one centurio |
| /budget | Daily/monthly spend vs. thresholds |

Format adapts to billing_mode:

API mode (billing_mode = "api"):

💰 Cost Report
━━━━━━━━━━━━━━━━━━
Today:      $12.34  (820k in / 64k out)
This month: $187.50 (12.5M in / 980k out)
All time:   $412.80 (27.5M in / 2.1M out)

Subscription mode (billing_mode = "subscription"):

💰 Usage Report (API-equivalent)
━━━━━━━━━━━━━━━━━━
Today:      ~$12.34  (820k in / 64k out)
This month: ~$187.50 (12.5M in / 980k out)
All time:   ~$412.80 (27.5M in / 2.1M out)
ℹ️ Actual cost: $200/mo subscription

Plain text with emoji headers, no LLM. Same pattern as /edicta and /history.
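A hedged sketch of a template that reproduces the two layouts above. The names humanize and format_report are hypothetical, the column widths are read off the sample output, and the trailing "Actual cost" line of subscription mode is left out because the plan does not say where the subscription fee would come from.

```python
from __future__ import annotations


def humanize(tokens: int) -> str:
    """Abbreviate a token count: 820_000 -> '820k', 12_500_000 -> '12.5M'."""
    if tokens >= 1_000_000:
        return f"{tokens / 1_000_000:.1f}M"
    if tokens >= 1_000:
        return f"{tokens / 1_000:.0f}k"
    return str(tokens)


def format_report(billing_mode: str, rows: list[tuple[str, float, int, int]]) -> str:
    """Render (label, cost_usd, tokens_in, tokens_out) rows as a plain-text report."""
    subscription = billing_mode == "subscription"
    title = "💰 Usage Report (API-equivalent)" if subscription else "💰 Cost Report"
    prefix = "~$" if subscription else "$"
    width = 8 if subscription else 7  # room for the extra "~" in subscription mode
    lines = [title, "━" * 18]
    for label, cost_usd, tokens_in, tokens_out in rows:
        head = f"{label}:"
        amount = f"{prefix}{cost_usd:.2f}"
        lines.append(
            f"{head:<11} {amount:<{width}} "
            f"({humanize(tokens_in)} in / {humanize(tokens_out)} out)"
        )
    return "\n".join(lines)


rows = [
    ("Today", 12.34, 820_000, 64_000),
    ("This month", 187.50, 12_500_000, 980_000),
]
report = format_report("api", rows)
```

Since the output is a pure function of the summary rows and the billing mode, the handler can be unit-tested without Telegram or the database.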

5. Budget alerts

After recording a stipendium, check whether the daily spend has crossed the threshold. If so, send a one-time Telegram alert. Use a simple in-memory flag (_daily_alert_sent: date | None) that resets when the date changes.

⚠️ Daily budget alert
Spent $51.20 of $50.00 limit today.
Top spenders: vorenus ($23.40), pullo ($18.90)
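The one-alert-per-day rule can be sketched as below. DailyBudgetAlert is a hypothetical helper, not a class named in the plan; it assumes "crossed" means spend strictly above the limit, and it returns the message text instead of sending it so the Telegram call stays outside the example. The "Top spenders" line is omitted.

```python
from __future__ import annotations

from datetime import date


class DailyBudgetAlert:
    """Hypothetical helper: produces the alert at most once per calendar day."""

    def __init__(self, daily_limit_usd: float | None) -> None:
        self.daily_limit_usd = daily_limit_usd
        self._daily_alert_sent: date | None = None

    def check(self, today: date, spent_today_usd: float) -> str | None:
        """Return the alert text the first time spend exceeds the limit today."""
        if self.daily_limit_usd is None or spent_today_usd <= self.daily_limit_usd:
            return None
        if self._daily_alert_sent == today:
            return None  # already alerted today
        # Storing the date doubles as the reset: tomorrow's date will not match.
        self._daily_alert_sent = today
        return (
            "⚠️ Daily budget alert\n"
            f"Spent ${spent_today_usd:.2f} of ${self.daily_limit_usd:.2f} limit today."
        )
```

Because the flag lives in memory, a bot restart after the threshold was crossed would send the alert again on the next recorded stipendium; whether that is acceptable is a design choice the plan leaves implicit.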

File Change Summary

| File | Change | Est. LOC |
|---|---|---|
| legio/aerarium.py | New. Data models, cost calc, SQLite, budget. | ~200 |
| legio/config.py | Add PricingConfig, parse [pricing]. | +40 |
| legio/praetorium.py | Add stipendia table to _SCHEMA. | +5 |
| legio/session.py | Thread aerarium into collect_response() and dispatch(). | ~15 changed |
| legio/telegram/commands.py | Add /cost and /budget handlers. | +60 |
| legio/telegram/bot.py | Register new command handlers. | +5 |
| legio/errors.py | Add AerariumError(LegioError). | +3 |
| legio/__main__.py | Initialize Aerarium at startup, pass to bot. | +10 |
| legio.toml | Add [pricing] section with rates. | +15 lines |
| tests/test_aerarium.py | New. Cost calc, persistence, budget, edge cases. | ~250 |
| tests/test_config.py | Pricing config loading and defaults. | +30 |
| tests/test_session.py | Update mocks to thread aerarium. | ~20 changed |
| tests/test_commands.py | /cost and /budget handlers. | +60 |

What does NOT change

  • SessionTokenTracker — still needed for session-reset decisions. Different purpose (context window management vs. cost accounting).
  • Legatus — no routing changes. Cost commands are CommandHandlers.
  • Centurio prompts — no cost awareness in agents. Pure infrastructure.

Verification

  1. ruff check . && ruff format --check . — clean
  2. pytest --cov --cov-fail-under=100 — 100% including new module
  3. python scripts/check_file_length.py — all files ≤350 LOC
  4. Manual: dispatch centurio → /cost shows non-zero USD
  5. Manual: /cost vorenus shows only that centurio's spend
  6. Manual: set daily_budget_usd = 0.01 → dispatch → budget alert fires
  7. Manual: restart bot → /cost still shows historical data (persistence)

Built with Roman discipline.