Architecture Overview

Legio follows a layered architecture with clear separation between domain logic and infrastructure. Dependencies flow strictly downward: each layer imports only from the layers below it, with error types and data models at the bottom and the Telegram infrastructure at the top.

System Topology

Three Layers

Code (Python Infrastructure)

The legio/ package — domain models, orchestration, Telegram integration. Behavior that cannot be expressed in prompts.

Blueprints (Templates)

The blueprints/ directory — default prompts and tool configs used when creating new centuriones. Committed to git, shared across deployments. Consumed once at centurio creation time.

Castra (Runtime State)

The castra/ directory — the live workspace. Centurio prompts (customized), edicta, acta, commentarii, and the praetorium database.

Module Map

Layer 0 — Foundation

| Module | Lines | Purpose |
| --- | --- | --- |
| errors.py | ~35 | LegioError hierarchy (7 domain exception classes) |

Layer 1 — Data Models

| Module | Lines | Purpose |
| --- | --- | --- |
| nuntius.py | ~50 | Immutable frozen dataclass with UUID4, sender, audience, timestamp |
| centurio.py | ~80 | Agent identity, name validation (^[a-z][a-z0-9_-]*$), reserved name rejection, filesystem path properties |
| config.py | ~120 | LegioConfig loaded from legio.toml + environment variables (frozen dataclass) |
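
As a rough illustration of the Layer 1 models, the sketch below pairs a frozen message dataclass with the name pattern quoted in the table. The field names and types, the `RESERVED_NAMES` set, and the `validate_name` helper are assumptions made for illustration; they are not taken from nuntius.py or centurio.py.

```python
import re
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Pattern quoted in the module map for centurio names.
NAME_RE = re.compile(r"^[a-z][a-z0-9_-]*$")

# Hypothetical reserved names; the real list is not given in this document.
RESERVED_NAMES = frozenset({"legatus", "caesar"})


def validate_name(name: str) -> str:
    """Return the name unchanged if it is valid, otherwise raise ValueError."""
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid centurio name: {name!r}")
    if name in RESERVED_NAMES:
        raise ValueError(f"reserved centurio name: {name!r}")
    return name


@dataclass(frozen=True)
class Nuntius:
    """Immutable message; field names and types are assumed."""
    sender: str
    text: str
    audience: tuple[str, ...] = ()
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

Because the dataclass is frozen, assigning to a field after construction raises an error, which is what lets a message be passed between layers without defensive copies.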

Layer 2 — Storage

| Module | Lines | Purpose |
| --- | --- | --- |
| praetorium.py | ~170 | SQLite message bus with WAL mode, visibility rules, over-fetch heuristic |
| rendering.py | ~115 | XML context formatting (format_history_xml), template rendering, attribution headers |
| memoria/store.py | ~230 | Filesystem CRUD for edicta, acta, commentarii (XML format with name/symlink validation) |
| auctoritas.py | ~130 | Pending TOTP request store: TTL, attempt tracking, auto-cleanup |
| totp.py | ~60 | TOTP verification via pyotp: timing-safe comparison, drift tolerance |
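
To make the "SQLite message bus with WAL mode" row more concrete, here is a minimal sketch using only the standard `sqlite3` module. The table layout, the `open_bus`, `post`, and `history` helpers, and the visibility rule (a viewer sees messages addressed to `all`, addressed to them, or sent by them) are all assumptions; the actual schema and rules in praetorium.py are not described here, and this sketch filters in SQL rather than reproducing the over-fetch heuristic.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS nuntii (
    seq      INTEGER PRIMARY KEY AUTOINCREMENT,
    sender   TEXT NOT NULL,
    audience TEXT NOT NULL,  -- assumed: 'all' or one recipient name
    text     TEXT NOT NULL
)
"""


def open_bus(path: str) -> sqlite3.Connection:
    """Open a file-backed database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # not available for :memory: databases
    conn.execute(SCHEMA)
    return conn


def post(conn: sqlite3.Connection, sender: str, audience: str, text: str) -> None:
    """Append one message to the bus."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO nuntii (sender, audience, text) VALUES (?, ?, ?)",
            (sender, audience, text),
        )


def history(conn: sqlite3.Connection, viewer: str, limit: int) -> list[tuple[str, str]]:
    """Return up to `limit` of the newest messages visible to `viewer`, oldest first."""
    rows = conn.execute(
        "SELECT sender, text FROM nuntii "
        "WHERE audience = 'all' OR audience = ? OR sender = ? "
        "ORDER BY seq DESC LIMIT ?",
        (viewer, viewer, limit),
    ).fetchall()
    return rows[::-1]
```

WAL mode lets readers continue while a write is in progress, which suits a bus that several sessions read from while new messages arrive.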

Layer 3 — Orchestration

| Module | Lines | Purpose |
| --- | --- | --- |
| session.py | ~390 | SDK session lifecycle: create, dispatch, idle reap, token tracking, response collection |
| memoria/tools.py | ~150 | MCP tool wrappers: full server (Legatus, 17 tools) and scoped server (centurio, 8 tools) |
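
The "idle reap" part of the session lifecycle can be pictured with a small sketch like the one below. `SessionRegistry` and its methods are hypothetical names; the real CenturioSessionManager also handles SDK clients, dispatch, and token tracking, none of which is modeled here.

```python
import time
from typing import Callable


class SessionRegistry:
    """Hypothetical tracker that forgets sessions idle for too long."""

    def __init__(self, idle_timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self._timeout = idle_timeout_s
        self._clock = clock  # injectable so the reap logic can be tested
        self._last_used: dict[str, float] = {}

    def touch(self, name: str) -> None:
        """Record activity for a session, creating the entry if needed."""
        self._last_used[name] = self._clock()

    def active(self) -> list[str]:
        return list(self._last_used)

    def reap_idle(self) -> list[str]:
        """Drop and return every session idle for at least the timeout."""
        now = self._clock()
        stale = [n for n, t in self._last_used.items() if now - t >= self._timeout]
        for name in stale:
            del self._last_used[name]
        return stale
```

In a real manager the reap step would also disconnect the underlying SDK client before dropping the entry.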

Layer 4 — Coordinator

| Module | Lines | Purpose |
| --- | --- | --- |
| legatus.py | ~465 | Orchestrator: @mention parsing, centurio dispatch, LLM default response, MCP tool definitions (6 tools), roster/prompt change detection |

Layer 5 — Infrastructure

| Module | Lines | Purpose |
| --- | --- | --- |
| telegram/bot.py | ~100 | Application builder, handler registration (12 handlers), _caesar_filter |
| telegram/commands.py | ~150 | Slash command handlers (/remove, /edict, /revoke, etc.), TOTP gating |
| telegram/utils.py | ~245 | Reply helpers, typing indicators, live status, reply context extraction |
| telegram/markdown_render.py | ~148 | Mistune v3 renderer for Telegram-compatible HTML (HTMLRenderer(escape=True)) |

Legatus: The Orchestrator

The Legatus class wears two hats:

  1. Orchestrator (code) — message parsing, @mention extraction, centurio dispatch
  2. Default responder (SDK agent) — when no centurio is mentioned, responds using its own Claude SDK session

State

python
class Legatus:
    _config: LegioConfig           # Frozen config
    _praetorium: Praetorium        # Message bus
    _memoria: MemoriaStore         # Filesystem storage
    _memoria_full: MemoriaServer   # Full MCP server (17 tools)
    _client: ClaudeSDKClient | None  # Legatus's own SDK session
    _client_is_fresh: bool         # True until first query
    _roster_hash: str              # SHA-256 of centurio names + descriptions
    _prompt_mtime: float           # Last modified time of legatus/prompt.md
    _session_mgr: CenturioSessionManager
    _centuriones: dict[str, Centurio]

Mention Extraction

The _MENTION_RE regex extracts @names from free-form text:

python
# SECURITY: negative lookbehind prevents matching @name inside emails/URLs
_MENTION_RE = re.compile(r"(?<![.\w/@])@(\w+)")

Only names that match registered centuriones (case-insensitive) are returned. Unrecognized @mentions are ignored silently.
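
A minimal sketch of how the regex and the registry filter might combine is shown below. The regex is the one quoted above; the `extract_mentions` signature (taking the registered names as a second argument), the de-duplication, and the lowercase return values are assumptions.

```python
import re

# Same pattern as above: the lookbehind rejects "@" preceded by ".", a word
# character, "/", or another "@", which covers emails and URL paths.
_MENTION_RE = re.compile(r"(?<![.\w/@])@(\w+)")


def extract_mentions(text: str, registered: set[str]) -> list[str]:
    """Return registered centurio names mentioned in text, in first-seen order."""
    known = {name.lower() for name in registered}
    found: list[str] = []
    for match in _MENTION_RE.finditer(text):
        name = match.group(1).lower()  # case-insensitive comparison
        if name in known and name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    roster = {"vorenus", "brutus"}
    print(extract_mentions("@Vorenus look at this, reply to bob@brutus.io", roster))
```

One property of the pattern worth knowing: `\w+` stops at a hyphen, so a mention of a hyphenated name is captured only up to the hyphen.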

Client Rebuild Triggers

The Legatus SDK client is rebuilt when either condition is true:

| Trigger | Detection | Effect |
| --- | --- | --- |
| Roster change | SHA-256 hash of name:description pairs | New system prompt with updated <centuriones> roster XML |
| Prompt edit | stat().st_mtime of legatus/prompt.md | New system prompt from edited file |

The check runs on every handle_message() call via _should_rebuild_client().
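
The two triggers could be checked roughly as follows. This is a sketch under assumptions: the exact serialization that gets hashed (here, sorted `name:description` lines) and the helper names `roster_hash` and `should_rebuild` are illustrative and may differ from what `_should_rebuild_client()` does.

```python
import hashlib
from pathlib import Path


def roster_hash(roster: dict[str, str]) -> str:
    """SHA-256 over sorted name:description pairs, so ordering does not matter."""
    lines = [f"{name}:{desc}" for name, desc in sorted(roster.items())]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def should_rebuild(
    roster: dict[str, str],
    prompt_path: Path,
    last_hash: str,
    last_mtime: float,
) -> bool:
    """True if the roster changed or the prompt file was modified since last build."""
    if roster_hash(roster) != last_hash:
        return True
    return prompt_path.stat().st_mtime != last_mtime
```

Both checks are cheap (one hash over a short string and one `stat()` call), which is what makes running them on every message reasonable.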

System Prompt Construction

The system prompt is built from the legatus prompt.md file plus an injected centurio roster:

xml
[Contents of castra/legatus/prompt.md]

<centuriones>
  <centurio name="vorenus">Research specialist for technology analysis</centurio>
  <centurio name="brutus">Code review and security auditing</centurio>
</centuriones>

Both names and descriptions are XML-escaped to prevent injection from prompt.md content.
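
The construction and escaping step might look like the sketch below, which uses the standard library's `xml.sax.saxutils`. The `build_system_prompt` name and its parameters are assumptions; only the output shape follows the example above.

```python
from xml.sax.saxutils import escape, quoteattr


def build_system_prompt(prompt_md: str, roster: dict[str, str]) -> str:
    """Append an escaped <centuriones> roster to the prompt text."""
    lines = ["<centuriones>"]
    for name, desc in roster.items():
        # quoteattr escapes the value and adds the surrounding quotes;
        # escape handles &, <, and > in element text.
        lines.append(f"  <centurio name={quoteattr(name)}>{escape(desc)}</centurio>")
    lines.append("</centuriones>")
    return prompt_md.rstrip() + "\n\n" + "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt("You are the Legatus.", {"vorenus": "R&D <fast>"}))
```

With escaping in place, a description containing `</centuriones>` or similar markup ends up as inert text rather than altering the roster structure.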

Per-Query Status Injection

Real-time centurio status is injected into every user message (not the system prompt) so it stays fresh:

xml
<centurio_status>
  <centurio name="vorenus" status="idle"/>
  <centurio name="brutus" status="working"/>
</centurio_status>

Caesar's actual message here
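
A hedged sketch of the injection step: the status block is rendered and prepended to the user's text on every query. The `with_status` helper and its `statuses` mapping are hypothetical names.

```python
from xml.sax.saxutils import quoteattr


def with_status(message: str, statuses: dict[str, str]) -> str:
    """Prefix a user message with the current centurio status block."""
    lines = ["<centurio_status>"]
    for name, status in statuses.items():
        lines.append(f"  <centurio name={quoteattr(name)} status={quoteattr(status)}/>")
    lines.append("</centurio_status>")
    return "\n".join(lines) + "\n\n" + message
```

Keeping status out of the system prompt means a status change never forces a client rebuild; only the next message carries the new values.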

MCP Tools (Legatus)

The Legatus has 6 built-in tools plus all Memoria tools:

| Tool | Parameters | Purpose |
| --- | --- | --- |
| create_centurio | name, specialization | Create from blueprint template |
| remove_centurio | name | Remove directory + disconnect session |
| list_centuriones | (none) | Scan registry and return names/statuses |
| dispatch_to_centurio | name, message | Send a nuntius to a specific centurio |
| post_nuntius | text, audience | Post directly to the praetorium |
| get_history | limit | Read recent message history |

Plus 11 Memoria tools (edicta, acta, commentarii CRUD) — total of 17 MCP tools.

History Bootstrap

On a fresh client (startup, rebuild), the first query prepends praetorium history as XML context:

xml
<praetorium recent="true" viewer="legatus">
  <nuntius id="uuid1" sender="caesar" timestamp="2025-01-15T10:30:00+00:00">
    Research quantum computing advances
  </nuntius>
  <nuntius id="uuid2" sender="vorenus" timestamp="2025-01-15T10:31:00+00:00">
    Based on my research, the latest advances include...
  </nuntius>
</praetorium>

<context_notice>Session restored from praetorium. Ask Caesar for clarification if context is unclear.</context_notice>

<centurio_status>
  <centurio name="vorenus" status="idle"/>
</centurio_status>

Caesar's new message here

After the first query, _client_is_fresh is cleared and subsequent messages only include status XML.
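
The fresh-client behavior can be sketched as a flag that is consumed by the first query. `QueryBuilder` is a hypothetical helper; in the Legatus class this role is played by `_client_is_fresh`, and the history and status XML would come from the rendering layer rather than being passed in as strings.

```python
from dataclasses import dataclass

CONTEXT_NOTICE = (
    "<context_notice>Session restored from praetorium. "
    "Ask Caesar for clarification if context is unclear.</context_notice>"
)


@dataclass
class QueryBuilder:
    """Hypothetical helper: prepends history only on a client's first query."""
    client_is_fresh: bool = True

    def build(self, message: str, status_xml: str, history_xml: str) -> str:
        parts: list[str] = []
        if self.client_is_fresh:
            parts += [history_xml, CONTEXT_NOTICE]
            self.client_is_fresh = False  # later queries carry status only
        parts += [status_xml, message]
        return "\n\n".join(parts)
```

After a rebuild, a new builder (or a reset flag) would cause the next query to carry the history block again.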

Message Flow

Free-Form Message (No @mentions)

@Mention Dispatch

Handle Message Flow

The handle_message() method is the main entry point:

  1. Scan: scan_centuriones() refreshes the registry from the filesystem
  2. Parse: extract_mentions(text) finds @names matching registered centuriones
  3. Post: Caesar's nuntius is posted to the praetorium with the correct audience
  4. Rebuild: _ensure_legatus_client() checks roster hash + prompt mtime
  5. Route: if @mentions → parallel dispatch; else → legatus LLM
  6. Respond: post responses to the praetorium with the correct sender
  7. Attribute: prepend a "⚔️ name — description" header to centurio responses
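
The routing step (5) is the part most easily shown in code. The sketch below uses `asyncio.gather` for the parallel dispatch; `ask_centurio` and `ask_legatus` are stand-ins for real SDK sessions, and `route` is a hypothetical name, so this illustrates the branching only, not the actual handle_message() body.

```python
import asyncio


async def ask_centurio(name: str, text: str) -> str:
    """Stand-in for a centurio SDK session (hypothetical)."""
    await asyncio.sleep(0)
    return f"{name} handled: {text}"


async def ask_legatus(text: str) -> str:
    """Stand-in for the Legatus's own SDK session (hypothetical)."""
    await asyncio.sleep(0)
    return f"legatus handled: {text}"


async def route(text: str, mentions: list[str]) -> list[tuple[str, str]]:
    """Return (sender, response) pairs for one incoming message."""
    if mentions:
        # gather runs the dispatches concurrently and keeps input order.
        replies = await asyncio.gather(*(ask_centurio(n, text) for n in mentions))
        return list(zip(mentions, replies))
    return [("legatus", await ask_legatus(text))]


if __name__ == "__main__":
    print(asyncio.run(route("status report", ["vorenus", "brutus"])))
```

Returning the sender alongside each reply keeps steps 6 and 7 simple: each pair can be posted to the praetorium and given its attribution header without further lookup.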

Dependency Graph

No circular imports. Infrastructure never leaks into domain code. The domain layer has zero knowledge of Telegram.

Configuration

Settings come from two sources:

legio.toml (non-secrets)

toml
[caesar]
telegram_id = 123456789

[legio]
model = "sonnet"
castra_dir = "castra"
max_centuriones = 10
history_window = 50
session_idle_timeout_minutes = 30

Environment Variables (secrets)

bash
TELEGRAM_BOT_TOKEN=...      # Telegram Bot API token
ANTHROPIC_API_KEY=...       # Claude API key
LEGIO_TOTP_SECRET=...       # TOTP secret for destructive actions

Both sources are loaded once via load_config() into a frozen LegioConfig dataclass, which is then passed through constructors. No global state, no mutable singletons.

Configuration Flow
