
Telegram UX Implementation Plan

Refined from Caesar's full plan (20260215-0905) and Antony's UX research (20260215-1600).

Guiding principle: ship the highest-value items first, defer complexity until scale demands it.


Scope: What We Build Now vs. Later

Phase 1 — Now (this plan)

| WI | Item | Type | Size |
|----|------|------|------|
| A | Attribution headers with role context | Code | S |
| B | Humanized status messages | Code | S |
| C | Reply-to-original threading | Code | S |
| D | New commands: /remove, /edict, /edicta, /revoke, /acta, /history, /reset | Code | M |
| E | TOTP authorization gate for destructive actions | Code | L |
| F | Dangerous action confirmation (non-TOTP tier) | Prompt | S |
| G | Routing narration and consult/debate protocols | Prompt | S |

Phase 2 — Deferred (build when needed)

| Item | Trigger to Build |
|------|------------------|
| Mission context + topic support | Caesar reports friction with flat chat at 10+ centuriones |
| Mission persistence + /assign | Same trigger as above |
| Inline keyboards for selection | 15+ centuriones; @mentions become unwieldy |
| Legatus status panel (edited-in-place) | Caesar requests a persistent dashboard |
| Centurio grouping in /status | 10+ centuriones |

Rationale: The mission system (WI-001/002 in Caesar's plan) is the largest piece of work and solves a scalability problem that doesn't exist yet at ≤10 centuriones. The current @mention pattern is clean and sufficient. TOTP and attribution deliver immediate value.


Current Behavior Inventory

Caesar (Telegram) ──→ TelegramBot._handle_message()
                        ├── reply_text("⏳")  →  status_msg
                        ├── _keep_typing()     →  typing indicator
                        ├── _make_status_callback(status_msg)
                        └── Legatus.handle_message(text, user_id, on_status=...)
                              ├── @mentions → dispatch_parallel() → attributed responses
                              └── no mentions → _query_legatus() → legatus response

Responses: status_msg.edit_text(first_chunk), reply_text(remaining_chunks)
Attribution: "⚔️ {name}\n\n{text}" (hardcoded in legatus.py:361)

Files Touched

| File | Current LOC | Role |
|------|-------------|------|
| legio/telegram/bot.py | 157 | Telegram handlers, commands |
| legio/legatus.py | 229 | Orchestrator, tools, attribution |
| legio/session.py | 152 | SDK sessions, format_status |
| legio/rendering.py | 37 | XML/template rendering |
| legio/config.py | 29 | Config loading |
| legio/centurio.py | 37 | Centurio data model |
| legio/errors.py | 6 | Domain exceptions |

WI-A: Attribution Headers with Role Context

Goal

Replace the minimal ⚔️ vorenus header with a richer format that shows specialization.

Current

⚔️ vorenus

Response text here...

Target

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Response text here...
  • Name + first non-heading line from prompt.md (already extracted by get_centurio_description())
  • Unicode line separator (━ repeated, as in the target above)
  • Legatus responses have no header — the default voice

Implementation

  • Move attribution formatting from legatus.py:361 to rendering.py as render_attribution_header(name, description).
  • Call from legatus.handle_message() when assembling attributed responses.
  • HTML-escape both name and description (already done for name via _safe_html in bot.py).
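
The header format above could be sketched roughly as follows. This is a minimal illustration, not the actual rendering.py code: the separator width of 30 is read off the target mock-up, and the default for an empty description is an assumption about the "empty description fallback" listed under Tests.

```python
from html import escape

# Assumption: 30 characters, matching the width shown in the Target mock-up.
SEPARATOR = "━" * 30


def render_attribution_header(name: str, description: str = "") -> str:
    """Build the header placed above an attributed centurio response."""
    safe_name = escape(name)
    safe_description = escape(description.strip())
    if safe_description:
        title = f"⚔️ {safe_name} — {safe_description}"
    else:
        # Assumed fallback: with no description, show the name only.
        title = f"⚔️ {safe_name}"
    return f"{title}\n{SEPARATOR}\n\n"


header = render_attribution_header("vorenus", "Code Specialist")
```

If escaping moves into this function, the existing _safe_html call in bot.py would need to stop escaping the name, otherwise the name would be escaped twice.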

Tests

  • tests/test_rendering.py: header format, HTML escaping, empty description fallback.
  • tests/test_legatus.py: attributed responses include separator line.

Estimate: S


WI-B: Humanized Status Messages

Goal

Replace raw tool identifiers in status updates with natural English.

Current

🔧 Using search_web...
🔧 Dispatching to vorenus...

Target

⏳ Searching the web...
⏳ Dispatching to vorenus...
⏳ Reading edicta...
⏳ Writing to commentarii...
⏳ Thinking...

Implementation

  • Update format_status() in session.py:
    • Map known tool names to human descriptions via a dict: {"search_web": "Searching the web", "list_edicta": "Reading edicta", "write_commentarium": "Writing to commentarii", ...}
    • Unknown tools: "Using {name}..." (fallback)
    • Use ⏳ consistently (not 🔧) to match the status_msg initial emoji
  • Update _make_prefixed_callback prefix format from [name] to ⏳ [name] for parallel dispatch.
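
A minimal sketch of the mapping approach, assuming format_status() receives the raw tool name (or nothing while the model is thinking). The real signature in session.py may differ, and the dict holds only the three tools named above.

```python
from typing import Optional

# Known tool identifiers mapped to human-readable descriptions.
TOOL_LABELS = {
    "search_web": "Searching the web",
    "list_edicta": "Reading edicta",
    "write_commentarium": "Writing to commentarii",
}


def format_status(tool_name: Optional[str] = None) -> str:
    """Return a humanized status line for a tool call."""
    if not tool_name:
        return "⏳ Thinking..."
    # Unknown tools fall back to the raw identifier.
    label = TOOL_LABELS.get(tool_name, f"Using {tool_name}")
    return f"⏳ {label}..."
```

Keeping the labels in one dict means a new tool needs only a new entry, and the fallback keeps unmapped tools visible rather than silent.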

Tests

  • Update TestFormatStatus with new expected strings.
  • Add test for unknown tool fallback.

Estimate: S


WI-C: Reply-to-Original Threading

Goal

Centurio responses reply to Caesar's original message, creating visual context chains in Telegram.

Current

All responses edit the status_msg or send as new messages — no reply-to linking.

Target

Caesar: @vorenus check the auth module       ← message_id: 42
    └── ⚔️ vorenus — Code Specialist         ← reply_to: 42
        Found 3 issues...

Implementation

  • In bot.py:_handle_message(), pass update.message.message_id through to response sending.
  • When sending continuation responses (after the first status_msg edit), use reply_to_message_id=original_message_id.
  • The first response still edits status_msg (this is the ⏳ → result transition).
  • Second+ responses (multi-centurio, long splits) reply to the original.

Tests

  • tests/test_telegram_bot.py: verify reply_to_message_id is set on continuation messages.

Estimate: S


WI-D: New Telegram Commands

Goal

Expose memoria and management operations as Telegram commands.

New Commands

| Command | Action | Authorization |
|---------|--------|---------------|
| /remove <name> | Remove a centurio | TOTP (WI-E) |
| /edict <name> <text> | Publish standing order | Prompt confirmation (WI-F) |
| /edicta | List all standing orders | None |
| /revoke <name> | Revoke an edictum | TOTP (WI-E) |
| /acta | Show recent shared knowledge | None |
| /history [n] | Show last N praetorium nuntii | None |
| /reset <name> | Reset a centurio's session | None |

Implementation

  • Add 7 command handlers in bot.py.
  • Read-only commands (/edicta, /acta, /history) call memoria/praetorium directly and format output.
  • /reset calls session_mgr.disconnect(name) (new method — disconnect and remove session so next dispatch creates a fresh one).
  • /remove and /revoke delegate to the TOTP flow (WI-E).
  • /edict delegates to handle_message with prompt-level confirmation (WI-F).
  • Register all commands via setMyCommands at startup for autocomplete.
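
One possible way to keep the command table, the autocomplete registration, and the authorization tiers in sync is a single declarative list. This is a sketch only: CommandSpec, menu_entries, and commands_with are hypothetical names, and the (name, description) pairs would still need converting to whatever the Telegram library expects for setMyCommands.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CommandSpec:
    name: str           # command without the leading slash
    description: str    # text shown in Telegram autocomplete
    authorization: str  # "none", "prompt", or "totp"


# Mirrors the New Commands table above.
COMMANDS = (
    CommandSpec("remove", "Remove a centurio", "totp"),
    CommandSpec("edict", "Publish standing order", "prompt"),
    CommandSpec("edicta", "List all standing orders", "none"),
    CommandSpec("revoke", "Revoke an edictum", "totp"),
    CommandSpec("acta", "Show recent shared knowledge", "none"),
    CommandSpec("history", "Show last N praetorium nuntii", "none"),
    CommandSpec("reset", "Reset a centurio's session", "none"),
)


def menu_entries() -> List[Tuple[str, str]]:
    """Pairs suitable for building the setMyCommands payload."""
    return [(spec.name, spec.description) for spec in COMMANDS]


def commands_with(authorization: str) -> List[str]:
    """Names of the commands in a given authorization tier."""
    return [spec.name for spec in COMMANDS if spec.authorization == authorization]
```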

File size concern

bot.py is at 157 LOC. Adding 7 handlers will push it toward 300+. Plan to split: extract command handlers into legio/telegram/commands.py if the file crosses 300 LOC during implementation.

Tests

  • One test class per command in tests/test_telegram_bot.py.
  • Caesar-only gate test for each.

Estimate: M


WI-E: TOTP Authorization Gate

Goal

Require a Google Authenticator OTP for destructive actions. This is the one place where code enforcement overrides prompt, because irreversible actions need a hard guarantee.

Design

New vocabulary term: auctoritas (Latin: authorization, authority). Plural: auctoritates.

New data model: An auctoritas represents a pending authorization request bound to a specific action.

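python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Auctoritas:
    """A pending authorization request bound to a specific action."""

    id: str                    # UUID
    action: str                # e.g., "remove_centurio"
    payload: dict[str, str]    # e.g., {"name": "vorenus"}
    chat_id: int
    message_id: int            # OTP request message (for deletion)
    expires_at: datetime       # request is void after this instant (TTL)
    attempts: int = 0          # failed OTP attempts so far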

Flow:

1. Caesar: /remove vorenus
2. Bot creates Auctoritas, sends OTP request message:
   "🔐 Removing centurio vorenus requires authorization.
    Enter your 6-digit OTP:"
3. Caesar: 123456
4. Bot verifies TOTP → if valid, executes action, deletes OTP messages (best-effort)
5. Bot: "✅ Removed centurio: vorenus"
   (or "❌ Invalid OTP. 2 attempts remaining.")

State machine:

pending → approved → executed
       → expired (TTL)
       → rejected (max attempts)
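
The transitions above could be sketched as a small pure function. Everything here is illustrative: State, Request, submit_otp, and mark_executed are hypothetical names, Request is a trimmed stand-in for Auctoritas, and verify is any callable that checks a code. The state field is an assumption, since the data model above does not list one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Callable

MAX_ATTEMPTS = 3  # matches totp_max_attempts in the plan


class State(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    EXECUTED = "executed"
    EXPIRED = "expired"
    REJECTED = "rejected"


@dataclass
class Request:
    expires_at: datetime
    attempts: int = 0
    state: State = State.PENDING  # assumed field, not in the plan's model


def submit_otp(req: Request, code: str, verify: Callable[[str], bool],
               now: datetime) -> State:
    """Advance a pending request based on one submitted OTP code."""
    if req.state is not State.PENDING:
        return req.state  # terminal or already approved: nothing to do
    if now >= req.expires_at:
        req.state = State.EXPIRED  # TTL applies regardless of attempts
    elif verify(code):
        req.state = State.APPROVED
    else:
        req.attempts += 1
        if req.attempts >= MAX_ATTEMPTS:
            req.state = State.REJECTED
    return req.state


def mark_executed(req: Request) -> None:
    """Record that the approved action has been carried out."""
    if req.state is not State.APPROVED:
        raise ValueError(f"cannot execute a request in state {req.state.value}")
    req.state = State.EXECUTED
```

Because the expiry check comes before verification, a correct code submitted after the TTL still fails.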

New Files

| File | Purpose |
|------|---------|
| legio/totp.py | TOTP verifier wrapping pyotp |
| legio/auctoritas.py | Auctoritas data model + in-memory store |

Config Changes

python
# config.py — new fields on LegioConfig
totp_secret: str = ""   # from env: LEGIO_TOTP_SECRET (base32)
totp_required_actions: tuple[str, ...] = ("remove_centurio", "revoke_edictum")
  • LEGIO_TOTP_SECRET from .env (required for TOTP; if absent, TOTP actions are disabled with warning).
  • totp_required_actions from [security] section in legio.toml. Hardcode defaults; only override if Caesar wants custom policy.
  • Hardcode: totp_ttl_seconds=120, totp_max_attempts=3, totp_drift_steps=1. No config knobs for these — they rarely change and adding them now is premature.
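
The loading rules above might look roughly like this. TotpSettings and load_totp_settings are hypothetical names used to keep the sketch self-contained; in the plan these are fields on LegioConfig, and the security mapping stands in for the parsed [security] section of legio.toml.

```python
import logging
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

logger = logging.getLogger(__name__)

DEFAULT_TOTP_ACTIONS = ("remove_centurio", "revoke_edictum")


@dataclass(frozen=True)
class TotpSettings:
    secret: str = ""
    required_actions: Tuple[str, ...] = DEFAULT_TOTP_ACTIONS

    @property
    def enabled(self) -> bool:
        # With no secret, TOTP-gated actions are disabled.
        return bool(self.secret)


def load_totp_settings(
    env: Optional[Mapping[str, str]] = None,
    security: Optional[Mapping[str, object]] = None,
) -> TotpSettings:
    """Read the secret from the environment and the policy from config."""
    env = os.environ if env is None else env
    secret = env.get("LEGIO_TOTP_SECRET", "").strip()
    if not secret:
        # SECURITY: log only that the secret is absent, never its value.
        logger.warning("LEGIO_TOTP_SECRET is not set; TOTP-gated actions are disabled")
    actions = tuple((security or {}).get("totp_required_actions", DEFAULT_TOTP_ACTIONS))
    return TotpSettings(secret=secret, required_actions=actions)
```

Passing env and security as arguments keeps the function testable without touching the real environment, which suits the tests/test_config.py cases listed below.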

New Dependency

  • pyotp — standard TOTP/HOTP library, Google Authenticator compatible.

In-Memory vs. Database

Store auctoritates in-memory (dict keyed by chat_id). Rationale:

  • Single-user system — only one pending auctoritas at a time is realistic.
  • Expiry is 2 minutes — no persistence needed across restarts.
  • Avoids schema migration complexity.
  • If persistence becomes needed later, migrate to praetorium table.
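
A minimal sketch of such a store, assuming one pending request per chat and lazy expiry on lookup. AuctoritasStore and PendingAuctoritas are illustrative names, and the entry type is a trimmed stand-in for the full Auctoritas model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

TOTP_TTL_SECONDS = 120  # hardcoded per the plan


@dataclass
class PendingAuctoritas:
    action: str
    payload: Dict[str, str]
    expires_at: datetime
    attempts: int = 0


class AuctoritasStore:
    """In-memory store holding at most one pending request per chat."""

    def __init__(self) -> None:
        self._pending: Dict[int, PendingAuctoritas] = {}

    def create(self, chat_id: int, action: str, payload: Dict[str, str],
               now: datetime) -> PendingAuctoritas:
        entry = PendingAuctoritas(
            action=action,
            payload=dict(payload),
            expires_at=now + timedelta(seconds=TOTP_TTL_SECONDS),
        )
        # A new request replaces any earlier one for the same chat.
        self._pending[chat_id] = entry
        return entry

    def get(self, chat_id: int, now: datetime) -> Optional[PendingAuctoritas]:
        entry = self._pending.get(chat_id)
        if entry is not None and now >= entry.expires_at:
            # Expired entries are dropped lazily on lookup.
            del self._pending[chat_id]
            return None
        return entry

    def discard(self, chat_id: int) -> None:
        self._pending.pop(chat_id, None)
```

Lazy expiry avoids a background sweeper. With a single user the dict holds at most a handful of entries, so stale ones cost nothing until the next lookup.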

Security Controls

  • # SECURITY: TOTP secret loaded from env only, never logged or persisted to disk
  • # SECURITY: OTP messages deleted after verification (best-effort)
  • # SECURITY: auctoritas expires after TTL regardless of attempts
  • # SECURITY: timing-safe comparison for TOTP codes
  • OTP reply messages are also deleted (Caesar's typed OTP code).

Tests

| File | Tests |
|------|-------|
| tests/test_totp.py (new) | Valid OTP accepted, invalid rejected, drift tolerance, timing-safe comparison |
| tests/test_auctoritas.py (new) | Create, expire, max attempts, approve, execute |
| tests/test_config.py | LEGIO_TOTP_SECRET loaded, missing secret disables TOTP |
| tests/test_telegram_bot.py | Full flow: request → OTP → execution; invalid OTP; expired request |

Estimate: L


WI-F: Prompt-Level Confirmation for Medium-Risk Actions

Goal

Actions that are revocable but impactful require Caesar to reply "Confirmed" — enforced via Legatus prompt, not code.

Actions

| Action | Why Not TOTP |
|--------|--------------|
| publish_edictum | Revocable (can /revoke) |

Implementation

Add to castra/legatus/prompt.md:

markdown
## Action Authorization

### TOTP-Required (Hard Gate)
These actions are blocked until Caesar provides a valid OTP.
The bot will ask for it automatically — you cannot bypass this:
- Removing a centurio
- Revoking an edictum

### Confirmation-Required (Soft Gate)
Before these actions, explain the impact and ask Caesar to reply "Confirmed":
- Publishing an edictum (standing order for all centuriones)

### No Confirmation Needed
- Creating a centurio (reversible)
- Dispatching to a centurio (read-only)
- Reading edicta, acta, history (read-only)

Estimate: S (prompt edit only, no code)


WI-G: Routing Narration and Consult/Debate Protocols

Goal

Add behavioral protocols to Legatus for visible routing, one-shot consult, and structured debate. All via prompt — no orchestration code.

Implementation

Add to castra/legatus/prompt.md:

markdown
## Routing Narration

When you auto-route (no @mentions from Caesar):
1. Briefly explain which centurio you chose and why (one sentence).
2. Then dispatch.
Caesar should never be surprised about who is working on their request.

## Consult Protocol

When Caesar uses `/consult @name <question>`:
1. Dispatch the question to the named centurio.
2. Return their answer with attribution.
3. Add a one-sentence Legatus summary if the answer needs interpretation.

## Debate Protocol

When Caesar uses `/debate @a @b <question>`:
1. Dispatch the question to both centuriones.
2. If their answers conflict, dispatch one round of rebuttal (each sees the other's answer).
3. Produce a final synthesis: options, recommendation, and risks.
4. Maximum 3 rounds total. Stop even if no consensus.
5. Caesar sees the synthesis. Raw transcripts are saved to your commentarii.

Why Prompt, Not Code

The debate "rounds" are a behavioral pattern, not a state machine. The Legatus SDK session maintains conversation context, so it can count rounds itself. Two code-level limits that already exist, the SDK token limit and the session idle timeout, act as a hard stop against runaway debates.

Estimate: S (prompt edits only)


New Domain Vocabulary

| Concept | Singular | Plural | Python Class | Python Var |
|---------|----------|--------|--------------|------------|
| Authorization request | Auctoritas | Auctoritates | Auctoritas | auctoritas / auctoritates |

Add to 00-domain-vocabulary.md and dev-docs/memos/20260215-1700-domain-vocabulary.md during WI-E.


Implementation Order

WI-A (S) → Attribution headers         Zero dependencies, immediate visual improvement
WI-B (S) → Humanized status            Zero dependencies, pairs with WI-A
WI-C (S) → Reply-to threading          Zero dependencies, context improvement
WI-F (S) → Prompt confirmation          Zero dependencies, prompt-only
WI-G (S) → Routing/consult/debate       Zero dependencies, prompt-only
WI-D (M) → New commands                 Needs WI-E for /remove and /revoke
WI-E (L) → TOTP gate                    Needs pyotp, new files, new tests

Parallelization:

  • WI-A + WI-B + WI-C can be done in a single pass (all touch bot.py/rendering.py/session.py).
  • WI-F + WI-G can be done together (both are prompt edits to castra/legatus/prompt.md).
  • Then WI-D + WI-E together (the commands and the TOTP gate are tightly coupled for /remove and /revoke).

Estimated Total

| Size | Count | Time |
|------|-------|------|
| S | 5 | ~2 hours |
| M | 1 | ~2 hours |
| L | 1 | ~4 hours |
| Total | 7 | ~8 hours |

Decision Log

| # | Decision | Rationale |
|---|----------|-----------|
| D1 | Keep single bot | Operational simplicity; identity via in-message headers |
| D2 | Flat commands (no subcommands) | Telegram autocomplete; conversational UX |
| D3 | TOTP in code for destructive; prompt for revocable | Hard guarantee where it matters; flexibility where it doesn't |
| D4 | Defer mission/topic system | Solves a scale problem that doesn't exist yet (≤10 centuriones) |
| D5 | Defer inline keyboards | @mentions work at current scale; keyboards add code complexity |
| D6 | Debate protocol in prompt, not code | "Prompt over code"; SDK session tracks rounds naturally |
| D7 | In-memory auctoritas store | Single user, 2-min TTL; no persistence needed |
| D8 | Hardcode TOTP tuning params | ttl=120, max_attempts=3, drift=1: rarely change; avoid config bloat |

Testing Procedures

  • After each WI: ruff check . && ruff format --check . && python scripts/check_file_length.py && pytest
  • Full gate: pytest --cov --cov-fail-under=100
  • Security tests: pytest -m security (TOTP verification, Caesar-only gates, input validation)

Manual Test Checklist

  • [ ] /help lists all new commands
  • [ ] Send @vorenus do this → response has rich attribution header with role
  • [ ] Send message → status updates show humanized text (not raw tool names)
  • [ ] Multi-centurio response → each reply threads to original message
  • [ ] /remove vorenus → OTP prompt appears → valid OTP executes → messages deleted
  • [ ] /remove vorenus → wrong OTP three times → request rejected
  • [ ] /edict test-rule Always test first → Legatus asks "Confirmed" → publishes
  • [ ] /edicta → lists standing orders
  • [ ] /history 5 → shows last 5 nuntii

Built with Roman discipline.