Telegram Chat UX — Legio Command Interface

A professional UX design for managing a legion of AI agents through Telegram.


1. Command Structure & Mention Patterns

Current Commands (keep as-is)

| Command | Purpose |
| --- | --- |
| /status | Show centuriones with status icons |
| /list | List centuriones with descriptions |
| /create <name> <desc> | Create a new centurio |
| /help | Show command reference |

Proposed New Commands

| Command | Purpose | Example |
| --- | --- | --- |
| /edict <text> | Publish a standing order | /edict Always write tests before code |
| /edicta | List all standing orders | /edicta |
| /revoke <name> | Revoke an edictum | /revoke no-console-logs |
| /acta | Show recent shared knowledge | /acta |
| /history [n] | Show last N praetorium messages | /history 20 |
| /reset <name> | Reset a centurio's session | /reset vorenus |
| /remove <name> | Remove a centurio (requires confirmation) | /remove vorenus |

Mention Patterns (keep current — already well-designed)

@vorenus do this                 → single centurio
@vorenus @pullo do this          → parallel dispatch
Just do this                     → Legatus routes (or handles itself)
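
The three patterns above reduce to counting the @mentions in a message. A minimal sketch of that classification, assuming mentions are lowercase names and that routing is decided before any SDK call; `parse_mentions` and `route` are hypothetical helpers, not existing functions in bot.py:

```python
import re

# Assumption: a centurio name is lowercase and starts with a letter.
MENTION_RE = re.compile(r"@([a-z][a-z0-9_-]*)")


def parse_mentions(text: str) -> tuple[list[str], str]:
    """Return (mentioned names in first-seen order, text with mentions removed)."""
    names = list(dict.fromkeys(MENTION_RE.findall(text)))  # de-duplicate, keep order
    task = " ".join(MENTION_RE.sub("", text).split())      # collapse leftover spaces
    return names, task


def route(text: str) -> tuple[str, list[str], str]:
    """Classify a message as 'legatus', 'single', or 'parallel' dispatch."""
    names, task = parse_mentions(text)
    if not names:
        return ("legatus", [], task)
    if len(names) == 1:
        return ("single", names, task)
    return ("parallel", names, task)


print(route("@vorenus @pullo do this"))  # ('parallel', ['vorenus', 'pullo'], 'do this')
```

Note that this pattern would also match the tail of an email address, so a real parser may want to check the match against the list of known centuriones.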

Trade-off: Flat vs. Hierarchical Commands

Recommendation: stay flat. Telegram's command autocomplete only shows top-level commands. Subcommands (e.g., /centurio create, /centurio remove) won't appear in suggestions and feel unnatural in chat. Flat commands (/create, /remove) are faster to type and align with the conversational paradigm.


2. Agent Identity & Status Signaling

The Constraint

Telegram bots cannot change their display name or avatar per message. A single bot token = one identity. To simulate multiple centuriones speaking through one bot, identity must be conveyed in-message.

Current Pattern (⚔️ attribution header)

⚔️ vorenus

Your response here...

Proposed Enhancement: Richer Headers with Role Context

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Your response here...
  • Name + one-line role reminds Caesar who this centurio is
  • Thin separator (Unicode ━) visually groups header from content
  • Legatus responses have no header (the default voice — like a CO speaking directly)
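
The header rules above can be sketched as a single rendering function. This is an illustration under assumptions: `format_response` and the separator width of 28 are hypothetical choices, not the existing rendering code.

```python
from typing import Optional

SEPARATOR = "━" * 28  # assumed width; tune for typical phone screens


def format_response(body: str, name: Optional[str] = None,
                    role: Optional[str] = None) -> str:
    """Prefix a centurio response with its attribution header.

    Legatus responses pass no name and are returned unchanged,
    since Legatus is the default voice.
    """
    if name is None:
        return body
    title = f"⚔️ {name}"
    if role:
        title += f" — {role}"  # header format taken from the example above
    return f"{title}\n{SEPARATOR}\n\n{body}"


print(format_response("Your response here...", "vorenus", "Code Specialist"))
```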

Status Icons (unified vocabulary)

| Icon | Status | Meaning |
| --- | --- | --- |
| ✅ | idle | Ready for orders |
| ⏳ | working | Processing a task |
| ❌ | error | Last task failed |
| 💤 | disconnected | Session reaped (idle timeout) |
| 🆕 | new | Just created, never used |

Live Status Updates (already implemented)

The _make_status_callback pattern is good. Proposed refinements to format_status():

⏳ Legatus is thinking...
⏳ vorenus is searching the web...
⏳ vorenus is writing to commentarii...
⏳ Dispatching to pullo...

Use natural English ("is using...") instead of bare tool names — Caesar doesn't care about internal tool identifiers.
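
One way to get natural phrasing is a lookup table from tool identifier to verb phrase, with a generic fallback. A minimal sketch: the signature shown here is assumed (the real format_status() is not reproduced in this document), and every tool name except `search_web` is hypothetical.

```python
from typing import Optional

# Tool identifier -> human phrase. Only search_web appears in this document;
# the other keys are placeholder names for illustration.
TOOL_PHRASES = {
    "search_web": "searching the web",
    "write_commentarii": "writing to commentarii",
    "read_edicta": "reading edicta",
}


def format_status(agent: str, tool: Optional[str] = None) -> str:
    """Render a live status line without exposing raw tool identifiers."""
    if tool is None:
        return f"⏳ {agent} is thinking..."
    phrase = TOOL_PHRASES.get(tool, "working")  # unknown tools fall back to a generic verb
    return f"⏳ {agent} is {phrase}..."


print(format_status("Legatus"))                # ⏳ Legatus is thinking...
print(format_status("vorenus", "search_web"))  # ⏳ vorenus is searching the web...
```

The fallback means a newly added tool never leaks its identifier to Caesar; it simply shows as "working" until a phrase is added.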


3. Inter-Agent Communication Protocols

Current Architecture

Caesar ──→ Legatus ──→ Centurio(nes)

              Praetorium (message bus)

All communication is mediated by Legatus — centuriones cannot talk to each other directly. This is correct and should stay.

Communication Paths

| Path | Mechanism | Caesar Sees |
| --- | --- | --- |
| Caesar → Legatus | Free-form text | |
| Caesar → Centurio | @mention in text | |
| Legatus → Centurio | dispatch_to_centurio tool | Live status update |
| Centurio → Caesar | Response via Legatus | Attributed message |
| Centurio → Centurio | Via acta (shared knowledge) | Nothing (async, pull-based) |
| Legatus → Caesar | Direct response | Unheadered message |

Proposed: Visible Routing Narration

When Legatus auto-routes (no @mention), show a brief narration before dispatching:

Caesar: Research the latest Claude SDK changes

Legatus: Routing to @vorenus (Code Specialist) — this is a technical research task.

⏳ vorenus is working...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here are the recent Claude SDK changes...

This helps Caesar understand why Legatus chose a particular centurio. It can be implemented purely via the Legatus prompt, with no code change needed.

Proposed: Cross-Centurio Handoff Pattern

Add to Legatus prompt: when one centurio's response suggests work for another, Legatus can chain dispatches. Caesar sees each step:

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found 3 bugs in the auth module. Recommending pullo review the test coverage.

⏳ Legatus is dispatching to pullo...

⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Reviewed auth module tests. Added 5 missing test cases. Coverage now 100%.

This is achievable entirely through the Legatus prompt: instruct it to read centurio responses and decide whether to chain another dispatch.


4. Authorization Workflows (OTP via Prompt)

Design: Prompt-Based Confirmation Protocol

Following the "prompt over code" philosophy, dangerous actions are gated by a confirmation protocol defined entirely in the Legatus system prompt.

Dangerous Action Categories

| Action | Risk Level | Confirmation Required |
| --- | --- | --- |
| remove_centurio | 🔴 High | Full confirmation |
| revoke_edictum | 🟡 Medium | Full confirmation |
| create_centurio | 🟢 Low | None (reversible) |
| publish_edictum | 🟢 Low | None (revocable) |
| dispatch_to_centurio | 🟢 Low | None (read-only task) |

Confirmation Protocol (added to Legatus prompt)

## Dangerous Actions — Require Caesar's Confirmation

The following actions are DESTRUCTIVE and IRREVERSIBLE:
- Removing a centurio (`remove_centurio`) — deletes all centurio data
- Revoking an edictum (`revoke_edictum`) — removes a standing order

Before executing any destructive action:
1. Explain clearly WHAT will be lost and WHY it cannot be undone.
2. Show the exact action: "I will remove centurio: vorenus"
3. Ask Caesar to confirm by replying: "Confirmed"
4. ONLY call the tool AFTER receiving "Confirmed" as a direct reply.
5. If Caesar says anything else, ABORT and acknowledge the cancellation.

Never combine a destructive action with other actions in the same turn.

Example Interaction

Caesar: Remove vorenus, he's not useful anymore

Legatus: I will remove centurio vorenus. This will:
  • Delete his prompt, tools config, and commentarii
  • Disconnect his active session
  • This cannot be undone.

  Reply "Confirmed" to proceed.

Caesar: Confirmed

Legatus: ✅ Removed centurio: vorenus

Trade-off: Prompt vs. Code Enforcement

| Approach | Pros | Cons |
| --- | --- | --- |
| Prompt-only (recommended) | Zero code changes, flexible, easy to adjust rules | LLM could skip in edge cases; no hard guarantee |
| Code-level OTP | Hard guarantee, tamper-proof | More code, more tests, breaks "prompt over code" philosophy |
| Hybrid | Best of both | Over-engineered for single-user system |

Recommendation: Prompt-only for now. Caesar is the sole user — prompt adherence is sufficient. If the legion grows to serve multiple users, revisit with code-level enforcement.
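
For comparison, here is roughly what the code-level alternative would involve if it is revisited later. This is a sketch only, not a proposal to build it now: `ConfirmationGate` is a hypothetical class, and the set of destructive actions is copied from the table above.

```python
from dataclasses import dataclass
from typing import Optional

DESTRUCTIVE = {"remove_centurio", "revoke_edictum"}


@dataclass
class ConfirmationGate:
    """Holds at most one destructive action until Caesar confirms it."""
    pending: Optional[tuple[str, str]] = None  # (action, target)

    def request(self, action: str, target: str) -> bool:
        """Return True if the action may run now; otherwise park it as pending."""
        if action not in DESTRUCTIVE:
            return True
        self.pending = (action, target)
        return False

    def reply(self, text: str) -> Optional[tuple[str, str]]:
        """Consume Caesar's next message.

        Returns the parked action only on an exact "Confirmed" reply.
        Any other reply aborts; the pending slot is cleared either way.
        """
        held, self.pending = self.pending, None
        if held is not None and text.strip() == "Confirmed":
            return held
        return None


gate = ConfirmationGate()
gate.request("remove_centurio", "vorenus")  # parked, returns False
print(gate.reply("Confirmed"))              # ('remove_centurio', 'vorenus')
```

Even this small gate needs state per chat and its own tests, which is the cost the table weighs against the hard guarantee.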


5. Message Threading & Context Preservation

Telegram Constraint

Private chats are flat — no native threading. But reply_to_message_id creates visual context chains.

Proposed: Reply-to-Original Threading

When a centurio responds, reply to Caesar's original message:

Caesar: @vorenus check the auth module       ← message_id: 42
    └── ⚔️ vorenus — Code Specialist         ← reply_to: 42
        Found 3 issues in auth...

This creates a visual thread in Telegram. Caesar can tap the reply header to jump back to the original request.
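
In code, threading amounts to carrying Caesar's message ID through the dispatch and attaching it to the outgoing message. A minimal sketch that builds the request payload; `build_send_payload` is a hypothetical helper, and the field name follows this document. Newer Bot API versions express the same thing through a `reply_parameters` object, so check which form your client library expects.

```python
from typing import Any, Optional


def build_send_payload(chat_id: int, text: str,
                       reply_to: Optional[int] = None) -> dict[str, Any]:
    """Build sendMessage parameters, threading the reply when an ID is known."""
    payload: dict[str, Any] = {"chat_id": chat_id, "text": text}
    if reply_to is not None:
        # Field name as used in this document; see the note above.
        payload["reply_to_message_id"] = reply_to
    return payload


# The centurio's answer replies to Caesar's original message (id 42).
print(build_send_payload(1001, "Found 3 issues in auth...", reply_to=42))
```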

Proposed: Context Summary for Long Conversations

When a session resets (token limit), the centurio loses context. Add to centurio prompt:

When you notice a conversation context reset, briefly acknowledge it:
"(Session refreshed — I've lost our earlier conversation.
 Check my commentarii for any notes I saved.)"

Proposed: Pin Important Results

For significant outputs (research reports, audit results), Legatus can pin the message:

Caesar: @vorenus comprehensive audit of the codebase

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📌 Audit Report: Legio Codebase
[... detailed findings ...]

(Legatus pins this message)

This requires the pin_chat_message API call — a small code addition in bot.py.


6. Scalability Considerations

Current Limits

| Constraint | Current | Ceiling |
| --- | --- | --- |
| Max centuriones | 10 (config) | Practical: ~20-30 before UX degrades |
| Parallel dispatch | Unlimited (asyncio.gather) | Telegram rate limits: ~30 msg/sec across all chats, roughly 1 msg/sec within a single chat |
| Session timeout | 30 min idle | Adjustable via config |
| Token budget | 200k per centurio | SDK limit; auto-reset at 150k |
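
If unlimited parallel dispatch becomes a problem at higher centurio counts, a semaphore can cap concurrency without changing the gather-based shape. A minimal sketch: `dispatch_all` and the `run` callback are hypothetical stand-ins for the real dispatch code, and the default cap of 5 is arbitrary.

```python
import asyncio
from typing import Awaitable, Callable


async def dispatch_all(names: list[str], task: str,
                       run: Callable[[str, str], Awaitable[str]],
                       max_concurrent: int = 5) -> list:
    """Run `run(name, task)` for every centurio, at most `max_concurrent` at once.

    Results come back in the same order as `names`. A failed dispatch is
    returned as an exception object instead of aborting the whole batch.
    """
    sem = asyncio.Semaphore(max_concurrent)

    async def one(name: str) -> str:
        async with sem:
            return await run(name, task)

    return await asyncio.gather(*(one(n) for n in names), return_exceptions=True)


async def fake_run(name: str, task: str) -> str:
    await asyncio.sleep(0)  # stand-in for the SDK call
    return f"{name} finished: {task}"


print(asyncio.run(dispatch_all(["vorenus", "pullo"], "audit memoria", fake_run)))
```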

UX Degradation Points

5-10 centuriones: Current UX works fine. /status fits on one screen. @mentions are manageable.

10-20 centuriones: /status gets long. Need grouping:

📋 Centurio Status

Code & Engineering
  ✅ vorenus — Code Specialist
  ⏳ pullo — Test Engineer

Research & Analysis
  ✅ cicero — Research Analyst
  💤 brutus — Data Analyst

Operations
  ✅ octavia — DevOps

This is a prompt-only change — instruct Legatus to categorize centuriones when displaying status.
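
The grouped layout is also simple to express in code, which may help when wording the prompt instruction or if the rendering is ever moved out of the prompt. A minimal sketch, assuming each centurio record carries a hypothetical `category` field and that the status icons follow the vocabulary in section 2:

```python
# Icon vocabulary assumed from the status table in section 2.
STATUS_ICONS = {"idle": "✅", "working": "⏳", "error": "❌",
                "disconnected": "💤", "new": "🆕"}


def render_grouped_status(centuriones: list[dict]) -> str:
    """Render /status output grouped by category, in first-seen category order."""
    groups: dict[str, list[dict]] = {}
    for c in centuriones:
        groups.setdefault(c["category"], []).append(c)

    lines = ["📋 Centurio Status"]
    for category, members in groups.items():
        lines.append("")
        lines.append(category)
        for c in members:
            icon = STATUS_ICONS[c["status"]]
            lines.append(f"  {icon} {c['name']} — {c['role']}")
    return "\n".join(lines)


print(render_grouped_status([
    {"name": "vorenus", "role": "Code Specialist", "status": "idle",
     "category": "Code & Engineering"},
    {"name": "pullo", "role": "Test Engineer", "status": "working",
     "category": "Code & Engineering"},
    {"name": "cicero", "role": "Research Analyst", "status": "idle",
     "category": "Research & Analysis"},
]))
```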

20+ centuriones: Inline keyboards become valuable:

📋 Select a centurio:
[vorenus] [pullo] [cicero]
[brutus]  [octavia] [marcus]

This requires code (inline keyboard in bot.py) but scales better than typed @mentions.
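
The keyboard itself is a nested list of buttons, one inner list per row. A minimal sketch that builds the `reply_markup` structure as plain data; `build_keyboard` and the `select:` callback prefix are hypothetical, while `inline_keyboard`, `text`, and `callback_data` are the Bot API field names.

```python
def build_keyboard(names: list[str], per_row: int = 3) -> dict:
    """Lay out one button per centurio, `per_row` buttons to a row."""
    rows = [names[i:i + per_row] for i in range(0, len(names), per_row)]
    return {
        "inline_keyboard": [
            # callback_data is limited to 64 bytes, so keep the prefix short.
            [{"text": n, "callback_data": f"select:{n}"} for n in row]
            for row in rows
        ]
    }


markup = build_keyboard(["vorenus", "pullo", "cicero", "brutus", "octavia", "marcus"])
print(len(markup["inline_keyboard"]))  # 2
```

The bot would then need a callback-query handler that maps `select:<name>` back to a dispatch, which is where most of the estimated effort goes.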

Response Time Budget

| Operation | Current | Target |
| --- | --- | --- |
| First ⏳ acknowledgment | <100ms | <100ms ✅ |
| First status update | ~3s (debounce) | ~2s |
| Single centurio response | 5-30s | 5-30s (SDK dependent) |
| Parallel dispatch (3 centuriones) | 5-30s (slowest wins) | Same |
| Full round-trip with Legatus routing | 10-60s | 10-60s (two SDK calls) |
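
Moving the status-update target from ~3s to ~2s is a matter of changing one interval if updates pass through a small gate object. The sketch below is throttle-style and purely illustrative: `StatusDebouncer` is a hypothetical class, and the existing debounce in bot.py may work differently.

```python
import time
from typing import Callable, Optional


class StatusDebouncer:
    """Lets a status edit through at most once per `interval` seconds."""

    def __init__(self, interval: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval
        self.clock = clock  # injectable so tests do not need to sleep
        self._last_sent: Optional[float] = None
        self._last_text: Optional[str] = None

    def should_send(self, text: str) -> bool:
        """Return True if this status text should be pushed to Telegram now."""
        now = self.clock()
        if text == self._last_text:
            return False  # identical text: editing the message would be a no-op
        if self._last_sent is not None and now - self._last_sent < self.interval:
            return False  # too soon after the previous edit
        self._last_sent, self._last_text = now, text
        return True
```

Updates that arrive inside the interval are dropped rather than queued, so the final status before the response may never be shown. That is usually acceptable because the response itself replaces it.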

Proposed: Timeout Expectations

Add to Legatus prompt:

If a task will take significant processing time (>30 seconds),
tell Caesar upfront: "This may take a minute — I'm dispatching to vorenus."

7. Concrete Interaction Examples

Example 1: Simple Direct Message (no routing)

Caesar: What centuriones do I have?

Legatus: You currently have 3 centuriones:
  • vorenus — Code Specialist (✅ idle)
  • pullo — Test Engineer (✅ idle)
  • cicero — Research Analyst (💤 disconnected)

Example 2: Single Mention

Caesar: @vorenus refactor the auth module to use async

⏳ vorenus is reading edicta...
⏳ vorenus is analyzing legio/auth.py...
⏳ vorenus is writing code...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Refactored auth module to async. Changes:
• Converted 4 functions to async/await
• Updated 12 call sites
• All tests passing

Wrote details to commentarii for reference.

Example 3: Multi-Mention Parallel

Caesar: @vorenus @pullo audit the memoria module — code quality and test coverage

⏳ [vorenus] Reading edicta...
⏳ [pullo] Reading edicta...
⏳ [vorenus] Analyzing memoria/store.py...
⏳ [pullo] Running pytest --cov...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Code quality: 8/10. Found 2 minor issues...

⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Test coverage: 100%. 94 tests, all passing...

Example 4: Auto-Routing by Legatus

Caesar: I need a deep analysis of our competitor's API documentation

⏳ Legatus is thinking...

Legatus: Routing to @cicero (Research Analyst) — this is a research task.

⏳ cicero is searching the web...
⏳ cicero is writing to commentarii...

⚔️ cicero — Research Analyst
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here's my analysis of the competitor API...

Example 5: Dangerous Action with Confirmation

Caesar: Remove cicero

Legatus: I will remove centurio cicero. This will:
  • Delete prompt.md, tools.json, and all commentarii
  • Disconnect the active session
  • This action cannot be undone.

  Reply "Confirmed" to proceed.

Caesar: Confirmed

Legatus: ✅ Removed centurio: cicero

Example 6: Error Recovery

Caesar: @vorenus deploy to production

⏳ vorenus is working...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Deployment failed: 3 tests are failing in test_auth.py.
I've written the error details to my commentarii.
Recommend fixing the tests before deploying.

8. Implementation Priority

The Type column marks each item as either a prompt change or a code change.

| Priority | Item | Type | Effort |
| --- | --- | --- | --- |
| 1 | Dangerous action confirmation protocol | Prompt | 10 min |
| 2 | Richer attribution headers (name + role + separator) | Code (rendering) | 30 min |
| 3 | Routing narration for auto-dispatch | Prompt | 10 min |
| 4 | Humanized status messages ("is using..." not tool names) | Code (format_status) | 20 min |
| 5 | Reply-to-original threading | Code (bot.py) | 30 min |
| 6 | New commands (/edict, /edicta, /history, etc.) | Code (bot.py) | 2 hr |
| 7 | Centurio grouping in /status | Prompt | 10 min |
| 8 | Pin important results | Code (bot.py) | 30 min |
| 9 | Inline keyboard for centurio selection | Code (bot.py) | 2 hr |

9. Summary of Trade-offs

| Decision | Choice | Rationale |
| --- | --- | --- |
| Flat vs. hierarchical commands | Flat | Telegram autocomplete; conversational UX |
| Per-agent bot vs. single bot | Single bot | Simpler ops; identity via headers |
| Code vs. prompt OTP | Prompt | "Prompt over code"; single-user system |
| Inline keyboards vs. @mentions | @mentions now, keyboards later | @mentions work at current scale |
| Forum topics vs. flat chat | Flat chat | Simpler; reply_to provides enough context |
| Streaming edits vs. batch | Batch (3s debounce) | Telegram rate limits; already implemented |

Built with Roman discipline.