Telegram Chat UX — Legio Command Interface
A professional UX design for managing a legion of AI agents through Telegram.
1. Command Structure & Mention Patterns
Current Commands (keep as-is)
| Command | Purpose |
|---|---|
| /status | Show centuriones with status icons |
| /list | List centuriones with descriptions |
| /create <name> <desc> | Create a new centurio |
| /help | Show command reference |
Proposed New Commands
| Command | Purpose | Example |
|---|---|---|
| /edict <text> | Publish a standing order | /edict Always write tests before code |
| /edicta | List all standing orders | /edicta |
| /revoke <name> | Revoke an edictum | /revoke no-console-logs |
| /acta | Show recent shared knowledge | /acta |
| /history [n] | Show last N praetorium messages | /history 20 |
| /reset <name> | Reset a centurio's session | /reset vorenus |
| /remove <name> | Remove a centurio (requires confirmation) | /remove vorenus |
Mention Patterns (keep current — already well-designed)
@vorenus do this → single centurio
@vorenus @pullo do this → parallel dispatch
Just do this → Legatus routes (or handles itself)
Trade-off: Flat vs. Hierarchical Commands
Recommendation: stay flat. Telegram's command autocomplete only shows top-level commands. Subcommands (e.g., /centurio create, /centurio remove) won't appear in suggestions and feel unnatural in chat. Flat commands (/create, /remove) are faster to type and align with the conversational paradigm.
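The flat command scheme and the mention patterns above can be sketched as a single classification step. This is a minimal illustration, not the actual bot.py logic: `ParsedMessage`, `parse_message`, and the `known` set of centurio names are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass, field

MENTION_RE = re.compile(r"@(\w+)")


@dataclass
class ParsedMessage:
    kind: str                      # "command", "mention", or "free"
    command: str = ""              # e.g. "create" for "/create ..."
    args: str = ""                 # raw argument string after the command
    targets: list = field(default_factory=list)  # mentioned centuriones
    text: str = ""                 # task text with known mentions removed


def parse_message(raw: str, known: set) -> ParsedMessage:
    """Classify one chat message (hypothetical helper, assumed signature)."""
    raw = raw.strip()
    if raw.startswith("/"):
        # Flat command: everything after the first space is the argument string.
        head, _, rest = raw.partition(" ")
        return ParsedMessage(kind="command", command=head[1:].lower(), args=rest.strip())

    # Collect mentions of known centuriones, preserving order, without duplicates.
    targets = []
    for name in MENTION_RE.findall(raw):
        name = name.lower()
        if name in known and name not in targets:
            targets.append(name)

    if targets:
        # Strip only the mentions that matched a known centurio.
        def drop_known(match):
            return "" if match.group(1).lower() in known else match.group(0)

        text = " ".join(MENTION_RE.sub(drop_known, raw).split())
        return ParsedMessage(kind="mention", targets=targets, text=text)

    # No command and no known mention: Legatus routes or handles it itself.
    return ParsedMessage(kind="free", text=raw)
```

One target means a single dispatch, several targets mean a parallel dispatch, and `kind == "free"` falls through to Legatus.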
2. Agent Identity & Status Signaling
The Constraint
Telegram bots cannot change their display name or avatar per message. A single bot token = one identity. To simulate multiple centuriones speaking through one bot, identity must be conveyed in-message.
Current Pattern (⚔️ attribution header)
⚔️ vorenus
Your response here...
Proposed Enhancement: Richer Headers with Role Context
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Your response here...
- Name + one-line role reminds Caesar who this centurio is
- Thin separator (Unicode ━) visually groups header from content
- Legatus responses have no header (the default voice, like a CO speaking directly)
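A minimal sketch of how the richer header could be rendered. The function names and the separator width of 28 characters are assumptions made for illustration; the real rendering code may differ.

```python
SEPARATOR = "━" * 28  # assumed width; tune to what reads well on mobile


def format_attribution(name: str, role: str, body: str) -> str:
    """Render a centurio response with a name + role header (hypothetical helper)."""
    header = f"⚔️ {name} — {role}" if role else f"⚔️ {name}"
    return f"{header}\n{SEPARATOR}\n{body}"


def format_legatus(body: str) -> str:
    """Legatus is the default voice, so its responses carry no header."""
    return body
```

Falling back to the bare name when no role is set keeps the current header style working for centuriones without a description.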
Status Icons (unified vocabulary)
| Icon | Status | Meaning |
|---|---|---|
| ✅ | idle | Ready for orders |
| ⏳ | working | Processing a task |
| ❌ | error | Last task failed |
| 💤 | disconnected | Session reaped (idle timeout) |
| 🆕 | new | Just created, never used |
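The icon vocabulary in the table can be kept in one place so that /status and live updates never drift apart. A small sketch, assuming a hypothetical `Status` enum and `status_line` helper:

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    WORKING = "working"
    ERROR = "error"
    DISCONNECTED = "disconnected"
    NEW = "new"


# Single source of truth for the icons listed in the table above.
STATUS_ICONS = {
    Status.IDLE: "✅",
    Status.WORKING: "⏳",
    Status.ERROR: "❌",
    Status.DISCONNECTED: "💤",
    Status.NEW: "🆕",
}


def status_line(name: str, role: str, status: Status) -> str:
    """One row of the /status listing (hypothetical helper)."""
    return f"{STATUS_ICONS[status]} {name} — {role}"
```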
Live Status Updates (already implemented)
The _make_status_callback pattern is good. Proposed refinements to format_status():
⏳ Legatus is thinking...
⏳ vorenus is using search_web...
⏳ vorenus is writing to commentarii...
⏳ Dispatching to pullo...
Use natural English ("is using...") instead of bare tool names; Caesar doesn't care about internal tool identifiers.
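One way the humanized wording could work is a lookup table with a readable fallback. This is a sketch only: the signature shown here is assumed and may not match the existing format_status(), and every tool name in `TOOL_PHRASES` other than search_web is a hypothetical identifier.

```python
from typing import Optional

# Assumed mapping from internal tool identifiers to natural phrases.
TOOL_PHRASES = {
    "search_web": "searching the web",
    "write_commentarii": "writing to commentarii",  # hypothetical tool name
    "read_edicta": "reading edicta",                # hypothetical tool name
}


def format_status(agent: str, tool: Optional[str] = None) -> str:
    """Build a one-line live status message for Caesar."""
    if tool is None:
        return f"⏳ {agent} is thinking..."
    # Unknown tools fall back to "is using <tool>" rather than a bare identifier.
    phrase = TOOL_PHRASES.get(tool, f"using {tool}")
    return f"⏳ {agent} is {phrase}..."
```

The fallback means a newly added tool still produces a full sentence before anyone writes a phrase for it.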
3. Inter-Agent Communication Protocols
Current Architecture
Caesar ──→ Legatus ──→ Centurio(nes)
              ↕
      Praetorium (message bus)
All communication is mediated by Legatus; centuriones cannot talk to each other directly. This is correct and should stay.
Communication Paths
| Path | Mechanism | Caesar Sees |
|---|---|---|
| Caesar → Legatus | Free-form text | — |
| Caesar → Centurio | @mention in text | — |
| Legatus → Centurio | dispatch_to_centurio tool | Live status update |
| Centurio → Caesar | Response via Legatus | Attributed message |
| Centurio → Centurio | Via acta (shared knowledge) | Nothing (async, pull-based) |
| Legatus → Caesar | Direct response | Unheadered message |
Proposed: Visible Routing Narration
When Legatus auto-routes (no @mention), show a brief narration before dispatching:
Caesar: Research the latest Claude SDK changes
Legatus: Routing to @vorenus (Code Specialist) — this is a technical research task.
⏳ vorenus is working...
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here are the recent Claude SDK changes...
This helps Caesar understand why Legatus chose a particular centurio. It is implemented purely via the Legatus prompt; no code change is needed.
Proposed: Cross-Centurio Handoff Pattern
Add to Legatus prompt: when one centurio's response suggests work for another, Legatus can chain dispatches. Caesar sees each step:
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found 3 bugs in the auth module. Recommending pullo review the test coverage.
⏳ Legatus is dispatching to pullo...
⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Reviewed auth module tests. Added 5 missing test cases. Coverage now 100%.
This is achievable entirely through the Legatus prompt: instruct it to read centurio responses and decide whether to chain another dispatch.
4. Authorization Workflows (OTP via Prompt)
Design: Prompt-Based Confirmation Protocol
Following the "prompt over code" philosophy, dangerous actions are gated by a confirmation protocol defined entirely in the Legatus system prompt.
Dangerous Action Categories
| Action | Risk Level | Confirmation Required |
|---|---|---|
| remove_centurio | 🔴 High | Full confirmation |
| revoke_edictum | 🟡 Medium | Full confirmation |
| create_centurio | 🟢 Low | None (reversible) |
| publish_edictum | 🟢 Low | None (revocable) |
| dispatch_to_centurio | 🟢 Low | None (read-only task) |
Confirmation Protocol (added to Legatus prompt)
## Dangerous Actions — Require Caesar's Confirmation
The following actions are DESTRUCTIVE and IRREVERSIBLE:
- Removing a centurio (`remove_centurio`) — deletes all centurio data
- Revoking an edictum (`revoke_edictum`) — removes a standing order
Before executing any destructive action:
1. Explain clearly WHAT will be lost and WHY it cannot be undone.
2. Show the exact action: "I will remove centurio: vorenus"
3. Ask Caesar to confirm by replying: "Confirmed"
4. ONLY call the tool AFTER receiving "Confirmed" as a direct reply.
5. If Caesar says anything else, ABORT and acknowledge the cancellation.
Never combine a destructive action with other actions in the same turn.
Example Interaction
Caesar: Remove vorenus, he's not useful anymore
Legatus: I will remove centurio vorenus. This will:
• Delete his prompt, tools config, and commentarii
• Disconnect his active session
• This cannot be undone.
Reply "Confirmed" to proceed.
Caesar: Confirmed
Legatus: ✅ Removed centurio: vorenus
Trade-off: Prompt vs. Code Enforcement
| Approach | Pros | Cons |
|---|---|---|
| Prompt-only (recommended) | Zero code changes, flexible, easy to adjust rules | LLM could skip in edge cases; no hard guarantee |
| Code-level OTP | Hard guarantee, tamper-proof | More code, more tests, breaks "prompt over code" philosophy |
| Hybrid | Best of both | Over-engineered for single-user system |
Recommendation: Prompt-only for now. Caesar is the sole user — prompt adherence is sufficient. If the legion grows to serve multiple users, revisit with code-level enforcement.
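For comparison, here is a rough sketch of what the "Code-level OTP" row might involve if the decision is revisited. It is not part of the recommended design. `ConfirmationGate`, `PendingAction`, and their methods are hypothetical names; only the two tool names come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Tools that the table above marks as requiring full confirmation.
DESTRUCTIVE_TOOLS = {"remove_centurio", "revoke_edictum"}


@dataclass
class PendingAction:
    tool: str
    target: str


class ConfirmationGate:
    """Holds at most one staged destructive action per chat (assumed design)."""

    def __init__(self) -> None:
        self.pending: Optional[PendingAction] = None

    def request(self, tool: str, target: str) -> bool:
        """Return True if the tool may run now; otherwise stage it and return False."""
        if tool not in DESTRUCTIVE_TOOLS:
            return True
        self.pending = PendingAction(tool, target)
        return False

    def resolve(self, reply: str) -> Optional[PendingAction]:
        """Consume the staged action; return it only for an exact 'Confirmed' reply."""
        action, self.pending = self.pending, None
        if action is not None and reply.strip() == "Confirmed":
            return action
        return None  # any other reply aborts, mirroring step 5 of the protocol
```

Clearing the staged action on every reply means a stray "Confirmed" later in the conversation cannot trigger an old request.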
5. Message Threading & Context Preservation
Telegram Constraint
Private chats are flat — no native threading. But reply_to_message_id creates visual context chains.
Proposed: Reply-to-Original Threading
When a centurio responds, reply to Caesar's original message:
Caesar: @vorenus check the auth module ← message_id: 42
└── ⚔️ vorenus — Code Specialist ← reply_to: 42
      Found 3 issues in auth...
This creates a visual thread in Telegram. Caesar can tap the reply header to jump back to the original request.
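The threading idea can be sketched as a payload builder that points every centurio response back at Caesar's original message. `build_threaded_replies` is a hypothetical helper, and the payloads are plain dicts rather than calls into a specific Telegram library. The field name follows the reply_to_message_id parameter mentioned above; Bot API 7.0 and later express the same thing through reply_parameters.

```python
def build_threaded_replies(chat_id: int, original_message_id: int, responses: dict) -> list:
    """Build one sendMessage-style payload per centurio response.

    `responses` maps centurio name -> already formatted message text.
    """
    payloads = []
    for name, text in responses.items():
        payloads.append(
            {
                "chat_id": chat_id,
                "text": text,
                # Every response replies to the same original request,
                # so parallel dispatches stay visually grouped.
                "reply_to_message_id": original_message_id,
            }
        )
    return payloads
```

With a parallel dispatch, both responses end up under the same request instead of interleaving with later messages.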
Proposed: Context Summary for Long Conversations
When a session resets (token limit), the centurio loses context. Add to centurio prompt:
When you notice a conversation context reset, briefly acknowledge it:
"(Session refreshed — I've lost our earlier conversation.
Check my commentarii for any notes I saved.)"
Proposed: Pin Important Results
For significant outputs (research reports, audit results), Legatus can pin the message:
Caesar: @vorenus comprehensive audit of the codebase
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📌 Audit Report: Legio Codebase
[... detailed findings ...]
(Legatus pins this message)
This requires the pin_chat_message API call, a small code addition in bot.py.
6. Scalability Considerations
Current Limits
| Constraint | Current | Ceiling |
|---|---|---|
| Max centuriones | 10 (config) | Practical: ~20-30 before UX degrades |
| Parallel dispatch | Unlimited (asyncio.gather) | Telegram rate limits: ~30 msg/sec across all chats, ~1 msg/sec within a single chat |
| Session timeout | 30 min idle | Adjustable via config |
| Token budget | 200k per centurio | SDK limit; auto-reset at 150k |
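The parallel dispatch row can be illustrated with asyncio.gather. This sketch makes two assumptions that are not in the current design: `run_centurio` is a hypothetical stand-in for the real SDK call, and the optional semaphore cap shows one way to bound concurrency if "unlimited" becomes a problem.

```python
import asyncio


async def run_centurio(name: str, task: str) -> str:
    """Hypothetical placeholder for the real SDK call to one centurio."""
    await asyncio.sleep(0.01)
    return f"{name}: done ({task})"


async def dispatch_parallel(names: list, task: str, max_concurrent: int = 5) -> dict:
    """Run the same task on several centuriones and collect results by name."""
    sem = asyncio.Semaphore(max_concurrent)  # assumed cap, not in the current design

    async def guarded(name: str):
        async with sem:
            try:
                return name, await run_centurio(name, task)
            except Exception as exc:
                # One failing centurio should not cancel the others.
                return name, f"❌ {exc}"

    # gather preserves input order, so results line up with the mention order.
    pairs = await asyncio.gather(*(guarded(n) for n in names))
    return dict(pairs)
```

Catching exceptions inside each task keeps one failure from hiding the other centuriones' results.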
UX Degradation Points
5-10 centuriones: Current UX works fine. /status fits on one screen. @mentions are manageable.
10-20 centuriones: /status gets long. Need grouping:
📋 Centurio Status
Code & Engineering
✅ vorenus — Code Specialist
⏳ pullo — Test Engineer
Research & Analysis
✅ cicero — Research Analyst
💤 brutus — Data Analyst
Operations
✅ octavia — DevOps
This is a prompt-only change: instruct Legatus to categorize centuriones when displaying status.
20+ centuriones: Inline keyboards become valuable:
📋 Select a centurio:
[vorenus] [pullo] [cicero]
[brutus] [octavia] [marcus]
This requires code (inline keyboard in bot.py) but scales better than typed @mentions.
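A minimal sketch of the keyboard layout, built as a plain dict in the shape of the Bot API's InlineKeyboardMarkup (rows of buttons with text and callback_data). The helper name and the "select:" callback prefix are assumptions made for illustration.

```python
def build_centurio_keyboard(names: list, per_row: int = 3) -> dict:
    """Lay out one button per centurio, `per_row` buttons per keyboard row."""
    rows = [names[i:i + per_row] for i in range(0, len(names), per_row)]
    return {
        "inline_keyboard": [
            [
                # callback_data is limited to 64 bytes, so keep it short.
                {"text": name, "callback_data": f"select:{name}"}
                for name in row
            ]
            for row in rows
        ]
    }
```

The resulting dict would be passed as the reply_markup of the "Select a centurio" message.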
Response Time Budget
| Operation | Current | Target |
|---|---|---|
| First ⏳ acknowledgment | <100ms | <100ms ✅ |
| First status update | ~3s (debounce) | ~2s |
| Single centurio response | 5-30s | 5-30s (SDK dependent) |
| Parallel dispatch (3 centuriones) | 5-30s (bounded by the slowest centurio) | Same |
| Full round-trip with Legatus routing | 10-60s | 10-60s (two SDK calls) |
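The status update interval in the table could be enforced with a small gate in front of the message edit. This is a throttle-style reading of "debounce", sketched under assumptions: `StatusDebouncer` is a hypothetical class, and the clock is injected only so that the behavior can be checked without waiting.

```python
import time
from typing import Callable, Optional


class StatusDebouncer:
    """Allow at most one status update per `interval` seconds, and skip duplicates."""

    def __init__(self, interval: float = 3.0, clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval
        self.clock = clock
        self._last_sent: Optional[float] = None
        self._last_text: Optional[str] = None

    def should_send(self, text: str) -> bool:
        now = self.clock()
        if text == self._last_text:
            return False  # identical status, nothing new to show
        if self._last_sent is not None and now - self._last_sent < self.interval:
            return False  # too soon after the previous update
        self._last_sent = now
        self._last_text = text
        return True
```

Moving from the current ~3s to the ~2s target would then be a change to the `interval` argument.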
Proposed: Timeout Expectations
Add to Legatus prompt:
If a task will take significant processing time (>30 seconds),
tell Caesar upfront: "This may take a minute — I'm dispatching to vorenus."
7. Concrete Interaction Examples
Example 1: Simple Direct Message (no routing)
Caesar: What centuriones do I have?
Legatus: You currently have 3 centuriones:
• vorenus — Code Specialist (✅ idle)
• pullo — Test Engineer (✅ idle)
• cicero — Research Analyst (💤 disconnected)
Example 2: Single Mention
Caesar: @vorenus refactor the auth module to use async
⏳ vorenus is reading edicta...
⏳ vorenus is analyzing legio/auth.py...
⏳ vorenus is writing code...
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Refactored auth module to async. Changes:
• Converted 4 functions to async/await
• Updated 12 call sites
• All tests passing
Wrote details to commentarii for reference.
Example 3: Multi-Mention Parallel
Caesar: @vorenus @pullo audit the memoria module — code quality and test coverage
⏳ [vorenus] Reading edicta...
⏳ [pullo] Reading edicta...
⏳ [vorenus] Analyzing memoria/store.py...
⏳ [pullo] Running pytest --cov...
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Code quality: 8/10. Found 2 minor issues...
⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Test coverage: 100%. 94 tests, all passing...
Example 4: Auto-Routing by Legatus
Caesar: I need a deep analysis of our competitor's API documentation
⏳ Legatus is thinking...
Legatus: Routing to @cicero (Research Analyst) — this is a research task.
⏳ cicero is searching the web...
⏳ cicero is writing to commentarii...
⚔️ cicero — Research Analyst
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here's my analysis of the competitor API...
Example 5: Dangerous Action with Confirmation
Caesar: Remove cicero
Legatus: I will remove centurio cicero. This will:
• Delete prompt.md, tools.json, and all commentarii
• Disconnect the active session
• This action cannot be undone.
Reply "Confirmed" to proceed.
Caesar: Confirmed
Legatus: ✅ Removed centurio: cicero
Example 6: Error Recovery
Caesar: @vorenus deploy to production
⏳ vorenus is working...
⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Deployment failed: 3 tests are failing in test_auth.py.
I've written the error details to my commentarii.
Recommend fixing the tests before deploying.
8. Implementation Priority
All items are pure prompt changes unless marked otherwise.
| Priority | Item | Type | Effort |
|---|---|---|---|
| 1 | Dangerous action confirmation protocol | Prompt | 10 min |
| 2 | Richer attribution headers (name + role + separator) | Code (rendering) | 30 min |
| 3 | Routing narration for auto-dispatch | Prompt | 10 min |
| 4 | Humanized status messages ("is using..." not tool names) | Code (format_status) | 20 min |
| 5 | Reply-to-original threading | Code (bot.py) | 30 min |
| 6 | New commands (/edict, /edicta, /history, etc.) | Code (bot.py) | 2 hr |
| 7 | Centurio grouping in /status | Prompt | 10 min |
| 8 | Pin important results | Code (bot.py) | 30 min |
| 9 | Inline keyboard for centurio selection | Code (bot.py) | 2 hr |
9. Summary of Trade-offs
| Decision | Choice | Rationale |
|---|---|---|
| Flat vs. hierarchical commands | Flat | Telegram autocomplete; conversational UX |
| Per-agent bot vs. single bot | Single bot | Simpler ops; identity via headers |
| Code vs. prompt OTP | Prompt | "Prompt over code"; single-user system |
| Inline keyboards vs. @mentions | @mentions now, keyboards later | @mentions work at current scale |
| Forum topics vs. flat chat | Flat chat | Simpler; reply_to provides enough context |
| Streaming edits vs. batch | Batch (3s debounce) | Telegram rate limits; already implemented |