Telegram Chat UX — Legio Command Interface

A professional UX design for managing a legion of AI agents through Telegram.

1. Command Structure & Mention Patterns

Current Commands (keep as-is)

Command	Purpose
`/status`	Show centuriones with status icons
`/list`	List centuriones with descriptions
`/create <name> <desc>`	Create a new centurio
`/help`	Show command reference

Proposed New Commands

Command	Purpose	Example
`/edict <text>`	Publish a standing order	`/edict Always write tests before code`
`/edicta`	List all standing orders	`/edicta`
`/revoke <name>`	Revoke an edictum	`/revoke no-console-logs`
`/acta`	Show recent shared knowledge	`/acta`
`/history [n]`	Show last N praetorium messages	`/history 20`
`/reset <name>`	Reset a centurio's session	`/reset vorenus`
`/remove <name>`	Remove a centurio (requires confirmation)	`/remove vorenus`

Mention Patterns (keep current — already well-designed)

@vorenus do this                 → single centurio
@vorenus @pullo do this          → parallel dispatch
Just do this                     → Legatus routes (or handles itself)
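
The three patterns above can be told apart with a small parser. The following is a minimal sketch, not the project's actual routing code: the name `parse_mentions`, the mention regex, and the idea of passing in the set of known centuriones are all assumptions made for illustration.

```python
import re

# Assumed mention syntax: "@" followed by a lowercase identifier.
MENTION_RE = re.compile(r"@([a-z][a-z0-9_]*)")


def parse_mentions(text: str, known: set) -> tuple:
    """Return (mentioned centuriones in order, text with those mentions removed).

    An empty name list means the message goes to Legatus for routing.
    Mentions of unknown names are left in the text untouched.
    """
    names = []
    for match in MENTION_RE.finditer(text):
        name = match.group(1)
        if name in known and name not in names:
            names.append(name)

    def strip_known(match):
        return "" if match.group(1) in known else match.group(0)

    # Remove recognised mentions, then collapse the leftover whitespace.
    body = " ".join(MENTION_RE.sub(strip_known, text).split())
    return names, body
```

One name means a single dispatch, several names mean a parallel dispatch, and none means Legatus decides.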

Trade-off: Flat vs. Hierarchical Commands

Recommendation: stay flat. Telegram's command autocomplete only shows top-level commands. Subcommands (e.g., /centurio create, /centurio remove) won't appear in suggestions and feel unnatural in chat. Flat commands (/create, /remove) are faster to type and align with the conversational paradigm.

2. Agent Identity & Status Signaling

The Constraint

Telegram bots cannot change their display name or avatar per message. A single bot token = one identity. To simulate multiple centuriones speaking through one bot, identity must be conveyed in-message.

Current Pattern (⚔️ attribution header)

⚔️ vorenus

Your response here...

Proposed Enhancement: Richer Headers with Role Context

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Your response here...

Name + one-line role reminds Caesar who this centurio is
Thin separator (Unicode ━) visually groups header from content
Legatus responses have no header (the default voice — like a CO speaking directly)
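
A rough sketch of how the header rules above could be rendered. The function name `format_message`, its parameters, and the separator width are assumptions; only the header layout itself comes from the proposal.

```python
from typing import Optional

HEADER_ICON = "\u2694\ufe0f"  # the crossed-swords emoji used in the header
SEPARATOR_WIDTH = 28          # assumed width of the thin rule


def format_message(body: str, name: Optional[str] = None, role: Optional[str] = None) -> str:
    """Prefix a centurio reply with its header; Legatus (name=None) gets none."""
    if name is None:
        return body
    title = f"{HEADER_ICON} {name}"
    if role:
        # \u2014 is the dash that separates name and role in the proposed header.
        title += f" \u2014 {role}"
    return f"{title}\n{'━' * SEPARATOR_WIDTH}\n\n{body}"
```

Passing no name yields the unheadered Legatus voice, so callers do not need a separate code path for it.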

Status Icons (unified vocabulary)

Icon	Status	Meaning
✅	`idle`	Ready for orders
⏳	`working`	Processing a task
❌	`error`	Last task failed
💤	`disconnected`	Session reaped (idle timeout)
🆕	`new`	Just created, never used
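
One way to keep this vocabulary in a single place is an enum plus a lookup table. This is a sketch under assumptions: the names `Status`, `STATUS_ICONS`, and `status_line` are hypothetical, and only the status strings and icons are taken from the table above.

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    WORKING = "working"
    ERROR = "error"
    DISCONNECTED = "disconnected"
    NEW = "new"


STATUS_ICONS = {
    Status.IDLE: "✅",
    Status.WORKING: "⏳",
    Status.ERROR: "❌",
    Status.DISCONNECTED: "💤",
    Status.NEW: "🆕",
}


def status_line(name: str, role: str, status: str) -> str:
    """Render one /status row; an unknown status string raises ValueError."""
    icon = STATUS_ICONS[Status(status)]
    return f"{icon} {name} \u2014 {role}"
```

Routing every status through `Status(...)` means a misspelled status fails loudly instead of rendering without an icon.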

Live Status Updates (already implemented)

The _make_status_callback pattern is good. Proposed refinements to format_status():

⏳ Legatus is thinking...
⏳ vorenus is searching the web...
⏳ vorenus is writing to commentarii...
⏳ Dispatching to pullo...

Use natural English ("is searching the web...") instead of bare tool names such as search_web. Caesar doesn't care about internal tool identifiers.
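
A minimal sketch of such a refinement, assuming format_status receives the agent name and an optional tool identifier (the real signature is not shown in this document). The `TOOL_PHRASES` table is hypothetical, and `write_commentarii` is an invented identifier used only for illustration.

```python
from typing import Optional

# Hypothetical mapping from internal tool identifiers to user-facing phrases.
TOOL_PHRASES = {
    "search_web": "searching the web",
    "write_commentarii": "writing to commentarii",  # invented identifier
}


def format_status(agent: str, tool: Optional[str] = None) -> str:
    """Describe what an agent is doing without exposing tool identifiers."""
    if tool is None:
        return f"⏳ {agent} is thinking..."
    # Unmapped tools fall back to a generic phrase rather than leaking the name.
    phrase = TOOL_PHRASES.get(tool, "working")
    return f"⏳ {agent} is {phrase}..."
```

The generic fallback keeps new tools from surfacing raw identifiers before someone adds a phrase for them.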

3. Inter-Agent Communication Protocols

Current Architecture

Caesar ──→ Legatus ──→ Centurio(nes)
                  ↕
              Praetorium (message bus)

All communication is mediated by Legatus — centuriones cannot talk to each other directly. This is correct and should stay.

Communication Paths

Path	Mechanism	Caesar Sees
Caesar → Legatus	Free-form text	—
Caesar → Centurio	`@mention` in text	—
Legatus → Centurio	`dispatch_to_centurio` tool	Live status update
Centurio → Caesar	Response via Legatus	Attributed message
Centurio → Centurio	Via acta (shared knowledge)	Nothing (async, pull-based)
Legatus → Caesar	Direct response	Unheadered message

Proposed: Visible Routing Narration

When Legatus auto-routes (no @mention), show a brief narration before dispatching:

Caesar: Research the latest Claude SDK changes

Legatus: Routing to @vorenus (Code Specialist) — this is a technical research task.

⏳ vorenus is working...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here are the recent Claude SDK changes...

This helps Caesar understand why Legatus chose a particular centurio. It can be implemented purely via the Legatus prompt, with no code change needed.

Proposed: Cross-Centurio Handoff Pattern

Add to Legatus prompt: when one centurio's response suggests work for another, Legatus can chain dispatches. Caesar sees each step:

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found 3 bugs in the auth module. Recommending pullo review the test coverage.

⏳ Legatus is dispatching to pullo...

⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Reviewed auth module tests. Added 5 missing test cases. Coverage now 100%.

This is achievable entirely through the Legatus prompt: instruct it to read centurio responses and decide whether to chain another dispatch.

4. Authorization Workflows (OTP via Prompt)

Design: Prompt-Based Confirmation Protocol

Following the "prompt over code" philosophy, dangerous actions are gated by a confirmation protocol defined entirely in the Legatus system prompt.

Dangerous Action Categories

Action	Risk Level	Confirmation Required
`remove_centurio`	🔴 High	Full confirmation
`revoke_edictum`	🟡 Medium	Full confirmation
`create_centurio`	🟢 Low	None (reversible)
`publish_edictum`	🟢 Low	None (revocable)
`dispatch_to_centurio`	🟢 Low	None (removes nothing from the legion)

Confirmation Protocol (added to Legatus prompt)

## Dangerous Actions — Require Caesar's Confirmation

The following actions are DESTRUCTIVE and IRREVERSIBLE:
- Removing a centurio (`remove_centurio`) — deletes all centurio data
- Revoking an edictum (`revoke_edictum`) — removes a standing order

Before executing any destructive action:
1. Explain clearly WHAT will be lost and WHY it cannot be undone.
2. Show the exact action: "I will remove centurio: vorenus"
3. Ask Caesar to confirm by replying: "Confirmed"
4. ONLY call the tool AFTER receiving "Confirmed" as a direct reply.
5. If Caesar says anything else, ABORT and acknowledge the cancellation.

Never combine a destructive action with other actions in the same turn.

Example Interaction

Caesar: Remove vorenus, he's not useful anymore

Legatus: I will remove centurio vorenus. This will:
  • Delete his prompt, tools config, and commentarii
  • Disconnect his active session
  • This cannot be undone.

  Reply "Confirmed" to proceed.

Caesar: Confirmed

Legatus: ✅ Removed centurio: vorenus

Trade-off: Prompt vs. Code Enforcement

Approach	Pros	Cons
Prompt-only (recommended)	Zero code changes, flexible, easy to adjust rules	LLM could skip in edge cases; no hard guarantee
Code-level OTP	Hard guarantee, tamper-proof	More code, more tests, breaks "prompt over code" philosophy
Hybrid	Best of both	Over-engineered for single-user system

Recommendation: Prompt-only for now. Caesar is the sole user — prompt adherence is sufficient. If the legion grows to serve multiple users, revisit with code-level enforcement.
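
For comparison, the "Code-level OTP" row could look roughly like the sketch below if it is ever revisited. This is not part of the recommendation. The class `ConfirmationGate` and its methods are hypothetical; the rules it encodes (exact "Confirmed" reply, anything else aborts) mirror the prompt protocol above.

```python
from typing import Dict, Optional, Tuple

DESTRUCTIVE = {"remove_centurio", "revoke_edictum"}


class ConfirmationGate:
    """Hold destructive tool calls until the next reply is exactly "Confirmed"."""

    def __init__(self) -> None:
        # chat_id -> (tool, target) awaiting confirmation
        self.pending: Dict[int, Tuple[str, str]] = {}

    def request(self, chat_id: int, tool: str, target: str) -> str:
        """Return "execute" for safe tools, "ask" after parking a destructive one."""
        if tool not in DESTRUCTIVE:
            return "execute"
        self.pending[chat_id] = (tool, target)
        return "ask"

    def reply(self, chat_id: int, text: str) -> Optional[Tuple[str, str]]:
        """Return the parked action if confirmed; any other reply aborts it."""
        action = self.pending.pop(chat_id, None)
        if action is not None and text.strip() == "Confirmed":
            return action
        return None
```

Because `reply` always clears the pending entry, a stray message cancels the action and a later "Confirmed" cannot revive it.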

5. Message Threading & Context Preservation

Telegram Constraint

Private chats are flat — no native threading. But reply_to_message_id creates visual context chains.

Proposed: Reply-to-Original Threading

When a centurio responds, reply to Caesar's original message:

Caesar: @vorenus check the auth module       ← message_id: 42
    └── ⚔️ vorenus — Code Specialist         ← reply_to: 42
        Found 3 issues in auth...

This creates a visual thread in Telegram. Caesar can tap the reply header to jump back to the original request.
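
As a sketch, the reply link is one extra field on the outgoing message. The helper name `build_reply` is hypothetical; the payload keys are the Bot API sendMessage fields, and newer Bot API versions also offer reply_parameters for the same purpose.

```python
def build_reply(chat_id: int, original_message_id: int, text: str) -> dict:
    """Build a sendMessage payload that replies to Caesar's original message."""
    return {
        "chat_id": chat_id,
        "text": text,
        # Links this message to the request it answers (42 in the example above).
        "reply_to_message_id": original_message_id,
    }
```

The original message_id has to be captured when Caesar's request arrives and carried through the dispatch so it is still available when the centurio answers.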

Proposed: Context Summary for Long Conversations

When a session resets (token limit), the centurio loses context. Add to centurio prompt:

When you notice a conversation context reset, briefly acknowledge it:
"(Session refreshed — I've lost our earlier conversation.
 Check my commentarii for any notes I saved.)"

Proposed: Pin Important Results

For significant outputs (research reports, audit results), Legatus can pin the message:

Caesar: @vorenus comprehensive audit of the codebase

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📌 Audit Report: Legio Codebase
[... detailed findings ...]

(Legatus pins this message)

This requires the pin_chat_message API call — a small code addition in bot.py.

6. Scalability Considerations

Current Limits

Constraint	Current	Ceiling
Max centuriones	10 (config)	Practical: ~20-30 before UX degrades
Parallel dispatch	Unlimited (asyncio.gather)	Telegram rate limits: ~30 msg/sec across all chats, ~1 msg/sec within one chat
Session timeout	30 min idle	Adjustable via config
Token budget	200k per centurio	SDK limit; auto-reset at 150k
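
The parallel dispatch row can be illustrated with a short asyncio.gather sketch. The function `dispatch_parallel` and the `dispatch` callable it takes are assumptions standing in for whatever bot.py actually does; the error formatting is also illustrative.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def dispatch_parallel(
    names: List[str],
    task: str,
    dispatch: Callable[[str, str], Awaitable[str]],
) -> Dict[str, str]:
    """Run one task on several centuriones concurrently.

    A failure in one centurio becomes an error string for that centurio
    instead of cancelling the others.
    """
    results = await asyncio.gather(
        *(dispatch(name, task) for name in names),
        return_exceptions=True,
    )
    return {
        name: f"❌ {result}" if isinstance(result, Exception) else result
        for name, result in zip(names, results)
    }
```

Since gather waits for every coroutine, total latency is set by the slowest centurio, and the replies still need to be sent to Telegram at a pace that respects its rate limits.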

UX Degradation Points

5-10 centuriones: Current UX works fine. /status fits on one screen. @mentions are manageable.

10-20 centuriones: /status gets long. Need grouping:

📋 Centurio Status

Code & Engineering
  ✅ vorenus — Code Specialist
  ⏳ pullo — Test Engineer

Research & Analysis
  ✅ cicero — Research Analyst
  💤 brutus — Data Analyst

Operations
  ✅ octavia — DevOps

This is a prompt-only change — instruct Legatus to categorize centuriones when displaying status.
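
If the grouping ever needs to be deterministic rather than left to the prompt, the same layout could be produced in code. This is a sketch only: `Entry`, its `category` field, and `render_grouped_status` are hypothetical, since the document does not define a category attribute for centuriones.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    name: str
    role: str
    icon: str      # status icon, already resolved by the caller
    category: str  # hypothetical field, not defined in this document


def render_grouped_status(entries: List[Entry]) -> str:
    """Render /status with centuriones grouped under category headings."""
    groups: Dict[str, List[Entry]] = defaultdict(list)
    for entry in entries:
        groups[entry.category].append(entry)

    lines = ["📋 Centurio Status"]
    for category, members in groups.items():  # categories keep first-seen order
        lines.append("")
        lines.append(category)
        lines.extend(f"  {e.icon} {e.name} \u2014 {e.role}" for e in members)
    return "\n".join(lines)
```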

20+ centuriones: Inline keyboards become valuable:

📋 Select a centurio:
[vorenus] [pullo] [cicero]
[brutus]  [octavia] [marcus]

This requires code (inline keyboard in bot.py) but scales better than typed @mentions.
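
A minimal sketch of the keyboard layout, expressed as the reply_markup structure the Bot API expects. The helper name `build_keyboard` and the `select:` callback prefix are assumptions; callback_data is limited to 64 bytes, which short centurio names fit comfortably.

```python
from typing import Dict, List


def build_keyboard(names: List[str], per_row: int = 3) -> Dict[str, list]:
    """Lay out one button per centurio, `per_row` buttons to a row."""
    buttons = [{"text": name, "callback_data": f"select:{name}"} for name in names]
    # Chunk the flat button list into rows of at most `per_row` buttons.
    rows = [buttons[i:i + per_row] for i in range(0, len(buttons), per_row)]
    return {"inline_keyboard": rows}
```

The returned dict would be passed as reply_markup when sending the "Select a centurio" message, and the bot would then handle the resulting callback query.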

Response Time Budget

Operation	Current	Target
First ⏳ acknowledgment	<100ms	<100ms ✅
First status update	~3s (debounce)	~2s
Single centurio response	5-30s	5-30s (SDK dependent)
Parallel dispatch (3 centuriones)	5-30s (bounded by the slowest centurio)	Same
Full round-trip with Legatus routing	10-60s	10-60s (two SDK calls)
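
The status-update interval in the table could be enforced with a small rate gate. The sketch below is a leading-edge throttle, which is one reading of the "debounce" named in the table and not necessarily how the existing code behaves; `StatusDebouncer` and the injectable clock are assumptions.

```python
import time
from typing import Callable, Optional


class StatusDebouncer:
    """Let a status edit through at most once per `interval` seconds."""

    def __init__(self, interval: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval
        self.clock = clock  # injectable so tests do not have to sleep
        self._last_sent: Optional[float] = None

    def should_send(self) -> bool:
        """Return True and record the time if enough time has passed."""
        now = self.clock()
        if self._last_sent is None or now - self._last_sent >= self.interval:
            self._last_sent = now
            return True
        return False
```

Moving from ~3s to the ~2s target would then be a change to `interval`, traded off against Telegram's per-chat rate limit.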

Proposed: Timeout Expectations

Add to Legatus prompt:

If a task will take significant processing time (>30 seconds),
tell Caesar upfront: "This may take a minute — I'm dispatching to vorenus."

7. Concrete Interaction Examples

Example 1: Simple Direct Message (no routing)

Caesar: What centuriones do I have?

Legatus: You currently have 3 centuriones:
  • vorenus — Code Specialist (✅ idle)
  • pullo — Test Engineer (✅ idle)
  • cicero — Research Analyst (💤 disconnected)

Example 2: Single Mention

Caesar: @vorenus refactor the auth module to use async

⏳ vorenus is reading edicta...
⏳ vorenus is analyzing legio/auth.py...
⏳ vorenus is writing code...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Refactored auth module to async. Changes:
• Converted 4 functions to async/await
• Updated 12 call sites
• All tests passing

Wrote details to commentarii for reference.

Example 3: Multi-Mention Parallel

Caesar: @vorenus @pullo audit the memoria module — code quality and test coverage

⏳ [vorenus] Reading edicta...
⏳ [pullo] Reading edicta...
⏳ [vorenus] Analyzing memoria/store.py...
⏳ [pullo] Running pytest --cov...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Code quality: 8/10. Found 2 minor issues...

⚔️ pullo — Test Engineer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Test coverage: 100%. 94 tests, all passing...

Example 4: Auto-Routing by Legatus

Caesar: I need a deep analysis of our competitor's API documentation

⏳ Legatus is thinking...

Legatus: Routing to @cicero (Research Analyst) — this is a research task.

⏳ cicero is searching the web...
⏳ cicero is writing to commentarii...

⚔️ cicero — Research Analyst
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Here's my analysis of the competitor API...

Example 5: Dangerous Action with Confirmation

Caesar: Remove cicero

Legatus: I will remove centurio cicero. This will:
  • Delete prompt.md, tools.json, and all commentarii
  • Disconnect the active session
  • This action cannot be undone.

  Reply "Confirmed" to proceed.

Caesar: Confirmed

Legatus: ✅ Removed centurio: cicero

Example 6: Error Recovery

Caesar: @vorenus deploy to production

⏳ vorenus is working...

⚔️ vorenus — Code Specialist
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Deployment failed: 3 tests are failing in test_auth.py.
I've written the error details to my commentarii.
Recommend fixing the tests before deploying.

8. Implementation Priority

The Type column marks each item as a prompt change or a code change.

Priority	Item	Type	Effort
1	Dangerous action confirmation protocol	Prompt	10 min
2	Richer attribution headers (name + role + separator)	Code (rendering)	30 min
3	Routing narration for auto-dispatch	Prompt	10 min
4	Humanized status messages (natural phrases, not tool names)	Code (format_status)	20 min
5	Reply-to-original threading	Code (bot.py)	30 min
6	New commands (/edict, /edicta, /history, etc.)	Code (bot.py)	2 hr
7	Centurio grouping in /status	Prompt	10 min
8	Pin important results	Code (bot.py)	30 min
9	Inline keyboard for centurio selection	Code (bot.py)	2 hr

9. Summary of Trade-offs

Decision	Choice	Rationale
Flat vs. hierarchical commands	Flat	Telegram autocomplete; conversational UX
Per-agent bot vs. single bot	Single bot	Simpler ops; identity via headers
Code vs. prompt OTP	Prompt	"Prompt over code"; single-user system
Inline keyboards vs. @mentions	@mentions now, keyboards later	@mentions work at current scale
Forum topics vs. flat chat	Flat chat	Simpler; reply_to provides enough context
Streaming edits vs. batch	Batch (3s debounce)	Telegram rate limits; already implemented
