Telegram Interactive Mode — Feasibility & Rollout Plan

Date: 2026-02-15
Status: Proposal
Author: Antony

This proposal evaluates which Claude Code interactive-mode features can be reliably implemented in a single Telegram chat session, and lays out a phased rollout with fallbacks for Telegram's limitations.

Feature-by-Feature Viability

1. File I/O (Read / Write / Edit) — HIGH

Centuriones already access the filesystem via castra/. The SDK client runs server-side with full fs access; Caesar sees results as Telegram messages.

Aspect	Assessment
Read	Centurio tools already read `commentarii`, `acta`, `edicta`. Extending to arbitrary paths is trivial.
Write	Same — filesystem writes are server-side, no Telegram API involvement.
Edit	Doable via SDK tool definitions with old/new string replacement.
Challenge	Telegram's 4096-char message limit truncates large file contents.
Fallback	Large reads → send as Telegram document upload, or paginate with inline keyboard (prev/next).
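
The pagination fallback depends on splitting long output into message-sized pieces. A minimal sketch, assuming the limit is measured with Python's `len()` and that cutting at line breaks is preferable to cutting mid-line; `chunk_message` is a hypothetical helper, not an existing centurio function:

```python
TELEGRAM_LIMIT = 4096  # per-message character limit cited in the table above


def chunk_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split text into pieces of at most `limit` characters, preferring line breaks."""
    chunks = []
    while len(text) > limit:
        # Cut at the last newline inside the window; fall back to a hard cut.
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        # Drop the newline(s) at the boundary so the next chunk starts on content.
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then back one page of a prev/next keyboard, or the whole text can go out as a document once the chunk count passes some threshold.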

2. Bash Execution — MEDIUM-HIGH

Aspect	Assessment
Execution	Adding a `run_command` tool to centurio tool definitions is straightforward.
Output	Stdout/stderr captured server-side, sent as Telegram message.
Challenge 1	Long-running commands block the agent turn. Need async execution + timeout.
Challenge 2	Shell injection risk. Must whitelist or sandbox.
Challenge 3	Telegram shows "bot not responding" after ~30s. Status edits (already implemented) mitigate this.
Fallback	Background execution with polling: run async, post result when ready, keep typing indicator alive.
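
The async execution and timeout requirement can be sketched with the standard library alone. This assumes commands arrive as an argument list (no shell), which also narrows the injection surface; `run_with_timeout` is a hypothetical name, and this version reports a plain timeout marker instead of partial output:

```python
import asyncio
import sys
from typing import Optional


async def run_with_timeout(argv: list[str], timeout: float) -> tuple[Optional[int], str]:
    """Run argv without a shell; return (exit_code, output). exit_code is None on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the child process
        await proc.wait()  # reap it so no zombie is left behind
        return None, "timed out"
    return proc.returncode, out.decode(errors="replace")


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout([sys.executable, "-c", "print('ok')"], 10)))
```

Because the coroutine never blocks the event loop, the bot can keep refreshing its status message while the command runs.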

3. Glob/Grep Search — HIGH

Aspect	Assessment
Implementation	Define `search_files(pattern)` and `search_content(regex, glob)` as centurio tools. Pure server-side.
Output	Return file list or matched lines, truncated to Telegram limits.
Challenge	Large result sets need `head_limit` equivalent, paginated via reply buttons.
Existing pattern	SDK already supports tool definitions with structured params — maps directly.
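
A grep-style tool with a `head_limit` cap might look like the following sketch. The `root` parameter, the default limit of 50, and the returned truncation flag are assumptions added for illustration; the proposal only names `search_content(regex, glob)`:

```python
import re
from pathlib import Path


def search_content(root: Path, pattern: str, glob: str = "**/*",
                   head_limit: int = 50) -> tuple[list[str], bool]:
    """Return up to head_limit 'path:line: text' matches plus a truncation flag."""
    regex = re.compile(pattern)
    hits: list[str] = []
    for path in sorted(root.glob(glob)):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                if len(hits) == head_limit:
                    return hits, True  # more matches exist than we return
                hits.append(f"{path.relative_to(root)}:{lineno}: {line.strip()}")
    return hits, False
```

The truncation flag tells the bot whether to attach a "next" button to the reply.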

4. MCP Tools — HIGH (already the architecture)

Aspect	Assessment
Current state	Centuriones already consume MCP tools (memoria). Legatus has 6 built-in tools.
Extension	External MCP servers (web search, code analysis) are config-driven via SDK's `mcp_servers` param.
Challenge	Each MCP server is a subprocess, so resource cost scales linearly with the number of servers.
Fallback	Lazy initialization: spawn MCP servers only when centurio actually calls a tool. Already doing this with session manager.
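
Lazy initialization reduces to "start on first use, then reuse". The sketch below is generic and does not use the SDK: `LazyServers` and its factory callables are hypothetical stand-ins for whatever actually spawns an MCP subprocess.

```python
from typing import Callable, Dict, List


class LazyServers:
    """Start a server the first time it is requested, then reuse the same handle."""

    def __init__(self, factories: Dict[str, Callable[[], object]]):
        self._factories = factories          # name -> function that starts the server
        self._running: Dict[str, object] = {}

    def get(self, name: str) -> object:
        if name not in self._running:
            # First tool call for this server: pay the startup cost now.
            self._running[name] = self._factories[name]()
        return self._running[name]

    def running(self) -> List[str]:
        """Names of servers that have actually been started."""
        return sorted(self._running)
```

Servers that no centurio ever calls are never spawned, so the linear resource cost applies only to tools in active use.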

5. Plan Mode — MEDIUM (needs UX adaptation)

Aspect	Assessment
Concept	Caesar describes task → agent proposes plan → Caesar approves → agent executes.
Telegram mapping	Plan = formatted message with numbered steps. Approval = inline keyboard `[Approve] [Revise] [Cancel]`.
Challenge 1	No native "plan file". Serialize as structured message or store in `castra/plans/`.
Challenge 2	Multi-turn refinement requires state tracking. `Auctoritas` pattern (pending → approved → executed) maps well.
Challenge 3	Plans may exceed 4096 chars. Need multi-message or document fallback.
Implementation	Reuse `Auctoritas` state machine: `PlanRequest(pending) → approve → execute steps with status updates`.
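
The pending → approved → executed flow can be expressed as a small transition table. This is an illustrative sketch, not the existing `Auctoritas` code: the `PlanState` enum, the `cancelled` state, and the action names are assumptions derived from the three buttons above.

```python
from enum import Enum


class PlanState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    EXECUTED = "executed"
    CANCELLED = "cancelled"  # assumed terminal state for the [Cancel] button


# (current state, action) -> next state; anything not listed is rejected.
TRANSITIONS = {
    (PlanState.PENDING, "approve"): PlanState.APPROVED,
    (PlanState.PENDING, "revise"): PlanState.PENDING,   # plan is re-proposed
    (PlanState.PENDING, "cancel"): PlanState.CANCELLED,
    (PlanState.APPROVED, "complete"): PlanState.EXECUTED,
}


def transition(state: PlanState, action: str) -> PlanState:
    """Return the next state, or raise ValueError for an illegal move."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a plan in state {state.value}") from None
```

Keeping the legal moves in one table makes stale button presses (for example, tapping Approve on an already executed plan) fail loudly instead of re-running steps.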

6. Subagent Delegation — HIGH (this IS the architecture)

Aspect	Assessment
Current state	Legatus already delegates to centuriones — the core dispatch pattern.
Parallel	`asyncio.gather()` on multiple `dispatch_to_centurio()` calls. Async infrastructure exists.
Challenge	Multiple concurrent centuriones = multiple SDK sessions = token cost. Token tracker handles per-session budgets.
Enhancement	Add `spawn_task` tool for ephemeral sub-centuriones (one-off research, no persistence).

Cross-Cutting Constraints

API Rate Limits

Limit	Impact	Mitigation
Telegram message edits: ~30/min per chat	Status updates	Already debounced to 3s intervals
Telegram message sends: ~30/s across all chats, about 1/s within a single chat	Chunked responses	Space chunks about 1s apart; on HTTP 429, wait for the returned `retry_after`
Anthropic API: varies by tier	Concurrent centuriones	Queue dispatches, retry with backoff (SDK handles)
MCP server calls: provider-dependent	External tools	Per-tool retry config in `tools.json`
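
Spacing out status edits can be handled by a small gate in front of the edit call. The sketch below is strictly a throttle (it drops updates that arrive too soon) rather than a true debounce, and `EditThrottle` is a hypothetical name; the injectable clock exists only to make it testable.

```python
import time
from typing import Callable, Optional


class EditThrottle:
    """Allow at most one status edit per `min_interval` seconds; drop the rest."""

    def __init__(self, min_interval: float = 3.0,
                 clock: Callable[[], float] = time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last: Optional[float] = None  # time of the last allowed edit

    def allow(self) -> bool:
        now = self._clock()
        if self._last is None or now - self._last >= self.min_interval:
            self._last = now
            return True
        return False
```

At a 3-second interval this yields at most 20 edits per minute, which stays under the ~30/min figure in the table.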

Context Window

Constraint	Current	Gap
200k token window	Token tracker resets at 150k	Sufficient
History injection	Last 50 nuntii as XML per dispatch	May grow large with verbose outputs
Multi-centurio state	Independent sessions	No cross-centurio sharing beyond praetorium
Plan + execution context	Not implemented	Plans could consume significant context if inlined

State Persistence

State Type	Current	Interactive Mode Need
Session state	In-memory, lost on restart	Sufficient for single-user
Message history	SQLite (praetorium)	Already durable
Plans	Not implemented	Store in `castra/plans/{id}.md`
Running tasks	Not tracked	Need in-memory task registry
File edit history	Not tracked	Git integration or simple changelog
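
The in-memory task registry could be as small as a dictionary of `asyncio.Task` objects that cleans itself up. A sketch under that assumption; `TaskRegistry` and its method names are hypothetical, and like session state it would be lost on restart:

```python
import asyncio
from typing import Coroutine, Dict, List


class TaskRegistry:
    """Track running background tasks by id; entries vanish when a task finishes."""

    def __init__(self) -> None:
        self._tasks: Dict[str, asyncio.Task] = {}

    def start(self, task_id: str, coro: Coroutine) -> asyncio.Task:
        if task_id in self._tasks:
            coro.close()  # avoid a "never awaited" warning for the rejected coroutine
            raise ValueError(f"task {task_id!r} is already running")
        task = asyncio.create_task(coro)
        self._tasks[task_id] = task
        # Remove the entry on completion, failure, or cancellation.
        task.add_done_callback(lambda _t: self._tasks.pop(task_id, None))
        return task

    def active(self) -> List[str]:
        return sorted(self._tasks)

    def cancel(self, task_id: str) -> bool:
        task = self._tasks.get(task_id)
        if task is None:
            return False
        task.cancel()
        return True
```

`active()` gives a status command something to list, and `cancel()` gives a Cancel button something to call.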

Session Timeouts

Timeout	Value	Impact
Telegram webhook	60s to respond	Long tasks must use async pattern
Centurio idle	30 min (configurable)	Sessions reaped, re-created on next dispatch
TOTP validity	120s TTL	Tight window, appropriate for security
Anthropic SDK	Configurable per-call	Long tool chains may need extended timeout

What Cannot Be Reliably Replicated

Claude Code Feature	Why
Interactive file editing with line numbers	No code editor in Telegram. Best: show diff → confirm → apply.
Real-time streaming output	Edit-message simulates streaming but limited to ~30 edits/min and 4096 chars.
Multi-file simultaneous view	No split-pane. Must serialize as sequential messages.
Plan mode file editing	Caesar can't edit a plan file. Must use message-based approval/revision.
Notebook editing	No Jupyter equivalent. Could render cell outputs as messages.
Browser automation	No equivalent. Would need headless browser MCP server.

Phased Rollout

Phase 0 — Foundation (COMPLETE)

Already implemented as part of the earlier "Phase 1 Telegram UX" work (distinct from Phase 1 of this plan):

Telegram bot with commands, TOTP, threading, status updates
Centurio dispatch with session management
Memoria (edicta/acta/commentarii) for persistent knowledge

Phase 1 — File & Search Tools

Effort: Low | Risk: Low | Trigger: Immediate

Add to centurio tool definitions:

read_file(path) — read with 4096-char truncation, option to send as document
write_file(path, content) — write to castra/-scoped paths only
search_files(glob_pattern) — return matching paths
search_content(pattern, glob) — grep equivalent, matches with line numbers

Security: Path traversal prevention — all paths resolved relative to castra/, reject .. components.
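
One way to enforce the castra/ scope is to resolve the requested path and check that the result still sits under the root. This catches `..` components as well as absolute paths and symlinks that point outside. A minimal sketch; `resolve_in_castra` is a hypothetical helper name:

```python
from pathlib import Path


def resolve_in_castra(root: Path, user_path: str) -> Path:
    """Resolve user_path under root; raise PermissionError if it escapes root."""
    root = root.resolve()
    # Joining with an absolute user_path discards root, so the check below
    # rejects absolute paths too. resolve() also follows symlinks.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes castra/: {user_path}")
    return candidate
```

Every file tool would call this first and operate only on the returned path, never on the raw string supplied by the model.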

Fallback: Large outputs → send as Telegram document attachment.

Phase 2 — Command Execution

Effort: Medium | Risk: Medium | Trigger: After Phase 1 proves the pattern

run_command(cmd, timeout) — sandboxed execution in project directory
Whitelist approach: allowed commands configurable in legio.toml
Async execution: commands > 10s run in background, result posted when done
Typing indicator kept alive during execution (already implemented)

Security: Destructive commands (rm, git push, etc.) require TOTP confirmation via existing Auctoritas flow.

Fallback: Timeout → cancel + report partial output. Blocked by whitelist → TOTP escalation.
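
The whitelist and TOTP escalation rules can be combined into a single classification step that runs before execution. In this sketch the command sets are hard-coded placeholders (the proposal reads them from legio.toml), and `classify` and its three return values are hypothetical:

```python
import shlex

# Placeholder policy; in the proposal these come from legio.toml.
ALLOWED = {"ls", "cat", "git", "pytest"}
DESTRUCTIVE_PREFIXES = [("rm",), ("git", "push")]


def classify(cmd: str) -> str:
    """Return 'allow', 'confirm' (TOTP required), or 'reject' (unparsable)."""
    try:
        argv = shlex.split(cmd)
    except ValueError:
        return "reject"  # e.g. unbalanced quotes
    if not argv:
        return "reject"
    for prefix in DESTRUCTIVE_PREFIXES:
        if tuple(argv[: len(prefix)]) == prefix:
            return "confirm"  # destructive: always needs TOTP
    # Anything off the whitelist escalates to TOTP instead of failing outright.
    return "allow" if argv[0] in ALLOWED else "confirm"
```

Parsing with `shlex.split` and executing the resulting argument list without a shell means metacharacters such as `;` or `&&` are never interpreted, so `ls; rm -rf /` is seen as an unknown command named `ls;` and escalated.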

Phase 3 — Plan Mode

Effort: Medium | Risk: Low | Trigger: Caesar requests structured task execution

New Legatus tool: propose_plan(title, steps: list[str])
Rendered as numbered message with inline keyboard: [Approve] [Revise] [Cancel]
Plan stored in castra/plans/{timestamp}-{slug}.md
On approval: execute steps sequentially, update status per-step
Reuse Auctoritas state machine pattern

Fallback: Inline keyboards unavailable → text confirmation ("Reply APPROVE to proceed").
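
The rendered plan message maps onto the Bot API's `sendMessage` payload, where `reply_markup` carries an `inline_keyboard` of button rows. The sketch below only builds that payload; `render_plan` and the `plan:{id}:{action}` callback format are assumptions, and the caller would still add `chat_id` before sending.

```python
import json


def render_plan(plan_id: str, title: str, steps: list[str]) -> dict:
    """Build message text plus an [Approve] [Revise] [Cancel] inline keyboard."""
    lines = [title, ""] + [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    buttons = []
    for label in ("Approve", "Revise", "Cancel"):
        data = f"plan:{plan_id}:{label.lower()}"
        # Telegram limits callback_data to 1-64 bytes.
        if not 1 <= len(data.encode("utf-8")) <= 64:
            raise ValueError(f"callback_data too long: {data!r}")
        buttons.append({"text": label, "callback_data": data})
    # One row containing all three buttons.
    return {"text": "\n".join(lines), "reply_markup": {"inline_keyboard": [buttons]}}


if __name__ == "__main__":
    print(json.dumps(render_plan("42", "Refactor parser", ["Read parser.py", "Write tests"]), indent=2))
```

The callback handler can split `callback_data` on `:` to recover the plan id and the action to apply.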

Phase 4 — Parallel Subagents

Effort: Low (infrastructure exists) | Risk: Medium (cost) | Trigger: Caesar needs multi-agent research tasks

New Legatus tool: spawn_task(description, agent_type) — ephemeral centurio
Parallel execution via asyncio.gather()
Results aggregated and posted as threaded replies
Token budget enforcement per-task

Fallback: Sequential execution if concurrent sessions exceed configured limit.
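
Parallel dispatch with a concurrency cap can be sketched with `asyncio.gather()` and a semaphore. Setting the limit to 1 gives the sequential fallback for free. `run_limited` is a hypothetical helper; each job is assumed to be a zero-argument callable returning a coroutine, standing in for a real dispatch call.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def run_limited(jobs: List[Callable[[], Awaitable[Any]]], limit: int) -> List[Any]:
    """Run jobs concurrently, at most `limit` at a time; results keep input order."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[Any]]) -> Any:
        async with sem:  # wait here while `limit` jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Passing callables instead of already-created coroutines means no session is opened until a slot is actually free.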

Phase 5 — External MCP Integration

Effort: Medium | Risk: Medium | Trigger: Centuriones need external data sources

Web search, code analysis, documentation lookup via MCP servers
Configured per-centurio in tools.json
Lazy server initialization (only when tool is called)
Rate limiting per MCP provider

Fallback: MCP server down → centurio reports unavailability, suggests alternative approach.

Recommendation

Start with Phase 1 (File & Search) — highest value-to-effort ratio. Centuriones that can read, write, and search the codebase become useful for code review, documentation, and debugging. The infrastructure (SDK tools, session management, message formatting) is already in place. Main work is tool schemas and output formatting for Telegram's message limits.

Phase 2 (Bash) unlocks autonomous execution but needs the security whitelist. Phase 3 (Plan Mode) is valuable but less urgent since prompt-level confirmation exists. Phases 4–5 are incremental on existing architecture.
