--- title: Conversational System url: https://blog.guigpap.com/en/workflows/systeme-conversationnel/ url_md: https://blog.guigpap.com/en/workflows/systeme-conversationnel.md category: tooling date: '2026-03-28' maturite: production techno: - n8n - telegram - claude - odoo application: - ai - knowledge - operations --- # Conversational System > Multi-turn AI agent with action plans, MCP tools and persistent memory inside Telegram ## 1. What? — Definition and context Beyond simple commands (`/docker status`, `/n8n workflows`), some requests require a real conversation: "find the Dupont contact in Odoo, then show me his unpaid invoices, and create a follow-up project". This requires context, memory between messages, and the ability to execute multi-step actions. The **Conversational System** turns the Telegram bot into an AI assistant capable of holding multi-turn conversations, calling tools (Docker, Odoo, N8N, web search), executing multi-step plans, and dynamically selecting MCP tools — all with persistent memory through Redis. > **Note - Multi-turn conversation** > > A **multi-turn conversation** means the AI remembers what you said previously. If you ask "find Jean", then "show his invoices", the AI knows that "his" refers to Jean thanks to the session context. ### The three layers of the system | Layer | Components | Role | |-------|------------|------| | **Conversations** | Agent (20n), Manager (20n), Command Handlers (35n), Callback Handler (94n), Summarizer (12n) | Conversation lifecycle, routing, memory | | **Tools** | Tool Router (10n), Web Search (6n), Lead Intelligence (11n) | Action execution and enrichment | | **Plans & MCP** | Plan Engine (55n), MCP Menu, MCP Confirmation (3n) | Multi-step actions and advanced tools | ### Overall architecture ```mermaid flowchart TD Msg["Text message · active conversation"] subgraph Agent["Conversation Agent · 20 nodes"] direction TB Prompt["Build System Prompt"] First["Call Claude · 1st pass"] Detect{"Response type"} Second["Claude 2nd pass · summarise"] Send["Telegram send"] end Text["Plain text"] Tool["tool block · Tool Router 10n"] Plan["plan block · Plan Engine 55n"] MCP["mcp block · cli-ollama"] Msg --> Prompt --> First --> Detect Detect --> Text --> Send Detect --> Tool --> Second Detect --> Plan --> Second Detect --> MCP --> Second Second --> Send ``` --- ## 2. Why? — Stakes and motivations Before the conversational system, every interaction with the bot was atomic. You sent a command, you received a reply, and the bot forgot everything. For complex tasks, you had to chain commands manually by copying the output of one step into the next. ### Problems solved | Problem | Without conversations | With conversations | |---------|----------------------|--------------------| | **No memory** | Bot forgets between messages | Persistent Redis session | | **Atomic actions** | One command = one action | Automatic multi-step plans | | **No context** | "show his invoices" → whose? | The AI keeps the thread of the discussion | | **Limited tools** | Fixed commands /docker, /n8n | Natural language + dynamic tool calls | | **No search** | No web access | Gemini grounding + lead intelligence | ### Architecture choices Why a 2-pass LLM flow rather than a single call? | Approach | Advantage | Drawback | |----------|-----------|----------| | **1 pass** | Faster | The AI can only talk, not act | | **2 passes** | Tool detection → execution → summary | 2x LLM latency on tool calls | The 2nd pass is essential: it lets Claude reformulate technical results (Docker JSON, Odoo XML-RPC) into a readable human reply. > **Tip - Conversation templates** > > Five templates are available to specialise the behaviour: `crm` (sales focus), `docker` (infra focus), `code` (dev focus), `content` (content focus), `brainstorm` (creative mode). Each template tweaks the system prompt and the default LLM model. --- ## 3. How? — Technical implementation ### Lifecycle management Seven commands manage conversations: | Command | Action | |---------|--------| | `/new` | Creates a conversation (archives the previous one if active) | | `/conv` | Lists conversations with pagination | | `/endconv` | Archives the active conversation | | `/model` | Changes the LLM model (Codex Max/Standard/Mini, Gemini Pro/Flash) | | `/plan` | Shows the status of the current plan | | `/templates` | Lists the 5 available templates | | `/mcp` | Configures MCP tools for this conversation | Two Data Tables manage the state: - **conversations** — Title, model, template, status, turn counter, MCP config, associated plan - **active_conversations** — One row per user, points to the current conversation When a text message arrives and a conversation is active, the Orchestrator short-circuits the normal routing (AI Router, Command Router) and sends the message directly to the Conversation Agent. ### A message's journey When you write "what are the active projects in Odoo?" in an active conversation: **1. Interception** — The Orchestrator detects the active conversation and routes to the Conversation Agent. **2. Retry check** — The system checks whether a plan is awaiting correction (escalation). If so, the message is treated as a correction, not as a new message. **3. Prompt build** — The system prompt is built according to the active template, with descriptions of available tools and MCP instructions if active. **4. First Claude call** — The LLM receives the message with the full session context (via Redis). It can answer in plain text, or generate a tool block: | Block | Routing | |-------|---------| | ` ```tool ` | Docker, Odoo, N8N, web search call | | ` ```plan ` | Multi-step plan creation | | ` ```mcp ` | Direct N8N MCP tool call | **5. Execution** — Depending on the detected type, the system routes to the Tool Router, the Plan Engine, or the MCP gateway. **6. Second Claude call** — The raw results are reinjected into the context, and Claude generates a readable response. A restrictive prompt prevents new tool blocks at this stage. **7. Send** — The reply is sent to Telegram (truncated to 4000 characters if needed). **8. Compression** — If the turn counter exceeds 40, the Conversation Summarizer automatically compresses history: old messages are summarised in one paragraph, and the last 10 messages are kept intact. ### The Tool Router When Claude generates a `tool` block, the Conversation Tool Router dispatches to the right service: | Service | Handler | Sample actions | |---------|---------|----------------| | `docker` | Service Handler Docker | status, restart, logs, update | | `odoo` | Service Handler Odoo | search_contact, search_invoice, list_projects | | `n8n` | Service Handler N8N | list_workflows, list_executions, toggle | | `web_search` | Conversation Web Search | Gemini search with sources | | `lead_intelligence` | Conversation Lead Intelligence | Contact enrichment (Odoo + web) | The response is normalised with multi-signal failure detection (`error` field, `success === false`, HTTP status >= 400, empty text). ### The Plan Engine The Plan Engine handles requests requiring multiple coordinated steps. When Claude detects that a task is too complex for a single tool call, it generates a plan: ```text User: "Check the status of all Docker stacks, and restart the ones that are down" Claude generates a plan: Step 1: docker status (all stacks) Step 2: analyse results → identify the down ones Step 3: docker restart (down stacks) The plan is presented with buttons: [Execute] [Modify] [Cancel] ``` Each step can reference earlier results through variables (`{{step_0_result}}`). If a step fails, the system escalates with three options: | Action | Behaviour | |--------|-----------| | **[Skip]** | Skip the step + transitive cascade (cancels dependencies) | | **[Modify]** | Asks for a textual correction, then resumes | | **[Cancel]** | Cancels all remaining steps | > **Caution - Contradiction detection** > > If a step fails and later steps depend on its result, the Plan Engine detects the contradiction automatically and escalates before executing code with missing data. ### The MCP system The `/mcp` command opens a menu allowing you to enable or disable MCP (Model Context Protocol) tools for the current conversation. Twenty N8N tools are available, split into two categories: | Category | Count | Behaviour | |----------|-------|-----------| | **Read-only** | 12 | Direct execution (list_workflows, get_workflow, search_nodes...) | | **Write** | 8 | Mandatory Telegram confirmation before execution | The menu shows each server with its ON/OFF status and the count of active tools. Tools can be enabled/disabled at the granularity of an individual tool or in bulk per server. Four security layers protect operations: 1. **User whitelist** — Only tools explicitly enabled in the MCP config are accessible 2. **mcp_enabled flag** — When disabled, the CLI runs without any MCP server 3. **allowed_tools filter** — The cli-ollama gateway blocks unauthorised calls (403) 4. **Telegram confirmation** — Write tools trigger a confirmation workflow with [Approve] / [Reject] buttons and a 90-second timeout > **Danger - Confirmation timeout** > > If nobody answers MCP confirmation buttons within 90 seconds, the operation is automatically rejected. This is a safety net against unmonitored MCP calls. ### Memory and compression Conversation memory is managed by Redis through the `memory_service` of CLI Ollama. Each conversation has a unique `session_id` that serves as the Redis key. When the turn count exceeds 40 (~20 exchanges), the Conversation Summarizer kicks in: 1. Fetches all session messages via the REST API 2. Separates the last 10 messages (to keep intact) 3. Summarises the older messages into one paragraph (Claude, 200 words max) 4. Empties the Redis session and reinjects the summary + recent messages The user sees nothing — the compression is transparent and allows conversations that last hours without quality degradation. --- ## 4. What if? — Outlook and limits ### Current limits | Limit | Impact | Mitigation | |-------|--------|------------| | **2-pass latency** | 5-10s per message with tool | Acceptable for personal usage | | **No streaming** | Reply arrives in one block | "typing..." indicator during processing | | **Linear plans** | No parallel steps | Sufficient for current use cases | | **Compression at 40 turns** | Loss of older details | Faithful summary, last 10 messages intact | ### Evolution scenarios **If multi-user conversations are needed**: - Shared conversations with permissions - Action history visible to the team - Plan assignment to specific members **If more tools are needed**: - Add a service to the Tool Router (a new case in the Switch) - Or enable an additional MCP server - The system is extensible by design **If more complex plans are needed**: - Add support for parallel steps (concurrent execution) - Allow nested sub-plans - Integrate a per-step approval system for critical actions --- ## Related pages ### Infrastructure - [AI Stack](/en/infrastructure/ai-stack/) — CLI Ollama, Redis, session memory - [N8N in queue mode](/en/infrastructure/n8n-queue-mode/) — Backend workers ### Workflows - [Telegram Orchestrator](/en/workflows/telegram-orchestrator/) — Message and callback routing - [Voice Transcription](/en/workflows/voice-transcription/) — Voice messages in conversations - [Global Error Handler](/en/workflows/error-handler/) — Workflow error handling ### Reference - [Glossary](/en/reference/glossary/) — MCP, Session, Multi-turn ## Metadonnees agent - Cet article est issu du blog GuiGPaP Lab. - Contexte global du blog: https://blog.guigpap.com/llms.txt - Contact auteur: https://odoo.guigpap.com/mon-cv - Licence: CC-BY-SA 4.0