---
title: Question Hub & Vision
url: https://blog.guigpap.com/en/workflows/question-hub/
url_md: https://blog.guigpap.com/en/workflows/question-hub.md
category: tooling
date: '2026-03-28'
maturite: production
techno:
  - n8n
  - telegram
  - claude
application:
  - ai
  - knowledge
  - operations
---

# Question Hub & Vision

> Interactive Telegram interface for Claude Code questions and smart extraction from photographed documents

## 1. What? — Definition and context

When Claude Code (or Codex, or Gemini) needs to ask a multiple-choice question — "Which framework should we use? React, Vue, or Svelte?" — it cannot do so ergonomically inside a terminal. The **Question Hub** displays that question in Telegram with interactive buttons, supports multi-select and pagination, and returns the answer to the CLI.

In addition, the **Vision OCR** system analyses photos sent on Telegram (business card, invoice, screenshot, handwritten note) and routes each document type through its own specialised extraction.

> **Note - Interactive questions**
>
> When an AI CLI asks a question, it calls an n8n webhook with the options. n8n renders a Telegram message with inline buttons. The user clicks, and the answer is sent back to the CLI, which resumes its work.

### Two complementary systems

| System | Nodes | Trigger | Role |
|--------|-------|---------|------|
| **Question Hub** | ~35 (parent + callback) | CLI webhook | Interactive Telegram questions |
| **Vision OCR** | 14 | Sub-workflow (photo) | Document extraction |

---

## 2. Why? — Stakes and motivations

### Problems solved

| Problem | Without these workflows | With these workflows |
|---------|-------------------------|----------------------|
| **Unreadable CLI questions** | Numbered list inside the terminal | Telegram buttons with emojis |
| **No multi-select** | Type the numbers one by one | ✅ toggle and confirmation |
| **Useless photos** | Photo = binary file, no info | Type-specific structured extraction |
| **Generic OCR** | Same processing for everything | Invoice ≠ business card ≠ screenshot |

### Six recognised document types

The Vision OCR workflow classifies each photo before extracting its content:

| Type | Extracted fields |
|------|------------------|
| **business_card** | Name, job title, company, email, phone |
| **invoice** | Vendor, number, lines, total, date |
| **screenshot** | Visible text, identified UI |
| **handwritten_note** | Transcription, confidence, language |
| **general_document** | Structured raw text |
| **not_document** | None (not a document: ordinary photo, landscape, etc.) |

> **Tip - Business card → Odoo**
>
> Extracted business cards can be sent straight into Odoo as a contact through the contact ingestion workflow.

---

## 3. How? — Technical implementation

### Question Hub: a question's journey

**1. Reception** — The CLI sends a webhook with the options, the question type (single/multi-select), and a timeout (300s by default).
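
As an illustration of the reception step, here is a minimal sketch of what the webhook body could look like. The field names (`question`, `options`, `multi_select`, `timeout_s`) are assumptions made for this example, not the workflow's actual schema; only the 300 s default comes from the description above.

```python
import json
from dataclasses import dataclass, asdict

DEFAULT_TIMEOUT_S = 300  # default timeout stated in the article


@dataclass
class QuestionPayload:
    """Hypothetical shape of the body a CLI posts to the n8n webhook."""

    question: str
    options: list[str]
    multi_select: bool = False          # single choice unless stated otherwise
    timeout_s: int = DEFAULT_TIMEOUT_S  # seconds before the question expires

    def to_json(self) -> str:
        # A question without options cannot be rendered as buttons.
        if not self.options:
            raise ValueError("a question needs at least one option")
        return json.dumps(asdict(self), ensure_ascii=False)


payload = QuestionPayload(
    question="Which framework should we use?",
    options=["React", "Vue", "Svelte"],
)
body = payload.to_json()
```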

**2. Formatting** — n8n builds an inline keyboard sized to the option count. Beyond 4 options, pagination kicks in automatically (4 options per page with ◀️ ▶️ arrows).
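
The pagination rule can be sketched as a small function that returns the rows of a Telegram `inline_keyboard`. The `opt:` and `page:` prefixes in `callback_data` and the one-button-per-row layout are assumptions for illustration; the real workflow may encode its callbacks differently.

```python
PAGE_SIZE = 4  # options per page, as described above


def build_keyboard(options: list[str], page: int = 0) -> list[list[dict]]:
    """Return Telegram `inline_keyboard` rows for one page of options."""
    page_count = max(1, -(-len(options) // PAGE_SIZE))  # ceiling division
    page = max(0, min(page, page_count - 1))            # clamp out-of-range pages
    start = page * PAGE_SIZE

    # One button per row; callback_data carries the option's global index.
    rows = [
        [{"text": label, "callback_data": f"opt:{start + offset}"}]
        for offset, label in enumerate(options[start:start + PAGE_SIZE])
    ]

    # Navigation arrows only appear when a previous or next page exists.
    nav = []
    if page > 0:
        nav.append({"text": "◀️", "callback_data": f"page:{page - 1}"})
    if page < page_count - 1:
        nav.append({"text": "▶️", "callback_data": f"page:{page + 1}"})
    if nav:
        rows.append(nav)
    return rows
```

With six options, page 0 holds four option rows plus a ▶️ row, and page 1 holds the two remaining options plus a ◀️ row. With four options or fewer, no navigation row is added.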

**3. Interaction** — The user clicks options. In multi-select, every click toggles the ✅ and updates the keyboard in real time (via `editMessageReplyMarkup`). Selections are persisted in a Data Table to survive page changes.
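
A rough sketch of the toggle logic follows. The in-memory `selections` dict stands in for the n8n Data Table, and `question_id` is a hypothetical key; pagination is left out to keep the example short. The returned dict uses the `chat_id`, `message_id`, and `reply_markup` parameters of Telegram's `editMessageReplyMarkup` method.

```python
# In-memory stand-in for the n8n Data Table: question id -> selected indices.
selections: dict[str, set[int]] = {}


def toggle(question_id: str, index: int) -> set[int]:
    """Flip one option on or off and return the current selection."""
    chosen = selections.setdefault(question_id, set())
    if index in chosen:
        chosen.remove(index)
    else:
        chosen.add(index)
    return chosen


def edit_markup_payload(chat_id: int, message_id: int,
                        question_id: str, options: list[str]) -> dict:
    """Build the body for editMessageReplyMarkup with ✅ on selected options."""
    chosen = selections.get(question_id, set())
    rows = [
        [{"text": ("✅ " if i in chosen else "") + label,
          "callback_data": f"opt:{i}"}]
        for i, label in enumerate(options)
    ]
    return {
        "chat_id": chat_id,
        "message_id": message_id,
        "reply_markup": {"inline_keyboard": rows},
    }
```

Because the selection lives outside the keyboard itself, redrawing any page only requires reading the stored indices again.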

**4. Confirmation** — A [Confirm] button validates the choices. The answer is returned to the CLI through the callback.

**5. Free text** — An optional [Other] button enables Telegram's `ForceReply` mode to capture a free-form answer.
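
For the free-text path, a possible request body is sketched below. Telegram documents `ForceReply` as a reply markup attached to a sent message, so this example assumes the [Other] button triggers a `sendMessage` call; how the actual workflow wires this is not stated here. The prompt wording and placeholder are invented for the example.

```python
def other_prompt(chat_id: int, question: str) -> dict:
    """Build a sendMessage body that asks the user for a free-form answer."""
    return {
        "chat_id": chat_id,
        "text": f"{question}\nReply to this message with your answer.",
        "reply_markup": {
            "force_reply": True,  # opens the reply interface on the client
            "input_field_placeholder": "Type your answer",  # optional hint
        },
    }
```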

Everything happens inside a single Telegram message that is edited in place, so the chat is not flooded with one new message per interaction.

### Vision OCR: a photo's journey

**1. Classification** — Gemini Flash analyses the base64 image and returns a document type with a confidence score.
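
The classifier's reply can be validated before any extraction runs. This sketch assumes the model is asked to answer with JSON containing `docType` and a `confidence` between 0 and 1; both the key names and the range are assumptions, and no Gemini call is made here.

```python
import json
from dataclasses import dataclass

# The six types listed in the article.
DOC_TYPES = {
    "business_card", "invoice", "screenshot",
    "handwritten_note", "general_document", "not_document",
}


@dataclass(frozen=True)
class Classification:
    doc_type: str
    confidence: float


def parse_classification(raw: str) -> Classification:
    """Parse and validate the classifier's JSON reply (hypothetical shape)."""
    data = json.loads(raw)
    doc_type = data["docType"]
    confidence = float(data["confidence"])
    if doc_type not in DOC_TYPES:
        raise ValueError(f"unknown document type: {doc_type!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Classification(doc_type, confidence)
```

Rejecting unknown types early keeps a malformed model reply from reaching the extraction branches.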

**2. Specialised extraction** — Depending on the detected type, a specific prompt is sent to Gemini Vision. Each branch extracts different fields:

```mermaid
flowchart TD
  Photo["Photo received"]
  Class["Gemini Flash · Classification"]
  BC["business_card → name, email, phone, company"]
  INV["invoice → vendor, number, lines, total"]
  SC["screenshot → text, UI"]
  HW["handwritten_note → transcription, confidence"]
  Gen["general_document → structured text"]
  NotDoc["not_document → ignore"]

  Photo --> Class --> BC
  Class --> INV
  Class --> SC
  Class --> HW
  Class --> Gen
  Class --> NotDoc
```
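
The branching in the diagram can be expressed as a lookup table from document type to the fields requested. The field identifiers and the prompt wording below are illustrative assumptions derived from the table of extracted fields, not the prompts used in production.

```python
from typing import Optional

# Fields requested per document type (names are illustrative).
EXTRACTED_FIELDS = {
    "business_card": ["name", "job_title", "company", "email", "phone"],
    "invoice": ["vendor", "number", "lines", "total", "date"],
    "screenshot": ["visible_text", "identified_ui"],
    "handwritten_note": ["transcription", "confidence", "language"],
    "general_document": ["structured_text"],
}


def extraction_prompt(doc_type: str) -> Optional[str]:
    """Return the prompt for a type, or None when the photo is ignored."""
    if doc_type == "not_document":
        return None  # nothing to extract from a plain photo
    fields = EXTRACTED_FIELDS[doc_type]  # raises KeyError on an unknown type
    return "Extract the following fields as JSON: " + ", ".join(fields)
```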

**3. Normalisation** — The reply is formatted as sanitised HTML for Telegram with a uniform contract: `{status, docType, extracted, text}`.
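
A minimal sketch of the normalisation step is shown here. The four keys come from the contract above; the `status` values (`ok`, `ignored`) and the one-line-per-field layout are assumptions. Telegram's HTML parse mode requires `<`, `>`, and `&` to be escaped, which `html.escape` handles.

```python
import html


def normalise(doc_type: str, extracted: dict) -> dict:
    """Wrap an extraction result in the {status, docType, extracted, text} contract."""
    if doc_type == "not_document":
        return {"status": "ignored", "docType": doc_type,
                "extracted": {}, "text": ""}

    # Escape keys and values so user content cannot break Telegram's HTML parser.
    lines = [
        f"<b>{html.escape(str(key), quote=False)}</b>: "
        f"{html.escape(str(value), quote=False)}"
        for key, value in extracted.items()
    ]
    return {"status": "ok", "docType": doc_type,
            "extracted": extracted, "text": "\n".join(lines)}
```

Because every branch returns the same four keys, the calling workflow can send `text` to Telegram without knowing which document type was detected.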

### CLI Ollama profiles

The profile system lets you configure different AI personas with specific knowledge and tools. Each profile is a YAML file in `/workspace/profiles/` that defines:

- A specialised system prompt
- A knowledge base (Markdown files injected into the context)
- A list of allowed MCP tools (ceiling semantics: the list is an upper bound on what the profile may call)
- Tools requiring Telegram approval

Two profiles are deployed: `error-analyst` (DLQ analysis, 5 read tools) and `n8n-admin` (workflow administration, 5 read + 2 write with confirmation).
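
The ceiling and approval rules can be sketched as a three-way decision. The `Profile` class, the decision labels, and the tool names (`list_workflows`, `update_workflow`, `delete_workflow`) are hypothetical and only illustrate the idea; the real profiles are YAML files whose exact keys are not shown in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    allowed_tools: frozenset[str]                 # ceiling: nothing outside this set
    approval_tools: frozenset[str] = frozenset()  # subset gated by Telegram approval

    def decide(self, tool: str) -> str:
        """Return 'deny', 'ask_telegram', or 'allow' for a requested tool."""
        if tool not in self.allowed_tools:
            return "deny"
        if tool in self.approval_tools:
            return "ask_telegram"
        return "allow"


admin = Profile(
    name="n8n-admin",
    allowed_tools=frozenset({"list_workflows", "update_workflow"}),
    approval_tools=frozenset({"update_workflow"}),
)
```

In this sketch a read tool runs immediately, a write tool waits for a Telegram confirmation, and anything outside the list is refused regardless of what the model requests.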

---

## 4. What if? — Perspectives and limits

### Current limits

| Limit | Impact | Mitigation |
|-------|--------|------------|
| **5-min timeout** | Question expires without an answer | Sufficient for interactive use |
| **4 options/page** | Lots of pages with 20+ options | Pagination preserving selections |
| **OCR depends on Gemini** | No local fallback | Fast and reliable in practice |

### Evolution scenarios

**If more accurate OCR is needed**:
- Add specialised models (Tesseract for standard fonts)
- Post-process invoices with total validation
- Integrate directly with Odoo accounting
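
As one possible form of total validation, the extracted invoice lines can be summed and compared with the extracted total. The line keys (`quantity`, `unit_price`), the tolerance, and the assumption that the total and the lines are on the same tax basis are all hypothetical.

```python
from decimal import Decimal


def total_matches(lines: list[dict], total: str, tolerance: str = "0.01") -> bool:
    """Check that quantity * unit_price summed over lines equals the total."""
    computed = sum(
        (Decimal(line["quantity"]) * Decimal(line["unit_price"]) for line in lines),
        Decimal("0"),
    )
    return abs(computed - Decimal(total)) <= Decimal(tolerance)


invoice_lines = [
    {"quantity": "2", "unit_price": "19.99"},
    {"quantity": "1", "unit_price": "5.00"},
]
```

`Decimal` is used instead of `float` so that amounts such as 19.99 are compared exactly rather than through binary rounding.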

**If multi-user support is needed**:
- Map CLI sessions to Telegram users
- Queue questions that arrive at the same time

---

## Related pages

### Workflows
- [Telegram Orchestrator](/en/workflows/telegram-orchestrator/) — Photo and callback routing
- [Voice transcription](/en/workflows/voice-transcription/) — Another input modality

### Infrastructure
- [AI Stack](/en/infrastructure/ai-stack/) — CLI Ollama and Gemini Vision

## Agent metadata

- This article comes from the GuiGPaP Lab blog.
- Global blog context: https://blog.guigpap.com/llms.txt
- Author contact: https://odoo.guigpap.com/mon-cv
- License: CC-BY-SA 4.0