YouTube Digest

1. What? — Definition and context

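The YouTube Digest workflow is an end-to-end pipeline that fetches YouTube videos, transcribes them and summarises them. It accepts three input sources (scheduled playlists, URLs sent via Telegram, and the Watch Later playlist), sends a TLDR summary to Telegram, and offers on-demand enrichment through Telegram buttons.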

Features

Feature	Description
3 input sources	Schedule (playlists), Telegram (URL), Watch Later
Smart transcription	Prefers YouTube captions (free), Whisper fallback
TLDR summary	Key points extracted via Claude
Vector storage	Embeddings in Qdrant for semantic search
On-demand enrichment	Telegram buttons to dig deeper

Input sources

Source	Trigger	Usage
Schedule	Configured cron	Tech/news playlists
Telegram URL	Message to the bot	One-off video
Watch Later	YouTube playlist	Saved videos

2. Why? — Stakes and motivations

Problems solved

Problem	Without digest	With digest
Watch time	Watch the whole video	TLDR in 30 seconds
No search	Impossible to find a passage	Semantic search via Qdrant
Manual notes	Take notes while watching	Key points auto-extracted
Video backlog	Watch Later piles up	Automatic processing

Why prioritise YouTube captions?

Method	Cost	Speed	Quality
YouTube captions	Free	Instant	Variable (auto-generated)
Whisper API	~$0.36/h	~10-30s	Excellent
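
The caption-first policy in the table above can be sketched as follows. This is a minimal illustration, not the workflow's actual nodes: `fetch_captions` and `run_whisper` are hypothetical callables standing in for the YouTube caption lookup and the Whisper API call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transcript:
    text: str
    source: str  # "captions" or "whisper"


def transcribe(
    video_id: str,
    fetch_captions: Callable[[str], Optional[str]],  # hypothetical: returns None if no track
    run_whisper: Callable[[str], str],               # hypothetical: paid fallback
) -> Transcript:
    """Try the free caption track first; pay for Whisper only when it is missing."""
    captions = fetch_captions(video_id)
    if captions and captions.strip():
        return Transcript(text=captions, source="captions")
    return Transcript(text=run_whisper(video_id), source="whisper")


# Stand-in callables; real ones would call YouTube and the Whisper API.
with_captions = transcribe("abc", lambda _id: "hello world", lambda _id: "from audio")
without_captions = transcribe("abc", lambda _id: None, lambda _id: "from audio")
print(with_captions.source, without_captions.source)
```

Recording the `source` alongside the text makes it possible to tell later which digests were built from auto-generated captions of variable quality.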

3. How? — Technical implementation

Architecture

Smart transcription

Long-file handling:

Duration	Strategy
< 25 min	Send the audio to Whisper in a single request
≥ 25 min	Split the audio into chunks that stay under the ~25 MB upload limit
> 2 h	Skip, or ask for confirmation first
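
The duration thresholds above can be expressed as a small planning function. This is a sketch under assumptions: the 20-minute chunk length is illustrative (the source only says chunks must respect the ~25 MB limit), and the function and key names are hypothetical.

```python
import math

DIRECT_LIMIT_S = 25 * 60        # below this, one Whisper request is enough
CONFIRM_LIMIT_S = 2 * 60 * 60   # above this, skip or ask the user first


def plan_transcription(duration_s: int, chunk_s: int = 20 * 60) -> dict:
    """Map a video duration to one of the three strategies in the table."""
    if duration_s > CONFIRM_LIMIT_S:
        return {"strategy": "confirm", "chunks": 0}
    if duration_s < DIRECT_LIMIT_S:
        return {"strategy": "direct", "chunks": 1}
    # chunk_s is an assumed value; tune it so each chunk stays under ~25 MB.
    return {"strategy": "chunked", "chunks": math.ceil(duration_s / chunk_s)}


print(plan_transcription(10 * 60))
print(plan_transcription(60 * 60))
print(plan_transcription(3 * 60 * 60))
```

Checking the 2-hour limit first keeps very long videos from being chunked before anyone has confirmed they are worth the cost.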

TLDR summary (Claude)

Prompt:

// Request body built in JavaScript; `transcript` holds the text from the transcription step.
const request = {
  "model": "claude-haiku-yolo",
  "prompt": `Analyse this YouTube transcript and generate:
1. TLDR (2-3 sentences max)
2. Key points (5 max, bullet points)
3. Topics (3-5 tags)
4. Sentiment (educational/entertaining/promotional/technical)
5. Language (ISO code)

Transcript:
${transcript}`
};

Output:

{
  "tldr": "2-3 sentence concise summary",
  "key_points": ["point1", "point2", "point3"],
  "topics": ["tech", "ai", "tutorial"],
  "sentiment": "educational",
  "language": "en"
}
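
Because the model returns free text, it can help to validate the output before storing it. The sketch below assumes the response body is exactly the JSON object shown above; a reply wrapped in extra prose would need stripping first. The `Digest` class and `parse_digest` function are hypothetical names.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = {"tldr", "key_points", "topics", "sentiment", "language"}
ALLOWED_SENTIMENTS = {"educational", "entertaining", "promotional", "technical"}


@dataclass
class Digest:
    tldr: str
    key_points: list
    topics: list
    sentiment: str
    language: str


def parse_digest(raw: str) -> Digest:
    """Parse and validate the summary JSON, enforcing the limits from the prompt."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {data['sentiment']!r}")
    return Digest(
        tldr=data["tldr"].strip(),
        key_points=list(data["key_points"])[:5],  # prompt asks for 5 max
        topics=list(data["topics"])[:5],          # prompt asks for 3-5 tags
        sentiment=data["sentiment"],
        language=data["language"],
    )


sample = (
    '{"tldr": "Short summary.", "key_points": ["a", "b"], '
    '"topics": ["tech", "ai", "tutorial"], "sentiment": "educational", "language": "en"}'
)
digest = parse_digest(sample)
print(digest.topics)
```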

Telegram notification

📺 NEW DIGEST

**How AI is Changing Software Development**
Channel: Fireship • 12:34

**TLDR:**
AI tools like Copilot and Cursor are transforming dev.
Focus on productivity gains and risks.

**Key points:**
• GitHub Copilot boosts velocity by 40%
• Risk: junior developers less autonomous
• Cursor = promising AI-first IDE
• Trend: "vibe coding" vs "intentional coding"

Tags: #ai #development #tools

[📜 Full transcript] [🧠 Atomic notes] [📎 Link to Odoo] [❓ Q&A]
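
The message above can be assembled from the digest fields with plain string formatting. This is a sketch only: the function names are hypothetical, and the `**bold**` markup simply mirrors the example, since the source does not say which Telegram parse mode is used.

```python
def format_duration(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS for videos of an hour or more."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def render_digest(title, channel, duration_s, tldr, key_points, topics) -> str:
    """Build the notification text in the layout shown above."""
    lines = [
        "📺 NEW DIGEST",
        "",
        f"**{title}**",
        f"Channel: {channel} • {format_duration(duration_s)}",
        "",
        "**TLDR:**",
        tldr,
        "",
        "**Key points:**",
        *[f"• {point}" for point in key_points],
        "",
        "Tags: " + " ".join(f"#{topic}" for topic in topics),
    ]
    return "\n".join(lines)


message = render_digest(
    "How AI is Changing Software Development",
    "Fireship",
    754,
    "AI tools like Copilot and Cursor are transforming dev.",
    ["GitHub Copilot boosts velocity by 40%", "Cursor = promising AI-first IDE"],
    ["ai", "development", "tools"],
)
print(message)
```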

On-demand enrichment

Button	Action
📜 Full transcript	Full structured transcription with chapters
🧠 Atomic notes	Zettelkasten notes (one per concept)
📎 Link to Odoo	Link to a CRM contact/project
❓ Q&A	Generate flashcard-style Q&A
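
One way to wire these buttons is a Telegram inline keyboard whose `callback_data` carries the action and the video ID. The sketch below builds the `reply_markup` object for the Bot API `sendMessage` call; the action codes (`transcript`, `notes`, `odoo`, `qa`) and the `action:video_id` format are assumptions, not taken from the workflow.

```python
# (label, action code) pairs; the action codes are illustrative.
BUTTONS = [
    ("📜 Full transcript", "transcript"),
    ("🧠 Atomic notes", "notes"),
    ("📎 Link to Odoo", "odoo"),
    ("❓ Q&A", "qa"),
]


def build_keyboard(video_id: str) -> dict:
    """Return an InlineKeyboardMarkup dict with two buttons per row."""
    buttons = []
    for label, action in BUTTONS:
        data = f"{action}:{video_id}"
        # Telegram limits callback_data to 64 bytes.
        if len(data.encode("utf-8")) > 64:
            raise ValueError(f"callback_data too long: {data!r}")
        buttons.append({"text": label, "callback_data": data})
    return {"inline_keyboard": [buttons[:2], buttons[2:]]}


def parse_callback(data: str) -> tuple:
    """Split 'action:video_id' back into its two parts."""
    action, _, video_id = data.partition(":")
    return action, video_id


keyboard = build_keyboard("dQw4w9WgXcQ")
print(keyboard["inline_keyboard"][0][0])
print(parse_callback("qa:dQw4w9WgXcQ"))
```

Keeping the video ID in the callback data means the enrichment step can look up the stored digest without any per-chat state.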

Qdrant storage

{
  "collection": "youtube_digests",
  "vector": embedding,
  "payload": {
    "video_id": "dQw4w9WgXcQ",
    "title": "...",
    "tldr": "...",
    "key_points": [...],
    "topics": [...]
  }
}
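
The structure above is schematic. For Qdrant's REST upsert endpoint (`PUT /collections/youtube_digests/points`) the point needs an ID that is an unsigned integer or a UUID, so a YouTube video ID cannot be used directly. The sketch below assumes one reasonable choice, a deterministic UUID derived from the video URL; the helper names are hypothetical.

```python
import uuid


def point_id_for(video_id: str) -> str:
    """Derive a stable UUID so reprocessing a video overwrites its point."""
    url = f"https://www.youtube.com/watch?v={video_id}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, url))


def build_upsert_body(video_id, embedding, title, tldr, key_points, topics) -> dict:
    """Build the JSON body for a single-point upsert."""
    return {
        "points": [
            {
                "id": point_id_for(video_id),
                "vector": list(embedding),
                "payload": {
                    "video_id": video_id,
                    "title": title,
                    "tldr": tldr,
                    "key_points": list(key_points),
                    "topics": list(topics),
                },
            }
        ]
    }


body = build_upsert_body(
    "dQw4w9WgXcQ", [0.1, 0.2, 0.3], "Example title", "Example TLDR", ["p1"], ["tech"]
)
print(body["points"][0]["id"])
```

The resulting ID is what would go into the `qdrant_id` column of the data table.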

N8N Data Table

Column	Type	Description
video_id	string	YouTube video ID
title	string	Video title
channel	string	Channel name
duration	number	Duration in seconds
processed_at	datetime	Processing date
tldr	text	Short summary
key_points	json	Key points list
topics	json	Extracted tags
qdrant_id	string	Qdrant vector ID
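
A row matching these columns could be assembled as in the sketch below. It assumes the `json` columns accept serialised strings and that `processed_at` is stored as a UTC ISO 8601 timestamp; neither detail is stated in the source, and `build_row` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def build_row(video_id, title, channel, duration_s, digest: dict, qdrant_id: str) -> dict:
    """Map a processed video onto the data table columns."""
    return {
        "video_id": video_id,
        "title": title,
        "channel": channel,
        "duration": int(duration_s),  # seconds
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "tldr": digest["tldr"],
        # Assumption: json columns are written as serialised strings.
        "key_points": json.dumps(digest["key_points"], ensure_ascii=False),
        "topics": json.dumps(digest["topics"], ensure_ascii=False),
        "qdrant_id": qdrant_id,
    }


row = build_row(
    "dQw4w9WgXcQ",
    "Example title",
    "Example channel",
    754,
    {"tldr": "Short summary.", "key_points": ["a", "b"], "topics": ["tech"]},
    "00000000-0000-0000-0000-000000000000",
)
print(sorted(row))
```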

4. What if? — Outlook and limits

Cost estimation

Operation	10-min video	60-min video
YouTube captions	Free	Free
Whisper (if needed)	~$0.06	~$0.36
Claude TLDR	~$0.01	~$0.05
Embedding	~$0.0001	~$0.001
Total	~$0.01-0.07	~$0.05-0.41
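
The totals in the table follow from simple arithmetic, sketched below. The Whisper rate is the ~$0.36/h figure quoted earlier; the Claude and embedding amounts are passed in as the per-video estimates from the table, because the source does not give per-token prices.

```python
WHISPER_USD_PER_HOUR = 0.36  # rate quoted in the transcription comparison


def whisper_cost(duration_s: float) -> float:
    """Whisper cost scales linearly with audio duration."""
    return duration_s / 3600 * WHISPER_USD_PER_HOUR


def total_cost(duration_s: float, claude_usd: float, embedding_usd: float,
               needs_whisper: bool) -> float:
    """Sum the per-video costs; Whisper is only paid when captions are missing."""
    cost = claude_usd + embedding_usd
    if needs_whisper:
        cost += whisper_cost(duration_s)
    return round(cost, 4)


# 10-minute video, using the table's estimates.
print(total_cost(600, 0.01, 0.0001, needs_whisper=False))
print(total_cost(600, 0.01, 0.0001, needs_whisper=True))
# 60-minute video.
print(total_cost(3600, 0.05, 0.001, needs_whisper=False))
print(total_cost(3600, 0.05, 0.001, needs_whisper=True))
```

The lower bound of each range corresponds to a video with captions, the upper bound to one that needs the Whisper fallback.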

Current limits

Limit	Impact	Mitigation
Videos > 2h	Possible timeout	Manual confirmation
Missing captions	Whisper cost	Acceptable for important videos
Multi-language	Summary in source language	Optional translation

Evolution scenarios

If multi-language is needed:

Auto-detect language
Optional summary translation
Multilingual embeddings

If video volume grows:

Filter by relevance score
Grouped daily digest
Priority queue

If advanced search is needed:

Web interface for Qdrant search
Filters by topic/channel
Similar video suggestions

Useful commands

# Test video_id extraction (requires GNU grep; -P is not available in BSD/macOS grep)
echo "https://www.youtube.com/watch?v=dQw4w9WgXcQ" | \
  grep -oP '(?<=v=)[^&]+'

# Check available captions
yt-dlp --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"

# Download audio for testing
yt-dlp -x -f bestaudio "https://www.youtube.com/watch?v=VIDEO_ID"

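# Semantic search in Qdrant (replace [...] with the query embedding;
# with_payload returns the stored title, TLDR and topics with each hit)
curl -X POST "http://qdrant:6333/collections/youtube_digests/points/search" \
  -H "Content-Type: application/json" \
  -d '{"vector": [...], "limit": 5, "with_payload": true}'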
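
The `grep` one-liner above only handles `watch?v=` URLs. For links pasted into Telegram, a more tolerant extractor may be useful. The sketch below uses only the Python standard library; the set of hosts and path prefixes it accepts is an assumption about which link shapes matter, not a complete list.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "music.youtube.com"}
PATH_PREFIXES = {"shorts", "embed", "live"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in PATH_PREFIXES:
            return parts[1]
    return None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&list=PL123"))
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ?t=10"))
print(extract_video_id("https://example.com/watch?v=dQw4w9WgXcQ"))
```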

Troubleshooting

Problem	Check
No captions	The video has no subtitle track; the Whisper fallback should take over
Whisper timeout	Is the file too long? Increase the timeout or split the audio into chunks
Claude error	Quota exhausted? Check claude-ollama usage
Qdrant 404	Does the collection exist? Create it if needed
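
The "create if needed" check for the Qdrant 404 case can be sketched as below. The HTTP layer is injected as a hypothetical `transport` callable so the logic stays testable. The vector size and the `Cosine` distance are assumptions: the size must match whatever embedding model the workflow uses, which the source does not name.

```python
from typing import Callable, Optional, Tuple

# transport(method, path, body) -> (status_code, parsed_json); hypothetical interface
Transport = Callable[[str, str, Optional[dict]], Tuple[int, dict]]


def ensure_collection(transport: Transport, name: str, vector_size: int) -> bool:
    """Create the collection if it is missing. Returns True when it was created."""
    status, _ = transport("GET", f"/collections/{name}", None)
    if status == 200:
        return False
    if status != 404:
        raise RuntimeError(f"unexpected status {status} while checking {name}")
    # Assumed settings: adjust size and distance to the embedding model in use.
    body = {"vectors": {"size": vector_size, "distance": "Cosine"}}
    status, _ = transport("PUT", f"/collections/{name}", body)
    if status != 200:
        raise RuntimeError(f"could not create {name}: status {status}")
    return True


def make_fake_transport(existing: set):
    """In-memory stand-in for Qdrant, used here only to exercise the logic."""
    def transport(method, path, body):
        name = path.rsplit("/", 1)[-1]
        if method == "GET":
            return (200, {}) if name in existing else (404, {})
        existing.add(name)
        return (200, {"result": True})
    return transport


fake = make_fake_transport(set())
print(ensure_collection(fake, "youtube_digests", 1536))  # created on first call
print(ensure_collection(fake, "youtube_digests", 1536))  # already present
```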

Workflows

Telegram Orchestrator — URL submission
Notification Hub — Notification routing

Infrastructure

AI Stack — Claude Ollama + Qdrant
N8N Queue Mode — Backend workers