
YouTube Digest

The YouTube Digest workflow is a full pipeline that extracts, transcribes and summarises YouTube videos. It supports multiple input sources and generates TLDR summaries with on-demand enrichment via Telegram.

| Feature | Description |
| --- | --- |
| 3 input sources | Schedule (playlists), Telegram (URL), Watch Later |
| Smart transcription | Prefers YouTube captions (free), Whisper fallback |
| TLDR summary | Key points extracted via Claude |
| Vector storage | Embeddings in Qdrant for semantic search |
| On-demand enrichment | Telegram buttons to dig deeper |

| Source | Trigger | Usage |
| --- | --- | --- |
| Schedule | Configured cron | Tech/news playlists |
| Telegram URL | Message to the bot | One-off video |
| Watch Later | YouTube playlist | Saved videos |

| Problem | Without digest | With digest |
| --- | --- | --- |
| Watch time | Watch the whole video | TLDR in 30 seconds |
| No search | Impossible to find a passage | Semantic search via Qdrant |
| Manual notes | Take notes while watching | Key points auto-extracted |
| Video backlog | Watch Later piles up | Automatic processing |

| Method | Cost | Speed | Quality |
| --- | --- | --- | --- |
| YouTube captions | Free | Instant | Variable (auto-generated) |
| Whisper API | ~$0.36/h | ~10-30 s | Excellent |

Pipeline flow:

Input sources (Schedule · playlists, Telegram URL, Watch Later · YouTube)
→ Normalize Input · extract video_id
→ Already processed?
  • already → Skip
  • new → Get Metadata · title, duration
→ Transcription · captions or Whisper
→ TLDR via cli-ollama
→ Storage · Qdrant + Data Table
→ Telegram notification + enrichment buttons
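
The "Normalize Input" and "Already processed?" steps can be sketched as follows. This is a minimal illustration, not the workflow's actual code: the function names `extract_video_id` and `filter_new` are hypothetical, the list of accepted URL shapes is an assumption, and the 11-character ID pattern reflects how YouTube IDs look today rather than a documented guarantee.

```python
import re
from typing import Iterable, List, Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a URL or bare ID, or None if none is found."""
    text = url_or_id.strip()
    if _ID_RE.match(text):
        return text  # already a bare ID

    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "music.youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # /shorts/<id>, /embed/<id>, /live/<id>
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
                candidate = parts[1]
    return candidate if candidate and _ID_RE.match(candidate) else None


def filter_new(video_ids: Iterable[str], processed: Iterable[str]) -> List[str]:
    """Keep IDs that were not processed yet, in order, without duplicates."""
    seen = set(processed)
    fresh = []
    for vid in video_ids:
        if vid not in seen:
            seen.add(vid)
            fresh.append(vid)
    return fresh
```

Normalizing every source to a bare `video_id` first means the deduplication check only has to compare IDs, whichever input source the video came from.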

Transcription flow:

Try YouTube Captions
→ Captions found?
  • Yes → Parse SRT / VTT
  • No → Download Audio · yt-dlp → Whisper API
→ Clean Transcript
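
The "Parse SRT / VTT" and "Clean Transcript" steps could look roughly like the sketch below. It is a heuristic under stated assumptions: the names `clean_captions` and `get_transcript` are hypothetical, the `Kind:` and `Language:` header lines are assumed to appear in downloaded VTT files, and `transcribe_audio` stands in for whatever calls the Whisper API.

```python
import html
import re
from typing import Callable, Optional, Tuple

# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000" (VTT) or
# "00:00:01,000 --> 00:00:04,000" (SRT); the hours part is optional in VTT.
_TIMING = re.compile(r"^\d+:\d{2}(?::\d{2})?[.,]\d{3}\s+-->\s+")
# Inline markup such as <c>, </c> or <00:00:01.500>.
_TAG = re.compile(r"<[^>]+>")
_HEADERS = ("WEBVTT", "Kind:", "Language:")


def clean_captions(raw: str) -> str:
    """Reduce an SRT or VTT file to plain text on a single line."""
    lines = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith(_HEADERS) or _TIMING.match(line):
            continue
        if line.isdigit():
            # SRT cue number. Caveat: also drops a caption that is only digits.
            continue
        text = html.unescape(_TAG.sub("", line)).strip()
        # Auto-generated captions often repeat the previous line verbatim.
        if text and (not lines or lines[-1] != text):
            lines.append(text)
    return " ".join(lines)


def get_transcript(
    captions_raw: Optional[str],
    transcribe_audio: Callable[[], str],
) -> Tuple[str, str]:
    """Prefer captions; fall back to the Whisper callback only when needed."""
    if captions_raw:
        text = clean_captions(captions_raw)
        if text:
            return text, "captions"
    return transcribe_audio(), "whisper"
```

Passing the Whisper call in as a callback keeps the paid path lazy: it only runs when captions are missing or empty.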

Long-file handling:

| Duration | Strategy |
| --- | --- |
| < 25 min | Whisper direct |
| ≥ 25 min | Split into chunks (~25 MB limit) |
| > 2 h | Skip or confirmation |
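
As a rough sketch of the duration thresholds above: the function names and the 20-minute chunk length are assumptions chosen for illustration. Duration is only a proxy for the file-size limit, since the real size depends on the audio bitrate.

```python
from typing import List, Tuple

DIRECT_MAX_MIN = 25    # below this, send the audio to Whisper in one call
CONFIRM_MIN = 120      # above this, skip or ask for confirmation


def whisper_strategy(duration_s: float) -> str:
    """Map a video duration in seconds to 'direct', 'chunked' or 'confirm'."""
    minutes = duration_s / 60
    if minutes > CONFIRM_MIN:
        return "confirm"
    if minutes >= DIRECT_MAX_MIN:
        return "chunked"
    return "direct"


def chunk_ranges(duration_s: int, chunk_s: int = 20 * 60) -> List[Tuple[int, int]]:
    """Split [0, duration_s) into consecutive (start, end) windows in seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    return [
        (start, min(start + chunk_s, duration_s))
        for start in range(0, duration_s, chunk_s)
    ]
```

The resulting windows could be passed to an audio splitter before each chunk is transcribed and the texts are concatenated in order.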

Prompt:

{
  "model": "claude-haiku-yolo",
  "prompt": `Analyse this YouTube transcript and generate:
1. TLDR (2-3 sentences max)
2. Key points (5 max, bullet points)
3. Topics (3-5 tags)
4. Sentiment (educational/entertaining/promotional/technical)
5. Language (ISO code)
Transcript:
${transcript}`
}

Output:

{
  "tldr": "2-3 sentence concise summary",
  "key_points": ["point1", "point2", "point3"],
  "topics": ["tech", "ai", "tutorial"],
  "sentiment": "educational",
  "language": "en"
}

Telegram notification:

📺 NEW DIGEST
**How AI is Changing Software Development**
Channel: Fireship • 12:34

**TLDR:**
AI tools like Copilot and Cursor are transforming dev.
Focus on productivity gains and risks.

**Key points:**
• GitHub Copilot boosts velocity by 40%
• Risk: junior developers less autonomous
• Cursor = promising AI-first IDE
• Trend: "vibe coding" vs "intentional coding"

Tags: #ai #development #tools

[📜 Full transcript] [🧠 Atomic notes] [📎 Link to Odoo] [❓ Q&A]

| Button | Action |
| --- | --- |
| 📜 Full transcript | Full structured transcription with chapters |
| 🧠 Atomic notes | Zettelkasten notes (one per concept) |
| 📎 Link to Odoo | Link to a CRM contact/project |
| ❓ Q&A | Generate flashcard-style Q&A |
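
One way to turn the model output into the notification with its four buttons is sketched below. It builds a payload for the Telegram Bot API `sendMessage` method using `reply_markup.inline_keyboard`. The choice of HTML parse mode, the `action:video_id` callback format, and the names `parse_digest` and `build_notification` are assumptions for illustration, not the workflow's confirmed design.

```python
import html
import json

REQUIRED = ("tldr", "key_points", "topics", "sentiment", "language")

# (button label, hypothetical action name carried in callback_data)
BUTTONS = [
    ("📜 Full transcript", "transcript"),
    ("🧠 Atomic notes", "notes"),
    ("📎 Link to Odoo", "odoo"),
    ("❓ Q&A", "qa"),
]


def parse_digest(raw: str) -> dict:
    """Parse the model reply and check that the expected fields are present."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    data["key_points"] = list(data["key_points"])[:5]  # the prompt asks for 5 max
    return data


def _button(label: str, action: str, video_id: str) -> dict:
    # Telegram limits callback_data to 64 bytes, so it only carries
    # a short action name and the video ID.
    return {"text": label, "callback_data": f"{action}:{video_id}"}


def build_notification(chat_id, video_id, title, channel, duration, digest) -> dict:
    """Build a sendMessage payload with two rows of enrichment buttons."""
    def esc(value):
        return html.escape(str(value), quote=False)

    lines = [
        "📺 NEW DIGEST",
        f"<b>{esc(title)}</b>",
        f"Channel: {esc(channel)} • {esc(duration)}",
        "",
        "<b>TLDR:</b>",
        esc(digest["tldr"]),
        "",
        "<b>Key points:</b>",
        *[f"• {esc(point)}" for point in digest["key_points"]],
        "",
        "Tags: " + " ".join(f"#{esc(topic)}" for topic in digest["topics"]),
    ]
    keyboard = [
        [_button(label, action, video_id) for label, action in BUTTONS[:2]],
        [_button(label, action, video_id) for label, action in BUTTONS[2:]],
    ]
    return {
        "chat_id": chat_id,
        "text": "\n".join(lines),
        "parse_mode": "HTML",
        "reply_markup": {"inline_keyboard": keyboard},
    }
```

Putting the video ID in `callback_data` lets the handler for a button press find the stored digest without keeping per-message state.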

Qdrant point:

{
  "collection": "youtube_digests",
  "vector": embedding,
  "payload": {
    "video_id": "dQw4w9WgXcQ",
    "title": "...",
    "tldr": "...",
    "key_points": [...],
    "topics": [...]
  }
}

Data Table columns:

| Column | Type | Description |
| --- | --- | --- |
| video_id | string | YouTube video ID |
| title | string | Video title |
| channel | string | Channel name |
| duration | number | Duration in seconds |
| processed_at | datetime | Processing date |
| tldr | text | Short summary |
| key_points | json | Key points list |
| topics | json | Extracted tags |
| qdrant_id | string | Qdrant vector ID |
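
The two storage records can be derived from the same inputs, as in the sketch below. In Qdrant's REST API the collection name goes in the URL path (`PUT /collections/youtube_digests/points`) and the body holds a `points` list. Point IDs must be unsigned integers or UUIDs, so this sketch derives a deterministic UUID from the video ID. That derivation and the names `point_id` and `build_records` are assumptions, not the workflow's confirmed scheme.

```python
import uuid
from datetime import datetime, timezone
from typing import Sequence, Tuple


def point_id(video_id: str) -> str:
    """Derive a stable UUID for a video, so re-processing overwrites the point."""
    url = f"https://www.youtube.com/watch?v={video_id}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, url))


def build_records(video: dict, digest: dict, embedding: Sequence[float]) -> Tuple[dict, dict]:
    """Return (Qdrant upsert body, Data Table row) for one processed video."""
    qid = point_id(video["video_id"])
    upsert_body = {
        "points": [
            {
                "id": qid,
                "vector": list(embedding),
                "payload": {
                    "video_id": video["video_id"],
                    "title": video["title"],
                    "tldr": digest["tldr"],
                    "key_points": digest["key_points"],
                    "topics": digest["topics"],
                },
            }
        ]
    }
    row = {
        "video_id": video["video_id"],
        "title": video["title"],
        "channel": video["channel"],
        "duration": video["duration"],  # seconds
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "tldr": digest["tldr"],
        "key_points": digest["key_points"],
        "topics": digest["topics"],
        "qdrant_id": qid,
    }
    return upsert_body, row
```

Storing `qdrant_id` in the row keeps the table and the vector store linked, and a deterministic ID makes the upsert idempotent if a video is processed twice.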

| Operation | 10-min video | 60-min video |
| --- | --- | --- |
| YouTube captions | Free | Free |
| Whisper (if needed) | ~$0.06 | ~$0.36 |
| Claude TLDR | ~$0.01 | ~$0.05 |
| Embedding | ~$0.0001 | ~$0.001 |
| Total | ~$0.01-0.07 | ~$0.05-0.41 |

| Limit | Impact | Mitigation |
| --- | --- | --- |
| Videos > 2h | Possible timeout | Manual confirmation |
| Missing captions | Whisper cost | Acceptable for important videos |
| Multi-language | Summary in source language | Optional translation |
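
The totals in the cost table can be reproduced with a few lines of arithmetic. The sketch below takes the Whisper rate from the ~$0.36/h figure and treats the TLDR and embedding costs as inputs read off the table, since the document gives no per-token rates. The function names are hypothetical.

```python
from typing import Tuple

WHISPER_USD_PER_MIN = 0.36 / 60  # ~$0.36 per hour of audio


def whisper_cost(duration_min: float) -> float:
    """Whisper cost in USD for a video of the given length."""
    return round(duration_min * WHISPER_USD_PER_MIN, 4)


def total_range(duration_min: float, tldr_usd: float, embedding_usd: float) -> Tuple[float, float]:
    """Return (cost with free captions, cost with the Whisper fallback)."""
    base = tldr_usd + embedding_usd
    return round(base, 4), round(base + whisper_cost(duration_min), 4)


low_10, high_10 = total_range(10, tldr_usd=0.01, embedding_usd=0.0001)
low_60, high_60 = total_range(60, tldr_usd=0.05, embedding_usd=0.001)
print(f"10 min: ${low_10:.2f}-{high_10:.2f}")
print(f"60 min: ${low_60:.2f}-{high_60:.2f}")
```

The lower bound applies when captions exist, the upper bound when the Whisper fallback is needed, which is why missing captions are the main cost driver.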

If multi-language is needed:

  • Auto-detect language
  • Optional summary translation
  • Multilingual embeddings

If video volume grows:

  • Filter by relevance score
  • Grouped daily digest
  • Priority queue

If advanced search is needed:

  • Web interface for Qdrant search
  • Filters by topic/channel
  • Similar video suggestions

Terminal window:

# Test video_id extraction (grep -P requires GNU grep)
echo "https://www.youtube.com/watch?v=dQw4w9WgXcQ" | \
  grep -oP '(?<=v=)[^&]+'

# Check available captions
yt-dlp --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"

# Download audio for testing
yt-dlp -x -f bestaudio "https://www.youtube.com/watch?v=VIDEO_ID"

# Semantic search in Qdrant
curl -X POST "http://qdrant:6333/collections/youtube_digests/points/search" \
  -H "Content-Type: application/json" \
  -d '{"vector": [...], "limit": 5}'

| Problem | Check |
| --- | --- |
| No captions | Video without subtitles → Whisper fallback |
| Whisper timeout | File too long? Increase timeout |
| Claude error | Quota exhausted? Check claude-ollama usage |
| Qdrant 404 | Collection exists? Create if needed |
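
For the "Qdrant 404" row, a minimal sketch of the "create if needed" check is shown below, using only the standard library. It relies on Qdrant's REST endpoints `GET /collections/{name}` and `PUT /collections/{name}`. The function names, the Cosine distance and the `vector_size` value are assumptions. The size must match the embedding model in use, which the document does not specify.

```python
import json
import urllib.error
import urllib.request


def create_collection_request(base_url: str, name: str, vector_size: int,
                              distance: str = "Cosine") -> urllib.request.Request:
    """Build the PUT request that creates a collection with one vector config."""
    body = json.dumps({"vectors": {"size": vector_size, "distance": distance}})
    return urllib.request.Request(
        f"{base_url}/collections/{name}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def ensure_collection(base_url: str, name: str, vector_size: int,
                      opener=urllib.request.urlopen) -> bool:
    """Create the collection if it is missing. Returns True when it was created."""
    try:
        opener(urllib.request.Request(f"{base_url}/collections/{name}"))
        return False  # the collection already exists
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise  # anything but "not found" is a real error
    opener(create_collection_request(base_url, name, vector_size))
    return True
```

A call such as `ensure_collection("http://qdrant:6333", "youtube_digests", vector_size=...)` could run once before the first upsert. The `opener` parameter exists only so the function can be exercised without a live Qdrant instance.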