tubescribe

Name: tubescribe
Author: openclaw

YouTube video summarizer with speaker detection, formatted documents, and audio output. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.

Install

mkdir -p .claude/skills/tubescribe && curl -L -o skill.zip "https://mcp.directory/api/skills/download/6945" && unzip -o skill.zip -d .claude/skills/tubescribe && rm skill.zip

Installs to .claude/skills/tubescribe

About this skill

TubeScribe 🎬

Turn any YouTube video into a polished document + audio summary.

Drop a YouTube link → get a beautiful transcript with speaker labels, key quotes, timestamps that link back to the video, and an audio summary you can listen to on the go.

💸 Free & No Paid APIs

No subscriptions or API keys — works out of the box
Local processing — transcription, speaker detection, and TTS run on your machine
Network access — fetching from YouTube (captions, metadata, comments) requires internet
No data uploaded — nothing is sent to external services; all processing stays on your machine
Safe sub-agent — spawned sub-agent has strict instructions: no software installation, no network calls beyond YouTube

✨ Features

📄 Transcript with summary and key quotes — Export as DOCX, HTML, or Markdown
🎯 Smart Speaker Detection — Automatically identifies participants
🔊 Audio Summaries — Listen to key points (MP3/WAV)
📝 Clickable Timestamps — Every quote links directly to that moment in the video
💬 YouTube Comments — Viewer sentiment analysis and best comments
📋 Queue Support — Send multiple links, they get processed in order
🚀 Non-Blocking Workflow — Conversation continues while video processes in background
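The clickable timestamps can be illustrated with a small helper. This is a minimal sketch, not TubeScribe's own code: the function names `format_timestamp` and `timestamp_link` are hypothetical, and the link format is taken from the Key Quotes template in the workflow (`[12:34](https://www.youtube.com/watch?v=ID&t=754s)`). Switching to `H:MM:SS` past one hour is an assumption.

```python
def format_timestamp(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS once the offset passes an hour."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def timestamp_link(video_id: str, seconds: int) -> str:
    """Build a Markdown link that opens the video at the given offset."""
    url = f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"
    return f"[{format_timestamp(seconds)}]({url})"


print(timestamp_link("ID", 754))  # [12:34](https://www.youtube.com/watch?v=ID&t=754s)
```

The `t=` query parameter carries the offset in whole seconds, so a quote at 25:10 links with `t=1510s`.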

🎬 Works With Any Video

Interviews & podcasts (multi-speaker detection)
Lectures & tutorials (single speaker)
Music videos (lyrics extraction)
News & documentaries
Any YouTube content with captions

Quick Start

When user sends a YouTube URL:

Spawn sub-agent with the full pipeline task immediately
Reply: "🎬 TubeScribe is processing — I'll let you know when it's ready!"
Continue conversation (don't wait!)
Sub-agent notification will announce completion with title and details

DO NOT BLOCK — spawn and move on instantly.

First-Time Setup

Run setup to check dependencies and configure defaults:

python3 skills/tubescribe/scripts/setup.py

This checks: summarize CLI, pandoc, ffmpeg, Kokoro TTS

Full Workflow (Single Sub-Agent)

Spawn ONE sub-agent that does the entire pipeline:

sessions_spawn(
    task=f"""
## TubeScribe: Process {youtube_url}

⚠️ CRITICAL: Do NOT install any software.
No pip, brew, curl, venv, or binary downloads.
If a tool is missing, STOP and report what's needed.

Run the COMPLETE pipeline — do not stop until all steps are done.

### Step 1: Extract
```bash
python3 skills/tubescribe/scripts/tubescribe.py "{youtube_url}"
```

Note the Source and Output paths printed by the script. Use those exact paths in subsequent steps.

### Step 2: Read source JSON

Read the Source path from Step 1 output and note:

metadata.title (for filename)
metadata.video_id
metadata.channel, upload_date, duration_string

### Step 3: Create formatted markdown

Write to the Output path from Step 1:

# **<title>**

Video info block — Channel, Date, Duration, URL (clickable). Empty line between each field.

## **Participants** — table with bold headers:

| **Name** | **Role** | **Description** |
|----------|----------|-----------------|

## **Summary** — 3-5 paragraphs of prose

## **Key Quotes** — 5 best with clickable YouTube timestamps. Format each as:
```
"Quote text here." - [12:34](https://www.youtube.com/watch?v=ID&t=754s)

"Another quote." - [25:10](https://www.youtube.com/watch?v=ID&t=1510s)
```
Use regular dash -, NOT em dash —. Do NOT use blockquotes >. Plain paragraphs only.

## **Viewer Sentiment** (if comments exist)

## **Best Comments** (if comments exist) — Top 5, NO lines between them:
```
Comment text here.

*- ▲ 123 @AuthorName*

Next comment text here.

*- ▲ 45 @AnotherAuthor*
```
Attribution line: dash + italic. Just blank line between comments, NO --- separators.

## **Full Transcript** — merge segments, speaker labels, clickable timestamps

### Step 4: Create DOCX

Clean the title for filename (remove special chars), then:

pandoc <output_path> -o ~/Documents/TubeScribe/<safe_title>.docx

### Step 5: Generate audio

Write the summary text to a temp file, then use TubeScribe's built-in audio generation:

# Write summary to temp file (use python3 to write, avoids shell escaping issues)
python3 -c "
text = '''YOUR SUMMARY TEXT HERE'''
with open('<temp_dir>/tubescribe_<video_id>_summary.txt', 'w') as f:
    f.write(text)
"

# Generate audio (auto-detects engine, voice, format from config)
python3 skills/tubescribe/scripts/tubescribe.py \
  --generate-audio <temp_dir>/tubescribe_<video_id>_summary.txt \
  --audio-output ~/Documents/TubeScribe/<safe_title>_summary

This reads ~/.tubescribe/config.json and uses the configured TTS engine (mlx/kokoro/builtin), voice blend, and speed automatically. Output format (mp3/wav) comes from config.

### Step 6: Cleanup

python3 skills/tubescribe/scripts/tubescribe.py --cleanup <video_id>

### Step 7: Open folder

open ~/Documents/TubeScribe/

### Report

Tell what was created: DOCX name, MP3 name + duration, video stats.
""",
    label="tubescribe",
    runTimeoutSeconds=900,
    cleanup="delete"
)


**After spawning, reply immediately:**
> 🎬 TubeScribe is processing - I'll let you know when it's ready!

Then continue the conversation. The sub-agent notification announces completion.
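Step 4 of the pipeline asks for the title to be cleaned for use as a filename ("remove special chars") without spelling out the rule. The sketch below shows one plausible interpretation. The function name `safe_title`, the choice to keep word characters, spaces, and hyphens, and the `untitled` fallback are all assumptions, not TubeScribe's documented behavior.

```python
import re


def safe_title(title: str, fallback: str = "untitled") -> str:
    """Strip characters that are awkward in filenames (assumed rule)."""
    # Keep Unicode word characters, whitespace, and hyphens; drop everything else.
    cleaned = re.sub(r"[^\w\s-]", "", title)
    # Collapse runs of whitespace left behind by removed punctuation.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned or fallback


print(safe_title('What is AI? / Part 1: "Intro"'))  # What is AI Part 1 Intro
```

Because `\w` is Unicode-aware in Python 3, non-English titles keep their letters; only punctuation and path separators are removed.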

## Configuration

Config file: `~/.tubescribe/config.json`

```json
{
  "output": {
    "folder": "~/Documents/TubeScribe",
    "open_folder_after": true,
    "open_document_after": false,
    "open_audio_after": false
  },
  "document": {
    "format": "docx",
    "engine": "pandoc"
  },
  "audio": {
    "enabled": true,
    "format": "mp3",
    "tts_engine": "mlx"
  },
  "mlx_audio": {
    "path": "~/.openclaw/tools/mlx-audio",
    "model": "mlx-community/Kokoro-82M-bf16",
    "voice": "af_heart",
    "lang_code": "a",
    "speed": 1.05
  },
  "kokoro": {
    "path": "~/.openclaw/tools/kokoro",
    "voice_blend": { "af_heart": 0.6, "af_sky": 0.4 },
    "speed": 1.05
  },
  "processing": {
    "subagent_timeout": 600,
    "cleanup_temp_files": true
  }
}
```

Output Options

| Option | Default | Description |
|--------|---------|-------------|
| `output.folder` | `~/Documents/TubeScribe` | Where to save files |
| `output.open_folder_after` | `true` | Open output folder when done |
| `output.open_document_after` | `false` | Auto-open generated document |
| `output.open_audio_after` | `false` | Auto-open generated audio summary |

Document Options

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| `document.format` | `docx` | `docx`, `html`, `md` | Output format |
| `document.engine` | `pandoc` | `pandoc` | Converter for DOCX (falls back to HTML) |
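The table notes that DOCX output falls back to HTML when the converter is unavailable. A minimal sketch of that decision follows. The function name `resolve_document_format` is hypothetical, and detecting pandoc with `shutil.which` is an assumption about how availability might be checked.

```python
import shutil

SUPPORTED_FORMATS = {"docx", "html", "md"}


def resolve_document_format(requested: str, pandoc_available: bool) -> str:
    """Return the format to actually produce, given pandoc availability."""
    fmt = requested.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported document format: {requested!r}")
    # DOCX needs pandoc; without it, fall back to HTML as the table describes.
    if fmt == "docx" and not pandoc_available:
        return "html"
    return fmt


fmt = resolve_document_format("docx", shutil.which("pandoc") is not None)
print(fmt)
```

Passing availability in as a parameter keeps the decision easy to test without pandoc installed.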

Audio Options

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| `audio.enabled` | `true` | `true`, `false` | Generate audio summary |
| `audio.format` | `mp3` | `mp3`, `wav` | Audio format (mp3 needs ffmpeg) |
| `audio.tts_engine` | `mlx` | `mlx`, `kokoro`, `builtin` | TTS engine (mlx = fastest on Apple Silicon) |

MLX-Audio Options (preferred on Apple Silicon)

| Option | Default | Description |
|--------|---------|-------------|
| `mlx_audio.path` | `~/.openclaw/tools/mlx-audio` | mlx-audio venv location |
| `mlx_audio.model` | `mlx-community/Kokoro-82M-bf16` | MLX model to use |
| `mlx_audio.voice` | `af_heart` | Voice preset (used if no voice_blend) |
| `mlx_audio.voice_blend` | `{af_heart: 0.6, af_sky: 0.4}` | Custom voice mix (weighted blend) |
| `mlx_audio.lang_code` | `a` | Language code (a=US English) |
| `mlx_audio.speed` | `1.05` | Playback speed (1.0 = normal, 1.05 = 5% faster) |

Kokoro PyTorch Options (fallback)

| Option | Default | Description |
|--------|---------|-------------|
| `kokoro.path` | `~/.openclaw/tools/kokoro` | Kokoro repo location |
| `kokoro.voice_blend` | `{af_heart: 0.6, af_sky: 0.4}` | Custom voice mix |
| `kokoro.speed` | `1.05` | Playback speed (1.0 = normal, 1.05 = 5% faster) |

Processing Options

| Option | Default | Description |
|--------|---------|-------------|
| `processing.subagent_timeout` | `600` | Seconds for sub-agent (increase for long videos) |
| `processing.cleanup_temp_files` | `true` | Remove /tmp files after completion |

Comment Options

| Option | Default | Description |
|--------|---------|-------------|
| `comments.max_count` | `50` | Number of comments to fetch |
| `comments.timeout` | `90` | Timeout for comment fetching (seconds) |

Queue Options

| Option | Default | Description |
|--------|---------|-------------|
| `queue.stale_minutes` | `30` | Consider a processing job stale after this many minutes |
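Since every option above has a default, a partial `config.json` presumably only needs to list the values being changed. The sketch below shows one way such a layered lookup could work. Both `deep_merge` and `load_config` are hypothetical names, the `DEFAULTS` dictionary copies only a subset of the documented defaults, and whether TubeScribe merges its config this way is an assumption.

```python
import copy
import json
from pathlib import Path

# A subset of the documented defaults, for illustration only.
DEFAULTS = {
    "output": {"folder": "~/Documents/TubeScribe", "open_folder_after": True},
    "document": {"format": "docx", "engine": "pandoc"},
    "audio": {"enabled": True, "format": "mp3", "tts_engine": "mlx"},
    "comments": {"max_count": 50, "timeout": 90},
    "queue": {"stale_minutes": 30},
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied section by section."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: str = "~/.tubescribe/config.json") -> dict:
    """Read the user config if it exists and layer it over DEFAULTS."""
    config_path = Path(path).expanduser()
    if not config_path.exists():
        return copy.deepcopy(DEFAULTS)
    with config_path.open(encoding="utf-8") as f:
        return deep_merge(DEFAULTS, json.load(f))


# Overriding one key leaves the rest of its section at the defaults.
custom = deep_merge(DEFAULTS, {"audio": {"format": "wav"}})
print(custom["audio"])
```

Merging per section rather than replacing whole sections means a user who sets only `audio.format` keeps the default `audio.tts_engine`.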

Output Structure

~/Documents/TubeScribe/
├── {Video Title}.html         # Formatted document (or .docx / .md)
└── {Video Title}_summary.mp3  # Audio summary (or .wav)

After generation, TubeScribe opens the output folder by default (not the individual files), so every generated file is in one place.

Dependencies

Required:

summarize CLI — brew install steipete/tap/summarize
Python 3.8+

Optional (better quality):

pandoc — DOCX output: brew install pandoc
ffmpeg — MP3 audio: brew install ffmpeg
yt-dlp — YouTube comments: brew install yt-dlp
mlx-audio — Fastest TTS on Apple Silicon: pip install mlx-audio (uses MLX backend for Kokoro)
Kokoro TTS — PyTorch fallback: see https://github.com/hexgrad/kokoro

yt-dlp Search Paths

TubeScribe checks these locations (in order):

| Priority | Path | Source |
|----------|------|--------|
| 1 | `which yt-dlp` | System PATH |
| 2 | `/opt/homebrew/bin/yt-dlp` | Homebrew (Apple Silicon) |
| 3 | `/usr/local/bin/yt-dlp` | Homebrew (Intel) / Linux |
| 4 | `~/.local/bin/yt-dlp` | pip install --user |
| 5 | `~/.local/pipx/venvs/yt-dlp/bin/yt-dlp` | pipx |
| 6 | `~/.openclaw/tools/yt-dlp/yt-dlp` | Tub |
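The lookup order in the table can be sketched as a PATH check followed by a scan of fixed locations. This is an illustration under assumptions, not TubeScribe's actual code: the function name `find_yt_dlp` is hypothetical, and the candidate list covers only the six rows visible here.

```python
import os
import shutil
from typing import Callable, Optional, Sequence

# Fixed locations from the table, in priority order (after the PATH lookup).
CANDIDATE_PATHS = [
    "/opt/homebrew/bin/yt-dlp",
    "/usr/local/bin/yt-dlp",
    "~/.local/bin/yt-dlp",
    "~/.local/pipx/venvs/yt-dlp/bin/yt-dlp",
    "~/.openclaw/tools/yt-dlp/yt-dlp",
]


def find_yt_dlp(
    candidates: Sequence[str] = CANDIDATE_PATHS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first usable yt-dlp path, or None if none is found."""
    # Priority 1: whatever the system PATH resolves to.
    found = which("yt-dlp")
    if found:
        return found
    # Priorities 2 and up: known install locations, checked in order.
    for candidate in candidates:
        path = os.path.expanduser(candidate)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


print(find_yt_dlp() or "yt-dlp not found; comments will be unavailable")
```

Since yt-dlp is listed as optional (it is only needed for YouTube comments), returning `None` rather than raising lets the caller skip the comment sections.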

Content truncated.
