baoyu-url-to-markdown
Fetch any URL and convert to markdown using Chrome CDP. Supports two modes - auto-capture on page load, or wait for user signal (for pages requiring login). Use when user wants to save a webpage as markdown.
Install
mkdir -p .claude/skills/baoyu-url-to-markdown && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2005" && unzip -o skill.zip -d .claude/skills/baoyu-url-to-markdown && rm skill.zipInstalls to .claude/skills/baoyu-url-to-markdown
About this skill
URL to Markdown
Fetches any URL via Chrome CDP, saves the rendered HTML snapshot, and converts it to clean markdown.
Script Directory
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
- Determine this SKILL.md file's directory path as
{baseDir} - Script path =
{baseDir}/scripts/<script-name>.ts - Resolve
${BUN_X}runtime: ifbuninstalled →bun; ifnpxavailable →npx -y bun; else suggest installing bun - Replace all
{baseDir}and${BUN_X}in this document with actual values
Script Reference:
| Script | Purpose |
|---|---|
scripts/main.ts | CLI entry point for URL fetching |
scripts/html-to-markdown.ts | Defuddle-first conversion with automatic legacy fallback |
Preferences (EXTEND.md)
Check EXTEND.md existence (priority order):
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-url-to-markdown/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-url-to-markdown/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "user" }
┌────────────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ Project directory │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ User home │ └────────────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ MUST run first-time setup (see below) — do NOT silently create defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘
EXTEND.md Supports: Download media by default | Default output directory | Default capture mode | Timeout settings
First-Time Setup (BLOCKING)
CRITICAL: When EXTEND.md is not found, you MUST use AskUserQuestion to ask the user for their preferences before creating EXTEND.md. NEVER create EXTEND.md with defaults without asking. This is a BLOCKING operation — do NOT proceed with any conversion until setup is complete.
Use AskUserQuestion with ALL questions in ONE call:
Question 1 — header: "Media", question: "How to handle images and videos in pages?"
- "Ask each time (Recommended)" — After saving markdown, ask whether to download media
- "Always download" — Always download media to local imgs/ and videos/ directories
- "Never download" — Keep original remote URLs in markdown
Question 2 — header: "Output", question: "Default output directory?"
- "url-to-markdown (Recommended)" — Save to ./url-to-markdown/{domain}/{slug}.md
- (User may choose "Other" to type a custom path)
Question 3 — header: "Save", question: "Where to save preferences?"
- "User (Recommended)" — ~/.baoyu-skills/ (all projects)
- "Project" — .baoyu-skills/ (this project only)
After user answers, create EXTEND.md at the chosen location, confirm "Preferences saved to [path]", then continue.
Full reference: references/config/first-time-setup.md
Supported Keys
| Key | Default | Values | Description |
|---|---|---|---|
download_media | ask | ask / 1 / 0 | ask = prompt each time, 1 = always download, 0 = never |
default_output_dir | empty | path or empty | Default output directory (empty = ./url-to-markdown/) |
EXTEND.md → CLI mapping:
| EXTEND.md key | CLI argument | Notes |
|---|---|---|
download_media: 1 | --download-media | |
default_output_dir: ./posts/ | --output-dir ./posts/ | Directory path. Do NOT pass to -o (which expects a file path) |
Value priority:
- CLI arguments (
--download-media,-o,--output-dir) - EXTEND.md
- Skill defaults
Features
- Chrome CDP for full JavaScript rendering
- Two capture modes: auto or wait-for-user
- Save rendered HTML as a sibling
-captured.htmlfile - Clean markdown output with metadata
- Defuddle-first markdown conversion with automatic fallback to the pre-Defuddle extractor from git history
- Handles login-required pages via wait mode
- Download images and videos to local directories
Usage
# Auto mode (default) - capture when page loads
${BUN_X} {baseDir}/scripts/main.ts <url>
# Wait mode - wait for user signal before capture
${BUN_X} {baseDir}/scripts/main.ts <url> --wait
# Save to specific file
${BUN_X} {baseDir}/scripts/main.ts <url> -o output.md
# Save to a custom output directory (auto-generates filename)
${BUN_X} {baseDir}/scripts/main.ts <url> --output-dir ./posts/
# Download images and videos to local directories
${BUN_X} {baseDir}/scripts/main.ts <url> --download-media
Options
| Option | Description |
|---|---|
<url> | URL to fetch |
-o <path> | Output file path — must be a file path, not directory (default: auto-generated) |
--output-dir <dir> | Base output directory — auto-generates {dir}/{domain}/{slug}.md (default: ./url-to-markdown/) |
--wait | Wait for user signal before capturing |
--timeout <ms> | Page load timeout (default: 30000) |
--download-media | Download image/video assets to local imgs/ and videos/, and rewrite markdown links to local relative paths |
Capture Modes
| Mode | Behavior | Use When |
|---|---|---|
| Auto (default) | Capture on network idle | Public pages, static content |
Wait (--wait) | User signals when ready | Login-required, lazy loading, paywalls |
Wait mode workflow:
- Run with
--wait→ script outputs "Press Enter when ready" - Ask user to confirm page is ready
- Send newline to stdin to trigger capture
Output Format
Each run saves two files side by side:
- Markdown: YAML front matter with
url,title,description,author,published, optionalcoverImage, andcaptured_at, followed by converted markdown content - HTML snapshot:
*-captured.html, containing the rendered page HTML captured from Chrome
The HTML snapshot is saved before any markdown media localization, so it stays a faithful capture of the page DOM used for conversion.
Output Directory
Default: url-to-markdown/<domain>/<slug>.md
With --output-dir ./posts/: ./posts/<domain>/<slug>.md
HTML snapshot path uses the same basename:
-
url-to-markdown/<domain>/<slug>-captured.html -
./posts/<domain>/<slug>-captured.html -
<slug>: From page title or URL path (kebab-case, 2-6 words) -
Conflict resolution: Append timestamp
<slug>-YYYYMMDD-HHMMSS.md
When --download-media is enabled:
- Images are saved to
imgs/next to the markdown file - Videos are saved to
videos/next to the markdown file - Markdown media links are rewritten to local relative paths
Conversion Fallback
Conversion order:
- Try Defuddle first
- If Defuddle throws, cannot load, returns obviously incomplete markdown, or captures lower-quality content than the legacy pipeline, automatically fall back to the pre-Defuddle extractor
- The fallback path uses the older Readability/selector/Next.js-data based HTML-to-Markdown implementation recovered from git history
CLI output will show:
Converter: defuddlewhen Defuddle succeedsConverter: legacy:...plusFallback used: ...when fallback was needed
Media Download Workflow
Based on download_media setting in EXTEND.md:
| Setting | Behavior |
|---|---|
1 (always) | Run script with --download-media flag |
0 (never) | Run script without --download-media flag |
ask (default) | Follow the ask-each-time flow below |
Ask-Each-Time Flow
- Run script without
--download-media→ markdown saved - Check saved markdown for remote media URLs (
https://in image/video links) - If no remote media found → done, no prompt needed
- If remote media found → use
AskUserQuestion:- header: "Media", question: "Download N images/videos to local files?"
- "Yes" — Download to local directories
- "No" — Keep remote URLs
- If user confirms → run script again with
--download-media(overwrites markdown with localized links)
Environment Variables
| Variable | Description |
|---|---|
URL_CHROME_PATH | Custom Chrome executable path |
URL_DATA_DIR | Custom data directory |
URL_CHROME_PROFILE_DIR | Custom Chrome profile directory |
Troubleshooting: Chrome not found → set URL_CHROME_PATH. Timeout → increase --timeout. Complex pages → try --wait mode. If markdown quality is poor, inspect the saved -captured.html and check whether the run logged a legacy fallback.
Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.
More by JimLiu
View all skills by JimLiu →You might also like
flutter-development
aj-geddes
Build beautiful cross-platform mobile apps with Flutter and Dart. Covers widgets, state management with Provider/BLoC, navigation, API integration, and material design.
drawio-diagrams-enhanced
jgtolentino
Create professional draw.io (diagrams.net) diagrams in XML format (.drawio files) with integrated PMP/PMBOK methodologies, extensive visual asset libraries, and industry-standard professional templates. Use this skill when users ask to create flowcharts, swimlane diagrams, cross-functional flowcharts, org charts, network diagrams, UML diagrams, BPMN, project management diagrams (WBS, Gantt, PERT, RACI), risk matrices, stakeholder maps, or any other visual diagram in draw.io format. This skill includes access to custom shape libraries for icons, clipart, and professional symbols.
godot
bfollington
This skill should be used when working on Godot Engine projects. It provides specialized knowledge of Godot's file formats (.gd, .tscn, .tres), architecture patterns (component-based, signal-driven, resource-based), common pitfalls, validation tools, code templates, and CLI workflows. The `godot` command is available for running the game, validating scripts, importing resources, and exporting builds. Use this skill for tasks involving Godot game development, debugging scene/resource files, implementing game systems, or creating new Godot components.
ui-ux-pro-max
nextlevelbuilder
"UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 8 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient."
nano-banana-pro
garg-aayush
Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.
fastapi-templates
wshobson
Create production-ready FastAPI projects with async patterns, dependency injection, and comprehensive error handling. Use when building new FastAPI applications or setting up backend API projects.
Related MCP Servers
Browse all serversFetch and Convert turns web content into Markdown using JSDOM and Turndown—perfect for link markdown and md format needs
docs.rs — Search Rust crates, fetch READMEs and docs, and explore API items with markdown-converted scraping for fast, p
Easily convert markdown to PDF using Markitdown MCP server. Supports HTTP, STDIO, and SSE for fast converting markdown t
Web Fetcher uses Playwright for reliable data web scraping and extraction from JavaScript-heavy websites, returning clea
Convert files easily with File Format Converter (Pandoc): transform PDF, HTML, Markdown, HEIC to JPG, JPG to PDF, and mo
Transform your notes with Markdown Mindmap—convert Markdown into interactive mind maps for organized, visual knowledge r
Stay ahead of the MCP ecosystem
Get weekly updates on new skills and servers.