Read Website Fast

by just-every

Extracts web content and converts it to clean Markdown using Mozilla Readability for intelligent article detection, with disk-based caching, robots.txt compliance, rate limiting, and concurrent crawling for fast, token-efficient AI content-processing workflows.


What it does

  • Extract article content from websites
  • Convert HTML to clean Markdown
  • Cache content locally for faster repeat access
  • Crawl multiple pages concurrently
  • Respect robots.txt and rate limits
  • Preserve links for knowledge graphs

Best for

  • AI agents analyzing web content
  • Documentation research workflows
  • Content extraction for LLM processing
  • Building knowledge bases from web sources

Token-efficient output • Mozilla Readability engine • Disk-based caching

About Read Website Fast

Read Website Fast is a community-built MCP server published by just-every that provides AI assistants with tools and capabilities via the Model Context Protocol. It extracts web content and converts it to clean Markdown, offering fast data extraction from web pages with caching and robots.txt support. It is categorized under search web and productivity. The server exposes one tool that AI clients can invoke during conversations and coding sessions.

How to install

You can install Read Website Fast in your AI client of choice. Use the install panel on this page for one-click setup in Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. The server runs locally on your machine via the stdio transport, launched on demand with npx.

License

Read Website Fast is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Tools (1)

read_website

Fast, token-efficient web content extraction - ideal for reading documentation, analyzing content, and gathering information from websites. Converts to clean Markdown while preserving links and structure.

@just-every/mcp-read-website-fast

Fast, token-efficient web content extraction for AI agents - converts websites to clean Markdown.

Overview

Existing MCP web crawlers are slow and consume large quantities of tokens. This stalls the development loop and yields incomplete results, since LLMs must parse entire web pages.

This MCP package fetches web pages locally, strips noise, and converts content to clean Markdown while preserving links. Designed for Claude Code, IDEs and LLM pipelines with minimal token footprint. Crawl sites locally with minimal dependencies.

Note: This package now uses @just-every/crawl for its core crawling and markdown conversion functionality.

Features

  • Fast startup using official MCP SDK with lazy loading for optimal performance
  • Content extraction using Mozilla Readability (same as Firefox Reader View)
  • HTML to Markdown conversion with Turndown + GFM support
  • Smart caching with SHA-256 hashed URLs
  • Polite crawling with robots.txt support and rate limiting
  • Concurrent fetching with configurable depth crawling
  • Stream-first design for low memory usage
  • Link preservation for knowledge graphs
  • Optional chunking for downstream processing
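The "smart caching" feature above keys cache entries by a SHA-256 hash of the URL. A minimal sketch of that scheme in shell (the exact key format and on-disk layout used by the package are assumptions here):

```shell
# Derive a cache key for a URL by hashing it with SHA-256
# (assumed scheme, matching the "SHA-256 hashed URLs" feature;
# the package's real filename layout may differ).
url="https://example.com/article"
key=$(printf '%s' "$url" | sha256sum | cut -d' ' -f1)
echo "$key"   # 64 hex characters, safe to use as a cache filename
```

Hashing rather than sanitizing the URL keeps cache filenames fixed-length and filesystem-safe regardless of query strings or non-ASCII path segments.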

Installation

Claude Code

claude mcp add read-website-fast -s user -- npx -y @just-every/mcp-read-website-fast

VS Code

code --add-mcp '{"name":"read-website-fast","command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}'

Cursor

cursor://anysphere.cursor-deeplink/mcp/install?name=read-website-fast&config=eyJyZWFkLXdlYnNpdGUtZmFzdCI6eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqdXN0LWV2ZXJ5L21jcC1yZWFkLXdlYnNpdGUtZmFzdCJdfX0=

JetBrains IDEs

Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add

Choose “As JSON” and paste:

{"command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}

Or, in the chat window, type /add and fill in the same JSON; both paths install the server in a single step.

Raw JSON (works in any MCP client)

{
  "mcpServers": {
    "read-website-fast": {
      "command": "npx",
      "args": ["-y", "@just-every/mcp-read-website-fast"]
    }
  }
}

Drop this into your client’s mcp.json (e.g. .vscode/mcp.json, ~/.cursor/mcp.json, or .mcp.json for Claude).

Available Tools

  • read_website - Fetches a webpage and converts it to clean markdown
    • Parameters:
      • url (required): The HTTP/HTTPS URL to fetch
      • pages (optional): Maximum number of pages to crawl (default: 1, max: 100)
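On the wire, invoking the tool uses the standard MCP JSON-RPC tools/call shape; roughly like this (the URL and id are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "read_website",
    "arguments": {
      "url": "https://example.com/article",
      "pages": 1
    }
  }
}
```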

Available Resources

  • read-website-fast://status - Get cache statistics
  • read-website-fast://clear-cache - Clear the cache directory
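Clients fetch these with a standard MCP resources/read request, e.g. for cache statistics (the id is illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": { "uri": "read-website-fast://status" }
}
```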

Development Usage

Install

npm install
npm run build

Single page fetch

npm run dev fetch https://example.com/article

Crawl with depth

npm run dev fetch https://example.com --depth 2 --concurrency 5

Output formats

# Markdown only (default)
npm run dev fetch https://example.com

# JSON output with metadata
npm run dev fetch https://example.com --output json

# Both URL and markdown
npm run dev fetch https://example.com --output both

CLI Options

  • -p, --pages <number> - Maximum number of pages to crawl (default: 1)
  • -c, --concurrency <number> - Max concurrent requests (default: 3)
  • --no-robots - Ignore robots.txt
  • --all-origins - Allow cross-origin crawling
  • -u, --user-agent <string> - Custom user agent
  • --cache-dir <path> - Cache directory (default: .cache)
  • -t, --timeout <ms> - Request timeout in milliseconds (default: 30000)
  • -o, --output <format> - Output format: json, markdown, or both (default: markdown)
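These options combine freely; for example (URL and values are illustrative):

```shell
# Crawl up to 10 pages, 5 at a time, with a 60 s timeout and JSON output
npm run dev fetch https://example.com/docs --pages 10 --concurrency 5 --timeout 60000 --output json
```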

Clear cache

npm run dev clear-cache

Auto-Restart Feature

The MCP server includes automatic restart capability by default for improved reliability:

  • Automatically restarts the server if it crashes
  • Handles unhandled exceptions and promise rejections
  • Implements exponential backoff (max 10 attempts in 1 minute)
  • Logs all restart attempts for monitoring
  • Gracefully handles shutdown signals (SIGINT, SIGTERM)
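The restart loop can be pictured as the sketch below (assumed logic; the real implementation lives in src/serve-restart.ts and is written in TypeScript, not shell):

```shell
# Restart a command with exponential backoff until it exits cleanly
# or the attempt budget is spent (assumed behavior of the wrapper;
# MAX_RESTARTS defaults to 10 as the docs describe).
run_with_restart() {
  max_attempts="${MAX_RESTARTS:-10}"
  attempt=0
  delay=1
  while [ "$attempt" -lt "$max_attempts" ]; do
    "$@" && return 0                 # clean exit: stop restarting
    attempt=$((attempt + 1))
    echo "crashed; restart $attempt/$max_attempts in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))             # exponential backoff
  done
  return 1                           # budget exhausted
}

# e.g. run_with_restart node dist/serve.js
```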

For development/debugging without auto-restart:

# Run directly without restart wrapper
npm run serve:dev

Architecture

mcp/
├── src/
│   ├── crawler/        # URL fetching, queue management, robots.txt
│   ├── parser/         # DOM parsing, Readability, Turndown conversion
│   ├── cache/          # Disk-based caching with SHA-256 keys
│   ├── utils/          # Logger, chunker utilities
│   ├── index.ts        # CLI entry point
│   ├── serve.ts        # MCP server entry point
│   └── serve-restart.ts # Auto-restart wrapper

Development

# Run in development mode
npm run dev fetch https://example.com

# Build for production
npm run build

# Run tests
npm test

# Type checking
npm run typecheck

# Linting
npm run lint

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

Troubleshooting

Cache Issues

npm run dev clear-cache

Timeout Errors

  • Increase timeout with -t flag
  • Check network connectivity
  • Verify URL is accessible

Content Not Extracted

  • Some sites block automated access
  • Try custom user agent with -u flag
  • Check if site requires JavaScript (not supported)

License

MIT
