Scrapling Fetch

Name: Scrapling Fetch
Rating: 4.9 (83 reviews)
Author: cyberchitta

Fetches web page content that's normally blocked by bot detection systems, allowing AI to access protected websites that would otherwise be inaccessible.

Enables AI to access text content from websites protected by bot detection mechanisms through three protection levels (basic, stealth, max-stealth), retrieving complete pages or specific content patterns without manual copying.

What it does

Fetch complete web pages bypassing bot detection
Extract specific content patterns with regex
Handle pagination automatically
Use three protection levels (basic, stealth, max-stealth)
Retrieve text and HTML content only

Best for

Accessing documentation on protected sites
Retrieving reference materials from bot-protected websites
Low-volume content retrieval for research
AI assistants needing access to blocked web content

Bypasses bot detection mechanisms
Three stealth protection levels
Optimized for documentation retrieval

About Scrapling Fetch

Scrapling Fetch is a community-built MCP server published by cyberchitta that provides AI assistants with tools and capabilities via the Model Context Protocol. It fetches web content at three protection levels, giving access to pages that bot detection would otherwise block. It is categorized under search web. This server exposes 2 tools that AI clients can invoke during conversations and coding sessions.

How to install

You can install Scrapling Fetch in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Scrapling Fetch is released under the Apache-2.0 license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

Tools (2)

s_fetch_page

Fetches a complete web page with pagination support. Retrieves content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest), then escalate to 'stealth' or 'max-stealth' only if basic mode fails. Content is returned as 'METADATA: {json}\n\n[content]', where the metadata includes length information and truncation status.

Args:
url: URL to fetch
mode: Fetching mode (basic, stealth, or max-stealth)
format: Output format (html or markdown)
max_length: Maximum number of characters to return
start_index: Return output starting at this character index; useful if a previous fetch was truncated and more content is required
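The 'METADATA: {json}\n\n[content]' response format described above can be split on the client side. The following is a minimal sketch, not the project's own code: the function name `split_response` and the metadata keys in the sample (`total_length`, `is_truncated`) are invented for illustration, since the listing does not document the actual key names.

```python
import json


def split_response(raw: str) -> tuple[dict, str]:
    """Split a METADATA-prefixed response into (metadata, content)."""
    prefix = "METADATA: "
    if not raw.startswith(prefix):
        raise ValueError("response does not start with 'METADATA: '")
    # raw_decode parses one JSON value and reports where it ended,
    # so the metadata may span several lines without breaking the split.
    metadata, end = json.JSONDecoder().raw_decode(raw, len(prefix))
    rest = raw[end:]
    # Drop the blank-line separator that precedes the content, if present.
    content = rest[2:] if rest.startswith("\n\n") else rest
    return metadata, content


# Hypothetical response; the key names are assumptions, not the real schema.
sample = 'METADATA: {"total_length": 12, "is_truncated": false}\n\nHello, world'
metadata, content = split_response(sample)
print(metadata, content)
```

Parsing the JSON first, rather than splitting on the first blank line, keeps the sketch working even if the page content itself contains blank lines.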

s_fetch_pattern

Extracts content matching regex patterns from web pages. Retrieves specific content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest), then escalate to 'stealth' or 'max-stealth' only if basic mode fails. Returns matched content as 'METADATA: {json}\n\n[content]', where the metadata includes match statistics and truncation information. Each matched content chunk is delimited with '॥๛॥' and prefixed with '[Position: start-end]' indicating its byte position in the original document, allowing targeted follow-up requests with s_fetch_page using specific start_index values.

Args:
url: URL to fetch
search_pattern: Regular expression pattern to search for in the content
mode: Fetching mode (basic, stealth, or max-stealth)
format: Output format (html or markdown)
max_length: Maximum number of characters to return
context_chars: Number of characters to include before and after each match
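As an illustration of the chunk format described above, the sketch below separates the content part of a response into (start, end, text) tuples. It assumes the '[Position: start-end]' prefix sits at the very start of each chunk and tolerates surrounding whitespace; the exact spacing the server emits is not documented here, and `parse_matches` is a hypothetical helper name.

```python
import re

DELIMITER = "॥๛॥"
POSITION = re.compile(r"\[Position: (\d+)-(\d+)\]")


def parse_matches(content: str) -> list[tuple[int, int, str]]:
    """Return (start, end, text) for each delimited chunk in the content."""
    results = []
    for chunk in content.split(DELIMITER):
        chunk = chunk.strip()
        if not chunk:
            continue
        found = POSITION.match(chunk)
        if found is None:
            continue  # skip chunks that lack a position prefix
        text = chunk[found.end():].strip()
        results.append((int(found.group(1)), int(found.group(2)), text))
    return results


# Hypothetical content section with two matches.
sample = (
    "[Position: 120-180]\nuse an API token"
    + DELIMITER
    + "[Position: 900-960]\nrotate tokens monthly"
)
matches = parse_matches(sample)
print(matches)
```

The start value of a match can then be passed as start_index to s_fetch_page to read the surrounding section.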

scrapling-fetch-mcp

An MCP server that helps AI assistants access text content from websites that implement bot detection, bridging the gap between what you can see in your browser and what the AI can access.

Intended Use

This tool is optimized for low-volume retrieval of documentation and reference materials (text/HTML only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.

Note: This project was developed in collaboration with Claude Sonnets 3.7 and 4.5, using LLM Context.

Installation

Requirements

Python 3.10+
uv package manager

Install

# Install scrapling-fetch-mcp
uv tool install scrapling-fetch-mcp

# Install browser binaries (REQUIRED - large downloads)
uvx --from scrapling-fetch-mcp scrapling install

Important: The browser installation downloads hundreds of MB of data and must complete before first use. If the MCP server times out on first use, the browsers may still be installing in the background. Wait a few minutes and try again.

Setup with Claude Desktop

Add this configuration to your Claude Desktop MCP settings:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "scrapling-fetch": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}

After updating the config, restart Claude Desktop.
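If the config file already lists other MCP servers, the new entry belongs alongside them under "mcpServers" rather than replacing the file. A minimal sketch of that merge, working on an in-memory dictionary; `add_server` and the "other-server" entry are hypothetical, and reading or writing the actual file is left out:

```python
import json


def add_server(config: dict) -> dict:
    """Return a copy of the config with the scrapling-fetch entry added."""
    updated = dict(config)
    # Copy the existing server map so the caller's config is not mutated.
    servers = dict(updated.get("mcpServers", {}))
    servers["scrapling-fetch"] = {"command": "uvx", "args": ["scrapling-fetch-mcp"]}
    updated["mcpServers"] = servers
    return updated


# Hypothetical existing config with another server already registered.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
merged = add_server(existing)
print(json.dumps(merged, indent=2))
```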

What It Does

This MCP server provides two tools that Claude can use automatically when you ask it to fetch web content:

Page fetching: Retrieves complete web pages with support for pagination
Pattern extraction: Finds and extracts specific content using regex patterns

The AI decides which tool to use based on your request. You just ask naturally:

"Can you fetch the docs at https://example.com/api"
"Find all mentions of 'authentication' on that page"
"Get me the installation instructions from their homepage"

Protection Modes

The tools support three levels of bot detection bypass:

basic: Fast (1-2s), works for most sites
stealth: Moderate (3-8s), handles more protection
max-stealth: Maximum (10+s), for heavily protected sites

Claude automatically starts with basic mode and escalates if needed.
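The escalation strategy can be sketched as a loop over the three modes, fastest first. This is an illustration of the idea only, not how the server or Claude implements it: `fetch_with_escalation` and `fake_fetch` are hypothetical, and the fetcher is a stand-in that simulates a site blocking basic mode.

```python
from typing import Callable

MODES = ("basic", "stealth", "max-stealth")


def fetch_with_escalation(url: str, fetch: Callable[[str, str], str]) -> tuple[str, str]:
    """Try each mode from fastest to slowest; return (mode, content) on first success."""
    last_error = None
    for mode in MODES:
        try:
            return mode, fetch(url, mode)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    raise RuntimeError(f"all modes failed for {url}") from last_error


# Stand-in fetcher for demonstration: pretends the site blocks 'basic' mode.
def fake_fetch(url: str, mode: str) -> str:
    if mode == "basic":
        raise ConnectionError("blocked")
    return f"content of {url} via {mode}"


mode_used, page = fetch_with_escalation("https://example.com/api", fake_fetch)
print(mode_used)
```

Trying the cheaper mode first means the slower modes are used only for sites that actually need them.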

Tips for Best Results

Just ask naturally - Claude handles the technical details
For large pages, Claude can page through content automatically
For specific searches, mention what you're looking for and Claude will use pattern matching
The metadata returned helps Claude decide whether to page or search
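Paging through a large page amounts to repeating the fetch with an advancing start_index until the metadata no longer reports truncation. A minimal sketch under stated assumptions: the `is_truncated` key is a hypothetical name (the metadata schema is not documented here), and `fake_fetch` stands in for a real s_fetch_page call.

```python
from typing import Callable


def fetch_all(fetch: Callable[[int, int], tuple[dict, str]], max_length: int) -> str:
    """Collect a full page by advancing start_index until nothing is truncated."""
    parts = []
    start_index = 0
    while True:
        metadata, content = fetch(start_index, max_length)
        parts.append(content)
        start_index += len(content)
        # 'is_truncated' is an assumed key name, not the server's documented schema.
        # The empty-content check guards against looping forever.
        if not metadata.get("is_truncated") or not content:
            break
    return "".join(parts)


DOCUMENT = "abcdefghij" * 3  # 30 characters standing in for a long page


def fake_fetch(start_index: int, max_length: int) -> tuple[dict, str]:
    """Stand-in for s_fetch_page: returns one slice plus truncation metadata."""
    chunk = DOCUMENT[start_index:start_index + max_length]
    truncated = start_index + len(chunk) < len(DOCUMENT)
    return {"is_truncated": truncated}, chunk


full_text = fetch_all(fake_fetch, max_length=8)
print(len(full_text))
```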

Limitations

Designed for text content only (documentation, articles, references)
Not for high-volume scraping or data harvesting
May not work with sites requiring authentication
Performance varies by site complexity and protection level

Built with Scrapling for web scraping with bot detection bypass.

License

Apache 2.0
