GetWeb

Name: GetWeb
Rating: 4.4 (47 reviews)
Author: ivan-mezentsev

Searches the web using multiple search engines (DuckDuckGo, Google, Felo AI) and extracts/converts web content to markdown format. Includes caching and user agent rotation for reliable web scraping.

Integrates DuckDuckGo, Google Search, Felo AI, and Jina Reader APIs to provide web search, content extraction, and HTML-to-Markdown conversion with caching, user agent rotation, and configurable text filtering for reliable web research and information retrieval.


What it does

Search DuckDuckGo and Google with customizable result counts
Extract and convert web pages to markdown format
Filter and clean extracted text content
Cache search results to reduce API calls
Rotate user agents to avoid blocking
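The caching and user agent rotation listed above can be illustrated with a minimal sketch. This is not the server's implementation (the server itself is built on Rust crates); the class name `SearchCache`, the helper `cached_search`, the five-minute TTL, and the user agent strings are all hypothetical and chosen only for illustration.

```python
import itertools
import time

# Hypothetical user agent strings, for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
]


class SearchCache:
    """Tiny in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._entries[key]  # entry is too old, drop it
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock(), value)


def cached_search(cache, agents, query, fetch):
    """Return cached results, calling fetch(query, user_agent) only on a miss."""
    hit = cache.get(query)
    if hit is not None:
        return hit
    results = fetch(query, next(agents))  # each miss uses the next user agent
    cache.put(query, results)
    return results
```

A caller would create `agents = itertools.cycle(USER_AGENTS)` once and pass it to every `cached_search` call, so repeated queries are served from memory and consecutive upstream requests present different user agents.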

Best for

Researchers gathering information from multiple sources
Content creators needing web data in markdown format
Developers building search-powered applications
Anyone automating web research workflows

Multiple search engines in one server
Built-in caching and anti-blocking features
HTML-to-markdown conversion included

About GetWeb

GetWeb is a community-built MCP server published by ivan-mezentsev that provides AI assistants with tools and capabilities via the Model Context Protocol. GetWeb offers reliable web scraping and content extraction. Scrape any website with advanced internet scraping and filte… It is categorized under search web.

How to install

You can install GetWeb in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

GetWeb is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

MCP-GetWeb

A Model Context Protocol (MCP) server that provides web search and content extraction capabilities.

Quick Start

{
  "mcpServers": {
    "getweb": {
      "command": "npx",
      "args": [
        "mcp-getweb"
      ],
      "type": "stdio",
      "env": {
        "GOOGLE_API_KEY": "XXXXXXXXX",
        "GOOGLE_SEARCH_ENGINE_ID": "XXXXXXXXX",
        "JINA_API_KEY": "jina_XXXXXXXXX",
        "LINKUP_API_KEY": "XXXXXXXXX",
        "EXA_API_KEY": "XXXXXXXXX"
      }
    }
  }
}
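If you prefer to generate this configuration rather than paste keys by hand, a short script can assemble it from your shell environment. The sketch below is an illustration, not part of GetWeb: `build_config` is a hypothetical helper, and it assumes the server tolerates missing keys for tools you do not use, which the source does not confirm.

```python
import json
import os

# Environment variable names taken from the Quick Start config above.
KEY_NAMES = [
    "GOOGLE_API_KEY",
    "GOOGLE_SEARCH_ENGINE_ID",
    "JINA_API_KEY",
    "LINKUP_API_KEY",
    "EXA_API_KEY",
]


def build_config(environ=os.environ):
    """Build the mcpServers entry, passing through only the keys that are set."""
    env = {name: environ[name] for name in KEY_NAMES if environ.get(name)}
    return {
        "mcpServers": {
            "getweb": {
                "command": "npx",
                "args": ["mcp-getweb"],
                "type": "stdio",
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    # Print JSON that can be merged into an MCP client configuration file.
    print(json.dumps(build_config(), indent=2))
```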

Features

1) DuckDuckGo Search (`duckduckgo-search`)

Search the web using DuckDuckGo with HTML scraping.

Parameters:

query (string, required): The search query
page (integer, optional): Page number (default: 1, min: 1)
numResults (integer, optional): Number of results to return (default: 10, min: 1, max: 20)
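MCP clients invoke tools with a JSON-RPC `tools/call` message. As a rough sketch of what a call to this tool might look like on the wire, the hypothetical helper below (`ddg_request` is not part of GetWeb) checks the documented limits before building the message.

```python
import json


def ddg_request(query, page=1, num_results=10, request_id=1):
    """Build a JSON-RPC tools/call message for the duckduckgo-search tool."""
    if not query:
        raise ValueError("query is required")
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= num_results <= 20:
        raise ValueError("numResults must be between 1 and 20")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "duckduckgo-search",
            "arguments": {"query": query, "page": page, "numResults": num_results},
        },
    }


if __name__ == "__main__":
    # Serialize the request as it would be written to the server's stdin.
    print(json.dumps(ddg_request("model context protocol", num_results=5)))
```

In practice an MCP client library handles this framing for you; the sketch only makes the argument names and ranges concrete.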

2) Google Search (`google-search`)

Search Google and return relevant results using the Programmable Search Engine.

Parameters:

query (string, required): Search query; quotes enable exact matches
num_results (integer, optional): Total results to return (default: 5, max: 10)
site (string, optional): Restrict to a specific site/domain (e.g., wikipedia.org)
language (string, optional): ISO 639-1 language code (e.g., en, es)
dateRestrict (string, optional): Date filter, e.g., d7, w4, m6, y1
exactTerms (string, optional): Exact phrase that must appear
resultType (string, optional): Result type: image|images|news|video|videos
page (integer, optional): Page number for pagination (default: 1, min: 1)
resultsPerPage (integer, optional): Results per page (default: 5, max: 10)
sort (string, optional): Sort order, relevance (default) or date

Note: Requires GOOGLE_API_KEY and GOOGLE_SEARCH_ENGINE_ID to be set.
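Several of these parameters accept only specific values, so it can help to validate them before calling the tool. The following is a minimal sketch under stated assumptions: `google_args` is a hypothetical helper, the lower bound of 1 for `num_results` is assumed (the source gives only the default and maximum), and only a subset of the parameters is covered.

```python
import re

# d7 = past 7 days, w4 = past 4 weeks, m6 = past 6 months, y1 = past year.
DATE_RESTRICT = re.compile(r"[dwmy][1-9][0-9]*")
RESULT_TYPES = {"image", "images", "news", "video", "videos"}
SORT_ORDERS = {"relevance", "date"}


def google_args(query, num_results=5, site=None, date_restrict=None,
                result_type=None, sort=None):
    """Assemble google-search arguments, rejecting values outside the documented sets."""
    if not 1 <= num_results <= 10:  # lower bound of 1 is an assumption
        raise ValueError("num_results must be between 1 and 10")
    if date_restrict is not None and not DATE_RESTRICT.fullmatch(date_restrict):
        raise ValueError("dateRestrict must look like d7, w4, m6, or y1")
    if result_type is not None and result_type not in RESULT_TYPES:
        raise ValueError("unsupported resultType")
    if sort is not None and sort not in SORT_ORDERS:
        raise ValueError("sort must be 'relevance' or 'date'")
    args = {"query": query, "num_results": num_results}
    optional = {
        "site": site,
        "dateRestrict": date_restrict,
        "resultType": result_type,
        "sort": sort,
    }
    # Leave out anything the caller did not set so server defaults apply.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```

For example, `google_args('"exact phrase"', site="wikipedia.org", date_restrict="w4")` produces arguments for an exact-match search limited to one site and the past four weeks.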

3) Linkup Search (`linkup_search`)

Search the web via Linkup API and return relevant results in Markdown.

Parameters:

query (string, required): Natural-language search query
onlySearchTheseDomains (array of strings, optional): Restrict results to specific domains
dateFilter (object, optional): Date range filter
- fromDate (string, optional): Start date in YYYY-MM-DD
- toDate (string, optional): End date in YYYY-MM-DD
maxResults (integer, optional): Maximum number of results to return (default: 5, min: 1)

Note: Requires LINKUP_API_KEY to be set.
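The nested `dateFilter` object is easy to get wrong by hand. As an illustration, the hypothetical helper below (`linkup_args` is not part of GetWeb) builds the arguments from `datetime.date` values, whose `isoformat()` output matches the documented YYYY-MM-DD form. Rejecting a start date later than the end date is an added sanity check, not a documented server rule.

```python
from datetime import date


def linkup_args(query, domains=None, from_date=None, to_date=None, max_results=5):
    """Assemble linkup_search arguments with an optional nested dateFilter."""
    if max_results < 1:
        raise ValueError("maxResults must be >= 1")
    if from_date and to_date and from_date > to_date:
        raise ValueError("fromDate must not be after toDate")
    args = {"query": query, "maxResults": max_results}
    if domains:
        args["onlySearchTheseDomains"] = list(domains)
    date_filter = {}
    if from_date:
        date_filter["fromDate"] = from_date.isoformat()  # YYYY-MM-DD
    if to_date:
        date_filter["toDate"] = to_date.isoformat()
    if date_filter:
        args["dateFilter"] = date_filter
    return args


if __name__ == "__main__":
    print(linkup_args("rust async runtimes", domains=["arxiv.org"],
                      from_date=date(2024, 1, 1), to_date=date(2024, 6, 30)))
```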

4) Exa Search (`exa_search`)

Search the web via Exa API and return relevant results in Markdown.

Parameters:

query (string, required): Natural-language search query
maxResults (integer, optional): Number of results to return (default: 10, max: 25)
publishedDateRange (object, optional): Published date range filter
- fromDate (string, optional): Start date, RFC3339 (e.g., 2024-02-09T00:00:00.000Z) or YYYY-MM-DD
- toDate (string, optional): End date, RFC3339 (e.g., 2024-02-09T00:00:00.000Z) or YYYY-MM-DD
crawlDateRange (object, optional): Crawl date range filter
- fromDate (string, optional): Start date, RFC3339 or YYYY-MM-DD
- toDate (string, optional): End date, RFC3339 or YYYY-MM-DD
userLocation (string, optional): Two-letter ISO country code (e.g., US)
includeText (string, optional): Exact phrase that must appear in the webpage text (max 5 words)
excludeText (string, optional): Exact phrase that must not appear in the webpage text (max 5 words)
domain (string, optional): Restrict results to a single domain (e.g., arxiv.org)

Note: Requires EXA_API_KEY to be set.
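Two constraints here are worth checking client-side: the five-word limit on `includeText` and `excludeText`, and the RFC3339 date format. The sketch below shows one way to do that. The helper names (`rfc3339`, `check_phrase`, `exa_args`) are hypothetical, the lower bound of 1 for `maxResults` is assumed, and only the published date range is covered.

```python
from datetime import datetime, timezone


def rfc3339(moment):
    """Format a timezone-aware datetime like 2024-02-09T00:00:00.000Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def check_phrase(name, phrase):
    """Enforce the documented five-word limit on text filters."""
    if phrase is not None and len(phrase.split()) > 5:
        raise ValueError(f"{name} allows at most 5 words")
    return phrase


def exa_args(query, max_results=10, published_from=None, published_to=None,
             include_text=None, exclude_text=None, domain=None):
    """Assemble exa_search arguments from Python values."""
    if not 1 <= max_results <= 25:  # lower bound of 1 is an assumption
        raise ValueError("maxResults must be between 1 and 25")
    args = {"query": query, "maxResults": max_results}
    published = {}
    if published_from is not None:
        published["fromDate"] = rfc3339(published_from)
    if published_to is not None:
        published["toDate"] = rfc3339(published_to)
    if published:
        args["publishedDateRange"] = published
    optional = {
        "includeText": check_phrase("includeText", include_text),
        "excludeText": check_phrase("excludeText", exclude_text),
        "domain": domain,
    }
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```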

5) Felo AI Search (`felo-search`)

AI-powered search with contextual responses for up-to-date technical information (releases, advisories, migrations, benchmarks, community insights).

Parameters:

query (string, required): The search query or prompt
stream (boolean, optional): Whether to stream the response (default: false)

6) URL Content Fetcher (`fetch-url`)

Fetch the clean content of a URL and return it as text.

Parameters:

url (string, required): The URL to fetch
maxLength (integer, optional): Maximum content length (default: 30000, min: 1000, max: 500000)
extractMainContent (boolean, optional): Attempt to extract main content when HTML (default: true)
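Because `maxLength` has both a floor and a ceiling, a caller may want to clamp a requested length into range rather than risk a rejected call. A minimal sketch, assuming a hypothetical helper `fetch_url_args`; the restriction to absolute http(s) URLs is an assumption added for illustration, not a documented rule.

```python
from urllib.parse import urlsplit

MIN_LENGTH, MAX_LENGTH = 1000, 500000  # documented bounds for maxLength


def fetch_url_args(url, max_length=30000, extract_main_content=True):
    """Assemble fetch-url arguments, clamping maxLength into the documented range."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("expected an absolute http(s) URL")
    clamped = max(MIN_LENGTH, min(MAX_LENGTH, max_length))
    return {
        "url": url,
        "maxLength": clamped,
        "extractMainContent": extract_main_content,
    }
```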

7) URL Metadata Extractor (`url-metadata`)

Extract metadata (title, description, image, favicon) from a URL.

Parameters:

url (string, required): The URL to extract metadata from

8) URL Fetch to Markdown (`url-fetch`)

Fetch web pages and convert them to Markdown. Handles HTML, plaintext, and JSON (pretty-printed in a fenced block).

Parameters:

url (string, required): The URL to fetch and convert to Markdown
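To make the JSON handling described above concrete, here is a rough sketch of pretty-printing a JSON body inside a fenced Markdown block. It illustrates the described behavior only; `json_to_markdown`, the two-space indent, and the `json` info string on the fence are assumptions, not the server's actual output format.

```python
import json

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example


def json_to_markdown(body):
    """Pretty-print a JSON document inside a fenced Markdown block."""
    parsed = json.loads(body)
    pretty = json.dumps(parsed, indent=2, ensure_ascii=False)
    return f"{FENCE}json\n{pretty}\n{FENCE}"


if __name__ == "__main__":
    print(json_to_markdown('{"name": "mcp-getweb", "tools": 9}'))
```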

9) Jina Reader (`jina-reader`)

Retrieve LLM-friendly content from a URL using Jina r.reader with optional summaries and formats.

Parameters:

url (string, required): The URL to fetch and parse
maxLength (integer, optional): Maximum output length (default: 10000, min: 1000, max: 50000)
withLinksummary (boolean, optional): Include links summary (default: false)
withImagesSummary (boolean, optional): Include images summary (default: false)
withGeneratedAlt (boolean, optional): Generate alt text for images (default: false)
returnFormat (string, optional): markdown (default) | html | text | screenshot | pageshot
noCache (boolean, optional): Bypass cache (default: false)
timeout (integer, optional): Max seconds to wait (default: 10, min: 5, max: 30)

Note: Requires JINA_API_KEY to be set.
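With this many options, a small typed container can catch out-of-range values before the call is made. The dataclass below is a sketch under stated assumptions: `JinaReaderArgs` is hypothetical, its field names mirror the documented parameter names so that `asdict` yields the arguments directly, and the three summary and alt-text flags are left out for brevity.

```python
from dataclasses import asdict, dataclass

RETURN_FORMATS = ("markdown", "html", "text", "screenshot", "pageshot")


@dataclass
class JinaReaderArgs:
    """A subset of the jina-reader parameters, with the documented defaults."""

    url: str
    maxLength: int = 10000
    returnFormat: str = "markdown"
    noCache: bool = False
    timeout: int = 10

    def __post_init__(self):
        if not 1000 <= self.maxLength <= 50000:
            raise ValueError("maxLength must be between 1000 and 50000")
        if self.returnFormat not in RETURN_FORMATS:
            raise ValueError("unsupported returnFormat")
        if not 5 <= self.timeout <= 30:
            raise ValueError("timeout must be between 5 and 30 seconds")


if __name__ == "__main__":
    # asdict() gives a plain dict suitable for a tool call's arguments.
    print(asdict(JinaReaderArgs(url="https://example.com", returnFormat="text")))
```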

Acknowledgments

Model Context Protocol specification by Anthropic
DuckDuckGo for providing a privacy-focused web search experience
Google Programmable Search Engine and Custom Search JSON API
Linkup API for high-quality web search results
Exa API for fast neural web search
Jina AI r.reader API for high-quality content extraction
Felo AI for up-to-date, developer-focused search insights
Rust ecosystem and crates that power this server:
- tokio, reqwest, serde, serde_json, tracing, tracing-subscriber, clap
- html2text, chardetng, encoding_rs, scraper, html5ever, markup5ever_rcdom, regex, once_cell, futures, async-stream
- url, uuid, thiserror, tokio-util, rand, urlencoding
The broader MCP community for guidance, examples, and discussions

Support

If you encounter any issues or have questions, please open an issue on GitHub.

Alternatives

Browser Use

browser-use

79.9k

Browser Use lets LLMs and agents access and scrape any website in real time, making web scraping and web page scraping e…

Official, Popular

FireCrawl

firecrawl

5.7k

Integrate FireCrawl for advanced web scraping to extract clean, structured data from complex websites: fast, scalable, an…

Official, Remote, Popular

Playwright

executeautomation

5.3k

Playwright automates web browsers for web scraping, scraping, and internet scraping, enabling you to scrape any website…

Community, Popular

Deep Research MCP

u14app

4.5k

Deep Research MCP: an AI research assistant and LLM research tool for multi-step web search, content analysis, and synt…

Community

Related Skills

google-official-seo-guide

Official Google SEO guide covering search optimization, best practices, Search Console, crawling, indexing, and improving website search visibility based on official Google documentation

ux-writing

Create user-centered, accessible interface copy (microcopy) for digital products including buttons, labels, error messages, notifications, forms, onboarding, empty states, success messages, and help text. Use when writing or editing any text that appears in apps, websites, or software interfaces, designing conversational flows, establishing voice and tone guidelines, auditing product content for consistency and usability, reviewing UI strings, or improving existing interface copy. Applies UX writing best practices based on four quality standards — purposeful, concise, conversational, and clear. Includes accessibility guidelines, research-backed benchmarks (sentence length, comprehension rates, reading levels), expanded error patterns, tone adaptation frameworks, and comprehensive reference materials.

last30days

Research a topic from the last 30 days on Reddit + X + Web, become an expert, and write copy-paste-ready prompts for the user's target tool.

browser-automation

Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Triggers include "browse", "navigate to", "go to website", "extract data from webpage", "screenshot", "web scraping", "fill out form", "click on", "search for on the web". When taking actions be as specific as possible.

seo-optimizer

Search Engine Optimization specialist for content strategy, technical SEO, keyword research, and ranking improvements. Use when optimizing website content, improving search rankings, conducting keyword analysis, or implementing SEO best practices. Expert in on-page SEO, meta tags, schema markup, and Core Web Vitals.

web-research

Use this skill for requests related to web research; it provides a structured approach to conducting comprehensive web research
