JinaAI

Name: JinaAI
Rating: 4.8 (36 reviews)
Author: spences10

Extracts and converts web content from URLs into clean, structured text that's optimized for LLM processing using Jina.ai's Reader API.

Extracts and processes web content for efficient parsing and analysis of online information

What it does

Extract text content from any URL
Convert web pages to LLM-friendly format
Preserve document structure during extraction
Process various content types including documentation

Best for

Analyzing web documentation and articles
Processing online content for AI workflows
Converting web pages for LLM analysis

Repository deprecated - use mcp-omnisearch instead
Requires Jina.ai API key

About JinaAI

JinaAI is a community-built MCP server published by spences10 that gives AI assistants a web page reading tool via the Model Context Protocol. It uses Jina.ai's Reader API to extract web page content and convert it to clean, structured text. It is categorized under search web.

How to install

You can install JinaAI in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

JinaAI is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

mcp-jinaai-reader

⚠️ Notice

This repository is no longer maintained.

The functionality of this tool is now available in mcp-omnisearch, which combines multiple MCP tools in one unified package.

Please use mcp-omnisearch instead.

A Model Context Protocol (MCP) server for integrating Jina.ai's Reader API with LLMs. This server provides efficient and comprehensive web content extraction capabilities, optimized for documentation and web content analysis.

Features

📚 Advanced web content extraction through Jina.ai Reader API
🚀 Fast and efficient content retrieval
📄 Complete text extraction with preserved structure
🔄 Clean format optimized for LLMs
🌐 Support for various content types including documentation
🏗️ Built on the Model Context Protocol

Configuration

This server requires configuration through your MCP client. Here are examples for different environments:

Cline Configuration

Add this to your Cline MCP settings:

{
	"mcpServers": {
		"jinaai-reader": {
			"command": "npx",
			"args": ["-y", "mcp-jinaai-reader"],
			"env": {
				"JINAAI_API_KEY": "your-jinaai-api-key"
			}
		}
	}
}

Claude Desktop with WSL Configuration

For WSL environments, add this to your Claude Desktop configuration:

{
	"mcpServers": {
		"jinaai-reader": {
			"command": "wsl.exe",
			"args": [
				"bash",
				"-c",
				"JINAAI_API_KEY=your-jinaai-api-key npx mcp-jinaai-reader"
			]
		}
	}
}
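
The two configurations above differ only in how the server is launched and how the API key reaches it. As a minimal sketch, the hypothetical helper below (`buildServerEntry` is not part of mcp-jinaai-reader) generates either entry from one API key. It assumes the package is started through npx, as the WSL example's command does.

```typescript
// Hypothetical helper for illustration only; not part of mcp-jinaai-reader.
interface ServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function buildServerEntry(apiKey: string, useWsl: boolean): ServerEntry {
  if (useWsl) {
    // WSL variant: the key is set inline in the bash command,
    // mirroring the Claude Desktop example above.
    return {
      command: "wsl.exe",
      args: ["bash", "-c", `JINAAI_API_KEY=${apiKey} npx mcp-jinaai-reader`],
    };
  }
  // Default variant: the key is passed through the "env" block.
  return {
    command: "npx",
    args: ["-y", "mcp-jinaai-reader"],
    env: { JINAAI_API_KEY: apiKey },
  };
}

// Serialize a complete "mcpServers" section ready to paste into client settings.
const config = {
  mcpServers: {
    "jinaai-reader": buildServerEntry("your-jinaai-api-key", false),
  },
};
const configJson: string = JSON.stringify(config, null, 2);
```

Passing `true` as the second argument produces the WSL form instead. In both cases the placeholder key must be replaced with a real Jina.ai API key.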

Environment Variables

The server requires the following environment variable:

JINAAI_API_KEY: Your Jina.ai API key (required)
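
Because the key is required, a server like this would typically refuse to start without it. The sketch below shows one way such a check could look. `requireApiKey` is a hypothetical name, and the real server's startup code may differ.

```typescript
// Hypothetical startup check; the actual server implementation may differ.
// The environment is passed in as a plain object so the function is easy to test.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = (env["JINAAI_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("JINAAI_API_KEY environment variable is required");
  }
  return key;
}
```

In a Node process this would be called as `requireApiKey(process.env)` before the server starts handling requests.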

API

The server implements a single MCP tool with configurable parameters:

read_url

Convert any URL to LLM-friendly text using Jina.ai Reader.

Parameters:

url (string, required): URL to process
no_cache (boolean, optional): Bypass cache for fresh results. Defaults to false
format (string, optional): Response format ("json" or "stream"). Defaults to "json"
timeout (number, optional): Maximum time in seconds to wait for webpage load
target_selector (string, optional): CSS selector to focus on specific elements
wait_for_selector (string, optional): CSS selector to wait for specific elements
remove_selector (string, optional): CSS selector to exclude specific elements
with_links_summary (boolean, optional): Gather all links at the end of response
with_images_summary (boolean, optional): Gather all images at the end of response
with_generated_alt (boolean, optional): Add alt text to images lacking captions
with_iframe (boolean, optional): Include iframe content in response
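
To make the parameter list concrete, here is a sketch of the argument shape and of the JSON-RPC request an MCP client would send to invoke the tool. The `ReadUrlArgs` interface is transcribed from the list above, and the `tools/call` envelope follows the Model Context Protocol. `buildReadUrlCall` is a hypothetical helper, and it fills in only the two defaults the list documents.

```typescript
// Argument shape transcribed from the parameter list above.
interface ReadUrlArgs {
  url: string;
  no_cache?: boolean;
  format?: "json" | "stream";
  timeout?: number; // seconds
  target_selector?: string;
  wait_for_selector?: string;
  remove_selector?: string;
  with_links_summary?: boolean;
  with_images_summary?: boolean;
  with_generated_alt?: boolean;
  with_iframe?: boolean;
}

// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "read_url"; arguments: ReadUrlArgs };
}

// Hypothetical helper (not part of the server): applies the documented
// defaults (no_cache = false, format = "json") and wraps the arguments.
function buildReadUrlCall(id: number, args: ReadUrlArgs): ToolCallRequest {
  const withDefaults: ReadUrlArgs = { no_cache: false, format: "json", ...args };
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read_url", arguments: withDefaults },
  };
}

// Example: fetch only the <main> element, skip the cache, list links at the end.
const request = buildReadUrlCall(1, {
  url: "https://example.com/docs",
  no_cache: true,
  target_selector: "main",
  with_links_summary: true,
});
```

In practice an MCP client library builds this envelope for you. The sketch only shows which fields end up on the wire.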

Development

Setup

Clone the repository
Install dependencies:

npm install

Build the project:

npm run build

Run in development mode:

npm run dev

Publishing

Update version in package.json
Build the project:

npm run build

Publish to npm:

npm publish

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Acknowledgments

Built on the Model Context Protocol
Powered by Jina.ai Reader API

Alternatives

Browser Use

browser-use

79.9k

Browser Use lets LLMs and agents access and scrape any website in real time via API.

Official, Popular

FireCrawl

firecrawl

5.7k

Integrate FireCrawl for advanced web scraping to extract clean, structured data from complex websites—fast, scalable, and reliable.

Official, Remote, Popular

Playwright

executeautomation

5.3k

Playwright automates web browsers, enabling you to scrape any website efficiently.

Community, Popular

Deep Research MCP

u14app

4.5k

Deep Research MCP — an AI research assistant and LLM research tool for multi-step web search, content analysis, and synthesis.

Community

Related Skills

google-official-seo-guide

Official Google SEO guide covering search optimization, best practices, Search Console, crawling, indexing, and improving website search visibility based on official Google documentation

core-web-vitals

Optimize Core Web Vitals (LCP, INP, CLS) for better page experience and search ranking. Use when asked to "improve Core Web Vitals", "fix LCP", "reduce CLS", "optimize INP", "page experience optimization", or "fix layout shifts".

ux-writing

Create user-centered, accessible interface copy (microcopy) for digital products including buttons, labels, error messages, notifications, forms, onboarding, empty states, success messages, and help text. Use when writing or editing any text that appears in apps, websites, or software interfaces, designing conversational flows, establishing voice and tone guidelines, auditing product content for consistency and usability, reviewing UI strings, or improving existing interface copy. Applies UX writing best practices based on four quality standards — purposeful, concise, conversational, and clear. Includes accessibility guidelines, research-backed benchmarks (sentence length, comprehension rates, reading levels), expanded error patterns, tone adaptation frameworks, and comprehensive reference materials.

web-search

This skill should be used when users need to search the web for information, find current content, look up news articles, search for images, or find videos. It uses DuckDuckGo's search API to return results in clean, formatted output (text, markdown, or JSON). Use for research, fact-checking, finding recent information, or gathering web resources.

browser-automation

Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Triggers include "browse", "navigate to", "go to website", "extract data from webpage", "screenshot", "web scraping", "fill out form", "click on", "search for on the web". When taking actions be as specific as possible.

zotero

Manage Zotero reference libraries via the Web API. Search, list, add items by DOI/ISBN/PMID (with duplicate detection), delete/trash items, update metadata and tags, export in BibTeX/RIS/CSL-JSON, batch-add from files, check PDF attachments, cross-reference citations, find missing DOIs via CrossRef, and fetch open-access PDFs. Supports --json output for scripting. Use when the user asks about academic references, citation management, literature libraries, PDFs for papers, bibliography export, or Zotero specifically.