
PDF Reader
Extracts text, metadata, and images from PDF files, supporting both local files and remote URLs. Processes multiple PDFs with parallel processing and natural reading order based on Y-coordinates.
Securely extracts text, metadata, and page information from PDF files within a project directory using pdfjs-dist for both local files and remote URLs.
What it does
- Extract text from PDF pages
- Read PDF metadata and document info
- Process remote PDFs via URLs
- Extract specific page ranges
- Retrieve images from PDFs
- Handle multiple PDFs simultaneously
About PDF Reader
PDF Reader is a community-built MCP server published by sylphxltd that provides AI assistants with tools and capabilities via the Model Context Protocol. It securely extracts text, metadata, and pages from local and remote PDF files using pdfjs-dist. It is categorized under file systems and analytics data. The server exposes one tool that AI clients can invoke during conversations and coding sessions.
How to install
You can install PDF Reader in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
License
PDF Reader is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.
Tools (1)
read_pdf: Reads content and metadata from one or more PDFs (local path or URL). Each source can specify which pages to extract.
@sylphx/pdf-reader-mcp
Production-ready PDF processing server for AI agents
5-10x faster parallel processing • Y-coordinate content ordering • 94%+ test coverage • 103 tests passing
Overview
PDF Reader MCP is a production-ready Model Context Protocol server that empowers AI agents with enterprise-grade PDF processing capabilities. Extract text, images, and metadata with unmatched performance and reliability.
The Problem:
// Traditional PDF processing
- Sequential page processing (slow)
- No natural content ordering
- Complex path handling
- Poor error isolation
The Solution:
// PDF Reader MCP
- 5-10x faster parallel processing
- Y-coordinate based ordering
- Flexible path support (absolute/relative)
- Per-page error resilience
- 94%+ test coverage
Result: Production-ready PDF processing that scales.
Key Features
Performance
- 5-10x faster than sequential processing, with automatic parallelization
- 12,933 ops/sec error handling, 5,575 ops/sec text extraction
- Process 50-page PDFs in seconds with multi-core utilization
- Lightweight with minimal dependencies
Developer Experience
- Path Flexibility - Absolute & relative paths, Windows/Unix support (v1.3.0)
- Smart Ordering - Y-coordinate based content ordering preserves document layout
- Type Safe - Full TypeScript with strict mode enabled
- Battle-tested - 103 tests, 94%+ coverage, 98%+ function coverage
- Simple API - Single tool handles all operations
Performance Benchmarks
Real-world performance from production testing:
| Operation | Ops/sec | Performance | Use Case |
|---|---|---|---|
| Error handling | 12,933 | ⚡⚡⚡⚡⚡ | Validation & safety |
| Extract full text | 5,575 | ⚡⚡⚡⚡ | Document analysis |
| Extract page | 5,329 | ⚡⚡⚡⚡ | Single page ops |
| Multiple pages | 5,242 | ⚡⚡⚡⚡ | Batch processing |
| Metadata only | 4,912 | ⚡⚡⚡ | Quick inspection |
Parallel Processing Speedup
| Document | Sequential | Parallel | Speedup |
|---|---|---|---|
| 10-page PDF | ~2s | ~0.3s | 5-8x faster |
| 50-page PDF | ~10s | ~1s | 10x faster |
| 100+ pages | ~20s | ~2s | Linear scaling with CPU cores |
Benchmarks vary based on PDF complexity and system resources.
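The figures in the two tables can be cross-checked with simple arithmetic. The sketch below is only an illustration: the helper names `speedup` and `microsecondsPerOp` are hypothetical, and the inputs are the approximate values quoted above, not new measurements.

```typescript
// Speedup = sequential time / parallel time (both in seconds).
function speedup(sequentialSeconds: number, parallelSeconds: number): number {
  if (parallelSeconds <= 0) {
    throw new RangeError("parallelSeconds must be positive");
  }
  return sequentialSeconds / parallelSeconds;
}

// Convert a throughput figure (operations per second) into the
// mean time per operation, in microseconds.
function microsecondsPerOp(opsPerSecond: number): number {
  if (opsPerSecond <= 0) {
    throw new RangeError("opsPerSecond must be positive");
  }
  return 1e6 / opsPerSecond;
}

// Approximate timings taken from the speedup table.
const tenPageSpeedup = speedup(2, 0.3); // roughly 6.7, inside the quoted 5-8x range
const fiftyPageSpeedup = speedup(10, 1); // 10

// Throughput taken from the benchmark table.
const textExtractionMicros = microsecondsPerOp(5575); // roughly 179 microseconds per op

console.log(tenPageSpeedup.toFixed(1), fiftyPageSpeedup, Math.round(textExtractionMicros));
```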
Installation
Claude Code
claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp
Claude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Config file locations
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
VS Code
code --add-mcp '{"name":"pdf-reader","command":"npx","args":["@sylphx/pdf-reader-mcp"]}'
Cursor
- Open Settings → MCP → Add new MCP Server
- Select Command type
- Enter: npx @sylphx/pdf-reader-mcp
Windsurf
Add to your Windsurf MCP config:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Cline
Add to Cline's MCP settings:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Warp
- Go to Settings → AI → Manage MCP Servers → Add
- Command: npx, Args: @sylphx/pdf-reader-mcp
Smithery (One-click)
npx -y @smithery/cli install @sylphx/pdf-reader-mcp --client claude
Manual Installation
# Quick start - zero installation
npx @sylphx/pdf-reader-mcp
# Or install globally
npm install -g @sylphx/pdf-reader-mcp
Quick Start
Basic Usage
{
"sources": [{
"path": "documents/report.pdf"
}],
"include_full_text": true,
"include_metadata": true,
"include_page_count": true
}
Result:
- Full text content extracted
- PDF metadata (author, title, dates)
- Total page count
- Structural sharing - unchanged parts preserved
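The JSON in Basic Usage is the argument object for the server's single tool. When an MCP client invokes a tool, those arguments travel inside a JSON-RPC `tools/call` request. The sketch below builds such a request; the helper name `buildReadPdfCall` and the interface names are illustrative and not part of the package, and the request id is arbitrary. The tool name `read_pdf` comes from the API Reference in this README.

```typescript
// Illustrative types mirroring the arguments shown in Basic Usage.
interface SourceArg {
  path?: string;
  url?: string;
  pages?: string | number[];
}

interface ReadPdfArgs {
  sources: SourceArg[];
  include_full_text?: boolean;
  include_metadata?: boolean;
  include_page_count?: boolean;
  include_images?: boolean;
}

// Wrap the tool arguments in an MCP "tools/call" JSON-RPC 2.0 request.
function buildReadPdfCall(id: number, args: ReadPdfArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read_pdf", arguments: args },
  };
}

const basicUsageCall = buildReadPdfCall(1, {
  sources: [{ path: "documents/report.pdf" }],
  include_full_text: true,
  include_metadata: true,
  include_page_count: true,
});

console.log(JSON.stringify(basicUsageCall, null, 2));
```

Most MCP clients build this envelope for you, so the sketch is mainly useful for debugging or for writing a custom client.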
Extract Specific Pages
{
"sources": [{
"path": "documents/manual.pdf",
"pages": "1-5,10,15-20"
}],
"include_full_text": true
}
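To make the range syntax concrete, here is a minimal sketch of how a string such as "1-5,10,15-20" could expand into individual page numbers. It assumes pages are 1-based and that duplicates collapse; `parsePageRanges` is a hypothetical helper, and the server's own parser may treat edge cases differently.

```typescript
// Expand a page specification like "1-5,10,15-20" into a sorted,
// de-duplicated list of 1-based page numbers.
function parsePageRanges(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (match === null) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    const start = Number(match[1]);
    // A single page such as "10" is treated as the range 10-10.
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    for (let page = start; page <= end; page++) {
      pages.add(page);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}

const manualPages = parsePageRanges("1-5,10,15-20");
console.log(manualPages.join(","));
```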
Absolute Paths (v1.3.0+)
// Windows - backslash and forward-slash paths both work
{
"sources": [{
"path": "C:\\Users\\John\\Documents\\report.pdf"
}],
"include_full_text": true
}
// Unix/Mac
{
"sources": [{
"path": "/home/user/documents/contract.pdf"
}],
"include_full_text": true
}
No more "Absolute paths are not allowed" errors!
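A client that wants to know which kind of path it is about to send can classify it with a few string checks. This is a rough sketch under stated assumptions: `isAbsolutePath` is a hypothetical helper, it covers only the Windows drive-letter and Unix forms shown in this README, and it does not reflect how the server resolves paths internally.

```typescript
// Return true for the absolute path forms shown in this README:
//   - Unix/Mac paths starting with "/"
//   - Windows drive paths such as "C:\..." or "C:/..."
// Anything else is treated as relative.
function isAbsolutePath(filePath: string): boolean {
  if (filePath.startsWith("/")) {
    return true;
  }
  return /^[A-Za-z]:[\\/]/.test(filePath);
}

const examples = [
  "C:\\Users\\John\\Documents\\report.pdf",
  "C:/Users/John/Documents/report.pdf",
  "/home/user/documents/contract.pdf",
  "documents/report.pdf",
];

for (const example of examples) {
  console.log(example, isAbsolutePath(example) ? "absolute" : "relative");
}
```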
Extract Images with Natural Ordering
{
"sources": [{
"path": "presentation.pdf",
"pages": [1, 2, 3]
}],
"include_images": true,
"include_full_text": true
}
Response includes:
- Text and images in exact document order (Y-coordinate sorted)
- Base64-encoded images with metadata (width, height, format)
- Natural reading flow preserved for AI comprehension
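The idea behind Y-coordinate ordering can be sketched as follows. The code assumes the usual PDF coordinate system (origin at the bottom-left, so a larger Y is higher on the page) and a small tolerance for grouping items into the same line. The `PositionedItem` shape, the `orderForReading` name, and the tolerance value are illustrative assumptions, not the server's actual implementation.

```typescript
interface PositionedItem {
  kind: "text" | "image";
  content: string;
  x: number;
  y: number;
}

interface Line {
  y: number; // Y of the first item placed on this line
  items: PositionedItem[];
}

// Order items top-to-bottom, then left-to-right within each line.
function orderForReading(items: PositionedItem[], lineTolerance = 2): PositionedItem[] {
  // Larger Y first: the top of the page comes before the bottom.
  const topToBottom = items.slice().sort((a, b) => b.y - a.y);

  // Group items whose Y values are within the tolerance into one line.
  const lines: Line[] = [];
  for (const item of topToBottom) {
    const last = lines[lines.length - 1];
    if (last !== undefined && Math.abs(last.y - item.y) <= lineTolerance) {
      last.items.push(item);
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }

  // Within a line, read left to right.
  const ordered: PositionedItem[] = [];
  for (const line of lines) {
    line.items.sort((a, b) => a.x - b.x);
    for (const item of line.items) ordered.push(item);
  }
  return ordered;
}

const sampleItems: PositionedItem[] = [
  { kind: "text", content: "Caption", x: 10, y: 480 },
  { kind: "text", content: "World", x: 60, y: 700 },
  { kind: "image", content: "fig1", x: 10, y: 500 },
  { kind: "text", content: "Hello", x: 10, y: 701 },
];

const orderedContent = orderForReading(sampleItems).map((item) => item.content);
console.log(orderedContent.join(" | ")); // Hello | World | fig1 | Caption
```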
Batch Processing
{
"sources": [
{ "path": "C:\\Reports\\Q1.pdf", "pages": "1-10" },
{ "path": "/home/user/Q2.pdf", "pages": "1-10" },
{ "url": "https://example.com/Q3.pdf" }
],
"include_full_text": true
}
All PDFs are processed in parallel automatically!
Features
Core Capabilities
- Text Extraction - Full document or specific pages with intelligent parsing
- Image Extraction - Base64-encoded with complete metadata (width, height, format)
- Content Ordering - Y-coordinate based layout preservation for natural reading flow
- Metadata Extraction - Author, title, creation date, and custom properties
- Page Counting - Fast enumeration without loading full content
- Dual Sources - Local files (absolute or relative paths) and HTTP/HTTPS URLs
- Batch Processing - Multiple PDFs processed concurrently
Advanced Features
- 5-10x Performance - Parallel page processing with Promise.all
- Smart Pagination - Extract ranges like "1-5,10-15,20"
- Multi-Format Images - RGB, RGBA, Grayscale with automatic detection
- Path Flexibility - Windows, Unix, and relative paths all supported (v1.3.0)
- Error Resilience - Per-page error isolation with detailed messages
- Large File Support - Efficient streaming and memory management
- Type Safe - Full TypeScript with strict mode enabled
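Two of the features above, page processing with Promise.all and per-page error isolation, combine naturally. The sketch below shows one way to do it: each page task catches its own error, so one bad page does not reject the whole batch. `extractPages`, `PageResult`, and the fake extractor are hypothetical names used for illustration; the server's real extraction code is not shown in this README.

```typescript
type PageResult =
  | { page: number; ok: true; text: string }
  | { page: number; ok: false; error: string };

// Start every page task at once and wait for all of them.
// Each task handles its own failure, so Promise.all never rejects here.
async function extractPages(
  pages: number[],
  extractPage: (page: number) => Promise<string>,
): Promise<PageResult[]> {
  return Promise.all(
    pages.map(async (page): Promise<PageResult> => {
      try {
        const text = await extractPage(page);
        return { page, ok: true, text };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { page, ok: false, error: message };
      }
    }),
  );
}

// Stand-in extractor for the example: page 2 always fails.
const fakeExtractPage = async (page: number): Promise<string> => {
  if (page === 2) {
    throw new Error("corrupt page");
  }
  return `text of page ${page}`;
};

extractPages([1, 2, 3], fakeExtractPage).then((results) => {
  for (const result of results) {
    console.log(result.page, result.ok ? result.text : `failed: ${result.error}`);
  }
});
```

Results come back in the same order as the input pages, because Promise.all preserves the order of the promises it is given.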
What's New in v1.3.0
Absolute Paths Now Supported!
// Windows
{ "path": "C:\\Users\\John\\Documents\\report.pdf" }
{ "path": "C:/Users/John/Documents/report.pdf" }
// Unix/Mac
{ "path": "/home/john/documents/report.pdf" }
{ "path": "/Users/john/Documents/report.pdf" }
// Relative (still works)
{ "path": "documents/report.pdf" }
Other Improvements:
- Fixed Zod validation error handling
- Updated all dependencies to latest versions
- 103 tests passing, 94%+ coverage maintained
Full Changelog
v1.2.0 - Content Ordering
- Y-coordinate based text and image ordering
- Natural reading flow for AI models
- Intelligent line grouping
v1.1.0 - Image Extraction & Performance
- Base64-encoded image extraction
- 10x speedup with parallel processing
- Comprehensive test coverage (94%+)
API Reference
read_pdf Tool
The single tool that handles all PDF operations.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| sources | Array | List of PDF sources to process | Required |
| include_full_text | boolean | Extract full text content | false |
| include_metadata | boolean | Extract PDF metadata | true |
| include_page_count | boolean | Include total page count | true |
| include_images | boolean | Extract embedded images | false |
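The defaults in the table determine what a minimal request returns. As an illustration, the sketch below fills in the documented defaults for any flag the caller omits. `ReadPdfFlags`, `READ_PDF_DEFAULTS`, and `withDefaults` are hypothetical names, and how the server applies defaults internally is an assumption.

```typescript
interface ReadPdfFlags {
  include_full_text?: boolean;
  include_metadata?: boolean;
  include_page_count?: boolean;
  include_images?: boolean;
}

// Default values copied from the Parameters table.
const READ_PDF_DEFAULTS: Required<ReadPdfFlags> = {
  include_full_text: false,
  include_metadata: true,
  include_page_count: true,
  include_images: false,
};

// Resolve the effective flags for a request, keeping explicit values.
function withDefaults(flags: ReadPdfFlags): Required<ReadPdfFlags> {
  return {
    include_full_text: flags.include_full_text ?? READ_PDF_DEFAULTS.include_full_text,
    include_metadata: flags.include_metadata ?? READ_PDF_DEFAULTS.include_metadata,
    include_page_count: flags.include_page_count ?? READ_PDF_DEFAULTS.include_page_count,
    include_images: flags.include_images ?? READ_PDF_DEFAULTS.include_images,
  };
}

// A request that only asks for text still gets metadata and page count.
const effectiveFlags = withDefaults({ include_full_text: true });
console.log(effectiveFlags);
```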
Source Object
{
path?: string; // Local file path (absolute or relative)
url?: string; // HTTP/HTTPS URL to PDF
pages?: string | number[]; // Pages to extract: "1-5,10" or [1,2,3]
}
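Because every field of the source object is optional, a client may want to check a source before sending it. The sketch below assumes each source should carry exactly one of `path` or `url`, that URLs must be HTTP or HTTPS, and that page numbers are positive integers. These rules and the `validateSource` helper are assumptions for illustration, not the server's documented validation.

```typescript
interface PdfSource {
  path?: string;
  url?: string;
  pages?: string | number[];
}

// Return a list of problems; an empty list means the source looks usable.
function validateSource(source: PdfSource): string[] {
  const errors: string[] = [];
  const hasPath = typeof source.path === "string" && source.path.length > 0;
  const hasUrl = typeof source.url === "string" && source.url.length > 0;

  // Assumption: exactly one of path or url must be present.
  if (hasPath === hasUrl) {
    errors.push("Provide exactly one of 'path' or 'url'.");
  }
  if (hasUrl && !/^https?:\/\//i.test(source.url as string)) {
    errors.push("'url' must start with http:// or https://.");
  }
  if (
    Array.isArray(source.pages) &&
    source.pages.some((page) => !Number.isInteger(page) || page < 1)
  ) {
    errors.push("'pages' entries must be positive integers.");
  }
  return errors;
}

console.log(validateSource({ path: "documents/report.pdf", pages: "1-5,10" })); // []
console.log(validateSource({ url: "ftp://example.com/Q3.pdf" }));
```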
Examples
Metadata only (fast):
{
"sources": [{ "path": "large.pdf" }],
"include_metadata": true,
"include_page_count": true,
"include_full_text": false
}
From URL:
{
"sources": [{
"url": "https://arxiv.org/pdf/2301.00001.pdf"
}],
"include_full_text": true
}
Page ranges:
{
---
*README truncated. [View full README on GitHub](https://github.com/sylphxltd/pdf-reader-mcp).*