Firecrawl

Official MCP server by mendableai

Integrates with Firecrawl to scrape and extract structured data from complex websites using cloud browser sessions. Handles dynamic content and JavaScript rendering, and provides automatic retries with rate limiting.

Unlock powerful web data extraction with Firecrawl, turning any website into clean markdown or structured data. Firecrawl lets you crawl all accessible pages, scrape content in multiple formats, and extract structured data using AI-driven prompts and schemas. Its advanced features handle dynamic content, proxies, anti-bot measures, and media parsing, ensuring reliable and customizable data output. Whether mapping site URLs or batch scraping thousands of pages asynchronously, Firecrawl streamlines data gathering for AI applications, research, or automation with simple API calls and SDK support across multiple languages. Empower your projects with high-quality, LLM-ready web data.


What it does

  • Scrape dynamic websites with JavaScript rendering
  • Crawl entire websites for content discovery
  • Extract structured data from web pages
  • Perform batch scraping operations
  • Search and filter web content
  • Run automated browser sessions

Best for

  • Data scientists gathering web datasets
  • Researchers collecting information from multiple sites
  • Developers building web scraping pipelines
  • Content creators aggregating online information

Highlights: cloud browser automation, handles JavaScript-heavy sites, built-in rate limiting.

About Firecrawl

Firecrawl is an official MCP server published by mendableai that provides AI assistants with tools and capabilities via the Model Context Protocol. Unlock AI-ready web data with Firecrawl: scrape any website, handle dynamic content, and automate web scraping for research. It is categorized under browser automation.

How to install

You can install Firecrawl in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
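For JSON-based MCP clients, a stdio configuration typically looks like the sketch below. The `firecrawl-mcp` npm package name and the `FIRECRAWL_API_KEY` environment variable come from the upstream project; the exact entry your client needs may differ from this, so treat it as illustrative and prefer the install panel's generated config:

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"
      }
    }
  }
}
```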

License

Firecrawl is released under the AGPL-3.0 license.



🔥 Firecrawl

Turn websites into LLM-ready data.

Firecrawl is an API that scrapes, crawls, and extracts structured data from any website, powering AI agents and apps with real-time context from the web.

Looking for our MCP? Check out the repo here.

This repository is in development, and we're still integrating custom modules into the mono repo. It's not fully ready for self-hosted deployment yet, but you can run it locally.

Psst. Hey, you, join our stargazers :)

Why Firecrawl?

  • LLM-ready output: Clean markdown, structured JSON, screenshots, HTML, and more
  • Industry-leading reliability: >80% coverage on benchmark evaluations, outperforming every other provider tested
  • Handles the hard stuff: Proxies, JavaScript rendering, and dynamic content that breaks other scrapers
  • Customization: Exclude tags, crawl behind auth walls, max depth, and more
  • Media parsing: Automatic text extraction from PDFs, DOCX, and images
  • Actions: Click, scroll, input, wait, and more before extracting
  • Batch processing: Scrape thousands of URLs asynchronously
  • Change tracking: Monitor website content changes over time

Quick Start

Sign up at firecrawl.dev to get your API key and start extracting data in seconds. Try the playground to test it out.

Make Your First API Request

curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com"}'

Response:

{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
    "metadata": {
      "title": "Example Domain",
      "sourceURL": "https://example.com"
    }
  }
}
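The curl call above can be reproduced with any HTTP client. The sketch below uses hypothetical helper names (`build_request`, `extract_markdown`) of my own; the endpoint, bearer-token header, and response shape mirror the example above:

```python
import json
from typing import Optional

# Endpoint as shown in the curl example above.
API_URL = "https://api.firecrawl.dev/v2/scrape"

def build_request(url: str, api_key: str) -> dict:
    """Assemble the headers and JSON body for a /v2/scrape call."""
    return {
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"url": url}),
    }

def extract_markdown(response: dict) -> Optional[str]:
    """Pull the markdown out of a successful scrape response, else None."""
    if not response.get("success"):
        return None
    return response.get("data", {}).get("markdown")

# Parsing the sample response shown above:
sample = {
    "success": True,
    "data": {
        "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
        "metadata": {"title": "Example Domain", "sourceURL": "https://example.com"},
    },
}
print(extract_markdown(sample))
```

Checking `success` before reaching into `data` keeps the helper safe on error responses, which omit the payload.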

Feature Overview

Feature  Description
Scrape   Convert any URL to markdown, HTML, screenshots, or structured JSON
Search   Search the web and get full page content from results
Map      Discover all URLs on a website instantly
Crawl    Scrape all URLs of a website with a single request
Agent    Automated data gathering: just describe what you need

Scrape

Convert any URL to clean markdown, HTML, or structured data.

curl -X POST 'https://api.firecrawl.dev/v2/scrape' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://docs.firecrawl.dev",
    "formats": ["markdown", "html"]
  }'

Response:

{
  "success": true,
  "data": {
    "markdown": "# Firecrawl Docs\n\nTurn websites into LLM-ready data...",
    "html": "<!DOCTYPE html><html>...",
    "metadata": {
      "title": "Quickstart | Firecrawl",
      "description": "Firecrawl allows you to turn entire websites into LLM-ready markdown",
      "sourceURL": "https://docs.firecrawl.dev",
      "statusCode": 200
    }
  }
}

Extract Structured Data (JSON Mode)

Extract structured data using a schema:

from firecrawl import Firecrawl
from pydantic import BaseModel

app = Firecrawl(api_key="fc-YOUR_API_KEY")

class CompanyInfo(BaseModel):
    company_mission: str
    is_open_source: bool
    is_in_yc: bool

result = app.scrape(
    'https://firecrawl.dev',
    formats=[{"type": "json", "schema": CompanyInfo.model_json_schema()}]
)

print(result.json)

Output:

{"company_mission": "Turn websites into LLM-ready data", "is_open_source": true, "is_in_yc": true}

Or extract with just a prompt (no schema):

result = app.scrape(
    'https://firecrawl.dev',
    formats=[{"type": "json", "prompt": "Extract the company mission"}]
)

Scrape Formats

Available formats: markdown, html, rawHtml, screenshot, links, json, branding

Get a screenshot

doc = app.scrape("https://firecrawl.dev", formats=["screenshot"])
print(doc.screenshot)  # Base64 encoded image

Extract brand identity (colors, fonts, typography)

doc = app.scrape("https://firecrawl.dev", formats=["branding"])
print(doc.branding)  # {"colors": {...}, "fonts": [...], "typography": {...}}

Actions (Interact Before Scraping)

Click, type, scroll, and more before extracting:

doc = app.scrape(
    url="https://example.com/login",
    formats=["markdown"],
    actions=[
        {"type": "write", "text": "[email protected]"},
        {"type": "press", "key": "Tab"},
        {"type": "write", "text": "password"},
        {"type": "click", "selector": 'button[type="submit"]'},
        {"type": "wait", "milliseconds": 2000},
        {"type": "screenshot"}
    ]
)

Search

Search the web and optionally scrape the results.

curl -X POST 'https://api.firecrawl.dev/v2/search' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "firecrawl web scraping",
    "limit": 5
  }'

Response:

{
  "success": true,
  "data": {
    "web": [
      {
        "url": "https://www.firecrawl.dev/",
        "title": "Firecrawl - The Web Data API for AI",
        "description": "The web crawling, scraping, and search API for AI.",
        "position": 1
      }
    ],
    "images": [...],
    "news": [...]
  }
}
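The search response nests results under data.web, each carrying a position rank. A small helper (the function name is mine; the response shape is as shown above) for pulling out the top-ranked URLs:

```python
def top_result_urls(response: dict, limit: int = 3) -> list:
    """Return result URLs from a /v2/search response, ordered by position."""
    if not response.get("success"):
        return []
    web = response.get("data", {}).get("web", [])
    ranked = sorted(web, key=lambda r: r.get("position", 0))
    return [r["url"] for r in ranked[:limit]]

# Applied to the sample response shown above:
sample = {
    "success": True,
    "data": {
        "web": [
            {
                "url": "https://www.firecrawl.dev/",
                "title": "Firecrawl - The Web Data API for AI",
                "description": "The web crawling, scraping, and search API for AI.",
                "position": 1,
            }
        ]
    },
}
print(top_result_urls(sample))
```

Sorting by position rather than trusting list order makes the helper robust if the API ever returns results out of rank order.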

Search with Content Scraping

Get the full content of search results:

from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_API_KEY")

results = firecrawl.search(
    "firecrawl web scraping",
    limit=3,
    scrape_options={
        "formats": ["markdown", "links"]
    }
)

Agent

The easiest way to get data from the web. Describe what you need, and our AI agent searches, navigates, and extracts it. No URLs required.

Agent is the evolution of our /extract endpoint: faster, more reliable, and doesn't require you to know the URLs upfront.

curl -X POST 'https://api.firecrawl.dev/v2/agent' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Find the pricing plans for Notion"
  }'

Response:

{
  "success": true,
  "data": {
    "result": "Notion offers the following pricing plans:\n\n1. Free - $0/month...\n2. Plus - $10/seat/month...\n3. Business - $18/seat/month...",
    "sources": ["https://www.notion.so/pricing"]
  }
}

Agent with Structured Output

Use a schema to get structured data:

from firecrawl import Firecrawl
from pydantic import BaseModel, Field
from typing import List, Optional

app = Firecrawl(api_key="fc-YOUR_API_KEY")

class Founder(BaseModel):
    name: str = Field(description="Full name of the founder")
    role: Optional[str] = Field(None, description="Role or position")

class FoundersSchema(BaseModel):
    founders: List[Founder] = Field(description="List of founders")

result = app.agent(
    prompt="Find the founders of Firecrawl",
    schema=FoundersSchema
)

print(result.data)

Output:

{
  "founders": [
    {"name": "Eric Ciarla", "role": "Co-founder"},
    {"name": "Nicolas Camara", "role": "Co-founder"},
    {"name": "Caleb Peffer", "role": "Co-founder"}
  ]
}

Agent with URLs (Optional)

Focus the agent on specific pages:

result = app.agent(
    urls=["https://docs.firecrawl.dev", "https://firecrawl.dev/pricing"],
    prompt="Compare the features and pricing information"
)

Model Selection

Choose between two models based on your needs:

Model                   Cost         Best For
spark-1-mini (default)  60% cheaper  Most tasks
spark-1-pro             Standard     Complex research, critical extraction

result = app.agent(
    prompt="Compare enterprise features across Firecrawl, Apify, and ScrapingBee",
    model="spark-1-pro"
)

When to use Pro:

  • Comparing data across multiple websites
  • Extracting from sites with complex navigation or auth
  • Research tasks where the agent needs to explore multiple paths
  • Critical data where accuracy is paramount

Learn more about Spark models in our [Agen


README truncated. View full README on GitHub.

Alternatives

Related Skills

  • notebooklm: Query Google NotebookLM for source-grounded, citation-backed answers from uploaded documents. Reduces hallucinations through Gemini's document-only responses. Browser automation with library management and persistent authentication.
  • dev-browser: Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
  • chrome-devtools: Browser automation, debugging, and performance analysis using Puppeteer CLI scripts. Use for automating browsers, taking screenshots, analyzing performance, monitoring network traffic, web scraping, form automation, and JavaScript debugging.
  • qa-tester: Browser automation QA testing skill. Systematically tests web applications for functionality, security, and usability issues. Reports findings by severity (CRITICAL/HIGH/MEDIUM/LOW) with immediate alerts for critical failures.
  • browser-automation: Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Triggers include "browse", "navigate to", "go to website", "extract data from webpage", "screenshot", "web scraping", "fill out form", "click on", "search for on the web". When taking actions be as specific as possible.
  • playwright-browser-automation: Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing.