firecrawl-scraper

Scrape and extract web content, convert HTML to markdown, and bypass bot protection for dynamic sites using Firecrawl API.

Install

mkdir -p .claude/skills/firecrawl-scraper && curl -L -o skill.zip "https://mcp.directory/api/skills/download/310" && unzip -o skill.zip -d .claude/skills/firecrawl-scraper && rm skill.zip

Installs to .claude/skills/firecrawl-scraper

About this skill

Firecrawl Web Scraper Skill

Status: Production Ready ✅
Last Updated: 2025-10-24
Official Docs: https://docs.firecrawl.dev
API Version: v2

What is Firecrawl?

Firecrawl is a Web Data API for AI that turns entire websites into LLM-ready markdown or structured data. It handles:

JavaScript rendering - Executes client-side JavaScript to capture dynamic content
Anti-bot bypass - Gets past CAPTCHA and bot detection systems
Format conversion - Outputs as markdown, JSON, or structured data
Screenshot capture - Saves visual representations of pages
Browser automation - Full headless browser capabilities

API Endpoints

1. `/v2/scrape` - Single Page Scraping

Scrapes a single webpage and returns clean, structured content.

Use Cases:

Extract article content
Get product details
Scrape specific pages
Convert HTML to markdown

Key Options:

formats: ["markdown", "html", "screenshot"]
onlyMainContent: true/false (removes nav, footer, ads)
waitFor: milliseconds to wait before scraping
actions: browser automation actions (click, scroll, etc.)
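
As a rough sketch of how these options map onto a raw HTTP call, the helper below builds (but does not send) a POST request for `/v2/scrape`. The base URL `https://api.firecrawl.dev`, the `Bearer` authorization header, and the function name `build_scrape_request` are assumptions for illustration; only the endpoint path and option names come from the list above.

```python
import json
from urllib.request import Request

# Assumption: hosted API base URL; this document only names the /v2/scrape path.
API_BASE = "https://api.firecrawl.dev"


def build_scrape_request(url, api_key, formats=("markdown",),
                         only_main_content=True, wait_for_ms=None):
    """Build, but do not send, a POST request for /v2/scrape (hypothetical helper)."""
    body = {
        "url": url,
        "formats": list(formats),
        "onlyMainContent": only_main_content,
    }
    # waitFor is optional, so only include it when the caller asks for a delay.
    if wait_for_ms is not None:
        body["waitFor"] = wait_for_ms
    return Request(
        f"{API_BASE}/v2/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the key is passed as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the payload easy to inspect before spending credits.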

2. `/v2/crawl` - Full Site Crawling

Crawls all accessible pages from a starting URL.

Use Cases:

Index entire documentation sites
Archive website content
Build knowledge bases
Scrape multi-page content

Key Options:

limit: max pages to crawl
maxDepth: how many links deep to follow
allowedDomains: restrict to specific domains
excludePaths: skip certain URL patterns
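
To get a feel for how `limit`, `allowedDomains`, and `excludePaths` narrow a crawl, the sketch below applies similar filters locally to a list of candidate URLs. It is only an approximation for planning purposes: the function name `preview_crawl` is hypothetical, treating `excludePaths` entries as regular expressions matched against the URL path is an assumption, and `maxDepth` is not modelled.

```python
import re
from urllib.parse import urlparse


def preview_crawl(urls, allowed_domains, exclude_paths, limit):
    """Return the candidate URLs that survive domain, path, and limit filters.

    Local approximation only; Firecrawl's own matching rules may differ.
    """
    patterns = [re.compile(p) for p in exclude_paths]
    kept = []
    for url in urls:
        if len(kept) >= limit:
            break  # page budget reached
        parsed = urlparse(url)
        if parsed.hostname not in allowed_domains:
            continue  # outside the allowed domains
        if any(p.search(parsed.path) for p in patterns):
            continue  # path matches an exclusion pattern
        kept.append(url)
    return kept
```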

3. `/v2/map` - URL Discovery

Maps all URLs on a website without scraping content.

Use Cases:

Find sitemap
Discover all pages
Plan crawling strategy
Audit website structure
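
One way to use a list of discovered URLs for crawl planning is to count pages per top-level section before deciding on limits. The sketch below assumes the URLs are already available as plain strings; `count_by_section` is a hypothetical helper, not part of any Firecrawl SDK.

```python
from collections import Counter
from urllib.parse import urlparse


def count_by_section(urls):
    """Count URLs by their first path segment, e.g. /docs or /blog."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs with an empty path are grouped under the site root.
        section = "/" + segments[0] if segments else "/"
        counts[section] += 1
    return counts
```

A section with thousands of pages is a candidate for its own crawl with a tighter limit, or for exclusion.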

4. `/v2/extract` - Structured Data Extraction

Uses AI to extract specific data fields from pages.

Use Cases:

Extract product prices and names
Parse contact information
Build structured datasets
Apply custom data schemas

Key Options:

schema: Zod or JSON schema defining desired structure
systemPrompt: guide AI extraction behavior
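
Because extraction is AI-driven, it may be worth sanity-checking a returned record against the schema before storing it. The sketch below is a deliberately small check of top-level `required` fields and primitive types, not a full JSON Schema validator; `check_record` and the sample records are hypothetical.

```python
# Map JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_record(record, schema):
    """Return a list of problems found in a flat extracted record."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in record:
            continue
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        is_bool_as_number = type_name == "number" and isinstance(value, bool)
        if expected and (not isinstance(value, expected) or is_bool_as_number):
            problems.append(
                f"{name}: expected {type_name}, got {type(value).__name__}"
            )
    return problems
```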

Authentication

Firecrawl requires an API key for all requests.

Get API Key

Sign up at https://www.firecrawl.dev
Go to dashboard → API Keys
Copy your API key (starts with fc-)

Store Securely

NEVER hardcode API keys in code!

# .env file
FIRECRAWL_API_KEY=fc-your-api-key-here

# .env.local (for local development)
FIRECRAWL_API_KEY=fc-your-api-key-here
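
A minimal sketch of reading the key at startup and failing early if it is missing or malformed. It assumes the variable has already been exported into the process environment (for example by a dotenv loader, which is not shown); `load_api_key` is a hypothetical helper, and the `fc-` prefix check follows the key format noted above.

```python
import os


def load_api_key(env=os.environ):
    """Return the Firecrawl API key from the environment or raise a clear error."""
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    if not key.startswith("fc-"):
        raise RuntimeError("FIRECRAWL_API_KEY should start with 'fc-'")
    return key
```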

Python SDK Usage

Installation

pip install firecrawl-py

Latest Version: firecrawl-py v4.5.0+

Basic Scrape

import os
from firecrawl import FirecrawlApp

# Initialize client
app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Scrape a single page
result = app.scrape_url(
    url="https://example.com/article",
    params={
        "formats": ["markdown", "html"],
        "onlyMainContent": True
    }
)

# Access markdown content
markdown = result.get("markdown")
print(markdown)

Crawl Multiple Pages

import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Start crawl
crawl_result = app.crawl_url(
    url="https://docs.example.com",
    params={
        "limit": 100,
        "scrapeOptions": {
            "formats": ["markdown"]
        }
    },
    poll_interval=5  # Check status every 5 seconds
)

# Process results
for page in crawl_result.get("data", []):
    url = page.get("url")
    markdown = page.get("markdown")
    print(f"Scraped: {url}")

Extract Structured Data

import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Define schema
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "product_price": {"type": "number"},
        "availability": {"type": "string"}
    },
    "required": ["company_name", "product_price"]
}

# Extract data
result = app.extract(
    urls=["https://example.com/product"],
    params={
        "schema": schema,
        "systemPrompt": "Extract product information from the page"
    }
)

print(result)

TypeScript/Node.js SDK Usage

Installation

npm install @mendable/firecrawl-js
# or
pnpm add @mendable/firecrawl-js
# or use the unscoped package:
npm install firecrawl

Latest Version: @mendable/firecrawl-js v4.4.1+ (or firecrawl v4.4.1+)

Basic Scrape

import FirecrawlApp from '@mendable/firecrawl-js';

// Initialize client
const app = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Scrape a single page
const result = await app.scrapeUrl('https://example.com/article', {
  formats: ['markdown', 'html'],
  onlyMainContent: true
});

// Access markdown content
const markdown = result.markdown;
console.log(markdown);

Crawl Multiple Pages

import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Start crawl
const crawlResult = await app.crawlUrl('https://docs.example.com', {
  limit: 100,
  scrapeOptions: {
    formats: ['markdown']
  }
});

// Process results
for (const page of crawlResult.data) {
  console.log(`Scraped: ${page.url}`);
  console.log(page.markdown);
}

Extract Structured Data with Zod

import FirecrawlApp from '@mendable/firecrawl-js';
import { z } from 'zod';

const app = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Define schema with Zod
const schema = z.object({
  company_name: z.string(),
  product_price: z.number(),
  availability: z.string()
});

// Extract data
const result = await app.extract({
  urls: ['https://example.com/product'],
  schema: schema,
  systemPrompt: 'Extract product information from the page'
});

console.log(result);

Common Use Cases

1. Documentation Scraping

Scenario: Convert entire documentation site to markdown for RAG/chatbot

import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

docs = app.crawl_url(
    url="https://docs.myapi.com",
    params={
        "limit": 500,
        "scrapeOptions": {
            "formats": ["markdown"],
            "onlyMainContent": True
        },
        "allowedDomains": ["docs.myapi.com"]
    }
)

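# Save to files (create the output directory first so open() does not fail)
os.makedirs("docs", exist_ok=True)
for page in docs.get("data", []):
    filename = page["url"].replace("https://", "").replace("/", "_") + ".md"
    with open(f"docs/{filename}", "w", encoding="utf-8") as f:
        f.write(page["markdown"])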

2. Product Data Extraction

Scenario: Extract structured product data for e-commerce

const schema = z.object({
  title: z.string(),
  price: z.number(),
  description: z.string(),
  images: z.array(z.string()),
  in_stock: z.boolean()
});

const products = await app.extract({
  urls: productUrls,
  schema: schema,
  systemPrompt: 'Extract all product details including price and availability'
});

3. News Article Scraping

Scenario: Extract clean article content without ads/navigation

article = app.scrape_url(
    url="https://news.com/article",
    params={
        "formats": ["markdown"],
        "onlyMainContent": True,
        "removeBase64Images": True
    }
)

# Get clean markdown
content = article.get("markdown")

Error Handling

Python

import os
from firecrawl import FirecrawlApp
from firecrawl.exceptions import FirecrawlException

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

try:
    result = app.scrape_url("https://example.com")
except FirecrawlException as e:
    print(f"Firecrawl error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

TypeScript

import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY
});

try {
  const result = await app.scrapeUrl('https://example.com');
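} catch (error) {
  // In strict TypeScript `error` is `unknown`, so narrow it before reading properties
  const err = error as { response?: { data?: unknown }; message?: string };
  if (err.response) {
    // API error
    console.error('API Error:', err.response.data);
  } else {
    // Network or other error
    console.error('Error:', err.message);
  }
}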

Rate Limits & Best Practices

Rate Limits

Free tier: 500 credits/month
Paid tiers: Higher limits based on plan
Credits consumed vary by endpoint and options

Best Practices

Use onlyMainContent: true to reduce credits and get cleaner data
Set reasonable limits on crawls to avoid excessive costs
Handle retries with exponential backoff for transient errors
Cache results locally to avoid re-scraping same content
Use the map endpoint first to plan your crawling strategy
Batch extract calls when processing multiple URLs
Monitor credit usage in dashboard
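
Two of the practices above, retrying with exponential backoff and caching results locally, can be sketched without any SDK specifics. Both helpers are hypothetical: `scrape` stands in for whatever callable performs the request, the cache is a plain dict, and catching every `Exception` is a simplification (real code would retry only errors known to be transient).

```python
import time


def with_backoff(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); after each failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries, surface the last error
            sleep(base_delay * 2 ** attempt)


def cached_scrape(url, scrape, cache):
    """Return the cached result for url, scraping (with retries) only on a miss."""
    if url not in cache:
        cache[url] = with_backoff(lambda: scrape(url))
    return cache[url]
```

The `sleep` parameter is injectable so the delay schedule can be tested without waiting; with the defaults, the waits are 1, 2, and 4 seconds.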

Cloudflare Workers Integration

⚠️ Important: SDK Compatibility

The Firecrawl SDK cannot run in Cloudflare Workers because of its Node.js dependencies (specifically axios, which uses the Node.js http module). Workers require Web Standard APIs.

✅ Use the direct REST API with fetch instead (see example below).

Alternative: Self-host with workers-firecrawl - a Workers-native implementation (requires Workers Paid Plan, only implements /search endpoint).

Workers Example: Direct REST API

This example calls Firecrawl directly with the fetch API, which is available in Cloudflare Workers:

interface Env {
  FIRECRAWL_API_KEY: string;
  SCRAPED_CACHE?: KVNamespace; // Optional: for caching results
}

interface FirecrawlScrapeResponse {
  success: boolean;
  data: {
    markdown?: string;
    html?: string;
    metadata: {
      title?: string;
      description?: string;
      language?: string;
      sourceURL: string;
    };
  };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      retu

---

*Content truncated.*


Author

jackspace
