Firecrawl MCP review and Yalc Framework

Install

Add Firecrawl to Claude Code in one command

claude mcp add firecrawl --env FIRECRAWL_API_KEY=fc-xxx -- npx -y firecrawl-mcp

Get a free API key at firecrawl.dev (the free tier covers 500 pages). Replace `fc-xxx` with your key, run the command, and restart Claude Code. The hosted version is the default; self-host the open source version if your data sensitivity requires it.

What it does

Firecrawl, plainly

The Firecrawl MCP is the official `firecrawl-mcp` package from Mendable AI. It exposes 8 web data verbs as native Claude tools: scrape, batch_scrape, map, search, crawl, extract, interact, and agent. Output is markdown by default, optimized for LLM context windows, with optional structured JSON via schemas.

For Yalc workflows, Firecrawl is the canonical web intake layer. When a prompt says "look up this competitor's pricing", "extract structured data from these 30 vendor pages", "monitor changes on a product changelog", or "search the web for fintech news", Firecrawl handles the wire. Yalc decides what to fetch and what to do with the result.

Where it slots in

Position in the GTM operating system

Intake → Enrich → Score → Route → Draft → Send → Listen

The Firecrawl MCP sits at the **intake** node for any web sourced data. It complements Crustdata: Crustdata for structured B2B databases, Firecrawl for everything else (vendor sites, blogs, changelogs, product pages, news).

JS rendering, smart wait, anti-bot handling, and caching all happen inside Firecrawl. Yalc workflows treat each Firecrawl call as a black box that returns clean markdown or structured JSON.

The Yalc Framework

Deploying the Firecrawl MCP inside Yalc workflows

Workflow position

The web intake node. Yalc invokes Firecrawl when the answer lives on a public page rather than in a database. Output flows downstream into Notion, Claude analysis, or comparison reports.

Prompt patterns

Copy-paste prompts for Claude Code that invoke the Firecrawl MCP.

Yalc, scrape these 30 competitor pricing pages via Firecrawl. Extract plan name, monthly price, included features into a structured table. Write to "Competitive intel" in Notion. → Yalc batches scrape calls with a JSON schema, normalizes output, writes to Notion.

Yalc, search the web via Firecrawl for "Series B fintech Germany 2026" and pull the top 20 results. Cross reference against our ICP list, surface unworked matches. → Yalc uses Firecrawl search, fuzzy matches against Notion, outputs candidates.

Yalc, monitor this product changelog page weekly. When a new entry appears mentioning "API" or "integration", summarize and post to #product. → Yalc uses Firecrawl scrape with cache busting, diffs against last fetch, classifies via Claude.
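The "diff against last fetch" step in the changelog prompt can be sketched locally. This is a minimal illustration, not Yalc's or Firecrawl's actual implementation: it assumes the changelog comes back as markdown in which every entry starts with a `## ` heading, and the names `split_entries`, `entry_id`, and `new_matching_entries` are hypothetical. Keyword matching stands in for the Claude classification step.

```python
import hashlib
import re

KEYWORDS = ("api", "integration")  # terms taken from the prompt above


def split_entries(markdown: str) -> list[str]:
    """Split changelog markdown into entries (assumes each entry starts with '## ')."""
    parts = re.split(r"(?m)^(?=## )", markdown)
    return [p.strip() for p in parts if p.strip().startswith("## ")]


def entry_id(entry: str) -> str:
    """Fingerprint an entry so whitespace and case changes do not count as new."""
    normalized = " ".join(entry.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def new_matching_entries(previous_md: str, current_md: str) -> list[str]:
    """Return entries absent from the last fetch that mention one of KEYWORDS."""
    seen = {entry_id(e) for e in split_entries(previous_md)}
    fresh = [e for e in split_entries(current_md) if entry_id(e) not in seen]
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")s?\b", re.IGNORECASE
    )
    return [e for e in fresh if pattern.search(e)]
```

Fingerprinting whole entries rather than diffing raw text keeps the check stable when the page reorders entries or changes whitespace between fetches.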

Chaining recommendations

Upstream: Yalc prompt with a URL or query (no upstream)

Downstream: Firecrawl output → Claude (analysis) → Notion or Slack

Anti patterns to avoid

Don't scrape the same URL on a tight loop without caching. Firecrawl supports cache hints. Use them. Otherwise you'll burn the free tier in a day.

Don't use Firecrawl when a vendor has an official API. Even when the free tier makes Firecrawl cost nothing, it is slower and less structured than a real API. Vendor API first, Firecrawl as fallback.

Don't crawl entire sites when you only need 5 pages. The crawl verb is powerful but expensive. Use scrape on specific URLs unless you genuinely need link following.
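For the first anti-pattern above, one option is a small client-side cache in front of whatever function performs the scrape. The sketch below is an assumption-laden illustration: `TTLCache` and its `fetch` callable are hypothetical names, and this is separate from any caching Firecrawl does on its own side.

```python
import time
from typing import Callable


class TTLCache:
    """Reuse a stored result when the same URL is requested again within ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],  # hypothetical: any function that scrapes a URL
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[float, str]] = {}
        self.billable_calls = 0  # how many times the real fetch actually ran

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no new scrape
        self.billable_calls += 1
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

The injectable `clock` exists only so the expiry logic can be checked without sleeping; in normal use the default monotonic clock is enough.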

Compatibility

Works in Claude Code (primary), Claude Desktop, Cursor, Codex, and any MCP-compatible client. Open source on GitHub (mendableai/firecrawl-mcp-server). Self-host option available if data sensitivity prohibits third-party scraping.

Operator take

Pros, cons, who it's for

Pros

500 page free tier. Real workflows ship without paying.
Open source. 100k+ GitHub stars. Active maintenance.
8 verbs cover the full web data surface (scrape, crawl, search, extract, interact, agent, etc.).
JS rendering, smart wait, anti-bot all handled. No tuning needed.
Markdown output by default. LLM context-window optimized.

Cons

Sites with aggressive anti-bot (Cloudflare strict, Datadome) still occasionally fail.
Crawl verb is expensive. Easy to burn budget if not careful.
JSON schema extraction works most of the time. Complex schemas need iteration.
Self hosting means maintaining browsers, queues, scaling. Real work.

Who it's for

GTM engineers building agentic research workflows
Operators who need pricing, changelog, news scraping in regular workflows
Data teams piloting LLM driven web data ingestion at small to mid scale

Firecrawl (vendor review)

Pricing tiers and plan choice context.

Web Browsing skill

First party Yalc skill that wraps the four core Firecrawl verbs with retries and caching.

Alternatives

MCPs to consider instead

Apify MCP

Switch when you need a marketplace of pre built scrapers (Reddit, LinkedIn, Twitter) rather than a general crawler.

Bright Data MCP

Switch when residential proxies and aggressive anti-bot evasion at scale matter most.

Direct Firecrawl REST API

Skip the MCP if you're running headless cron jobs outside Claude Code. The REST API is more efficient for batch workloads.

FAQ

Frequently asked

How does the MCP compare to Firecrawl's REST API?

Functionally equivalent results. The MCP is more convenient inside Claude Code because the verbs become native tool calls Claude composes during conversation. The REST API is better for headless cron jobs and batch pipelines.
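For the headless case, a direct REST call might look roughly like the sketch below. The endpoint path, the request body, and the `data.markdown` response shape are assumptions based on the v1 API and should be checked against the current Firecrawl API reference; `build_scrape_request` and `scrape_markdown` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: v1 scrape endpoint. Verify against the current API reference.
SCRAPE_ENDPOINT = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for markdown output."""
    payload = {"url": url, "formats": ["markdown"]}
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape_markdown(url: str, api_key: str, timeout: float = 60.0) -> str:
    """Send the request and return the markdown (assumed response shape)."""
    request = build_scrape_request(url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["data"]["markdown"]
```

Keeping request construction separate from sending makes the cron job easy to dry-run without spending pages.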

Can I scrape JavaScript heavy sites?

Yes. Firecrawl renders JS by default with smart wait. SPAs and React apps work without manual configuration.

Does the MCP work for LinkedIn or Reddit?

For LinkedIn, no. LinkedIn aggressively blocks general scrapers. Use the Unipile MCP instead. For Reddit, technically yes, but Apify's Reddit actors are battle tested for production volume.

How does the free tier behave inside the MCP?

Same 500 page allotment as the REST API. The MCP returns the same rate limit errors when you exceed it. Plan accordingly with cache hints and selective verbs.
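One way to plan against the allotment is to track usage locally and refuse a job before it starts, rather than discovering the limit through errors. This is a rough sketch: `PageBudget` is a hypothetical helper, and it assumes each scraped page costs one unit, which may not match how Firecrawl bills every verb.

```python
class PageBudget:
    """Track pages used against a fixed allotment (assumes 1 page = 1 unit)."""

    def __init__(self, allotment: int = 500) -> None:
        self.allotment = allotment
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.allotment - self.used

    def reserve(self, pages: int) -> None:
        """Claim pages for a job, or raise before any request is sent."""
        if pages < 0:
            raise ValueError("pages must be non-negative")
        if pages > self.remaining:
            raise RuntimeError(
                f"job needs {pages} pages but only {self.remaining} remain"
            )
        self.used += pages
```

For example, reserving 30 pages for the competitor pricing prompt leaves 470, and a crawl estimated at 600 pages is rejected up front.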

How do I extract structured data from a page?

Pass a JSON schema to the scrape or extract verb. Firecrawl runs the page through an LLM with the schema and returns structured JSON. Works most of the time. Complex schemas may need a few iterations.
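As an illustration, a schema for the pricing-page use case (plan name, monthly price, included features) could look like the sketch below, with a small normalizer for the returned rows. The field names and the `parse_price` and `normalize_plans` helpers are hypothetical, and how the schema is attached to a call depends on the verb and API version.

```python
import re
from typing import Optional

# Hypothetical schema for a pricing page; field names are illustrative.
PRICING_SCHEMA = {
    "type": "object",
    "properties": {
        "plans": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "plan_name": {"type": "string"},
                    "monthly_price": {"type": "number"},
                    "included_features": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["plan_name", "monthly_price"],
            },
        }
    },
    "required": ["plans"],
}


def parse_price(value) -> Optional[float]:
    """Coerce numbers or strings like '$1,049/mo' to a float; None if no number."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        match = re.search(r"\d+(?:\.\d+)?", value.replace(",", ""))
        return float(match.group()) if match else None
    return None


def normalize_plans(extracted: dict) -> list[dict]:
    """Drop rows missing a name or a parseable price; tidy the rest."""
    rows = []
    for plan in extracted.get("plans", []):
        name = str(plan.get("plan_name") or "").strip()
        price = parse_price(plan.get("monthly_price"))
        if not name or price is None:
            continue
        features = list(plan.get("included_features") or [])
        rows.append({"plan_name": name, "monthly_price": price, "included_features": features})
    return rows
```

Normalizing after extraction is a cheap guard for the cases where the model returns a price as text or omits a field, which is often where schema iteration starts. Note that this sketch drops plans with no numeric price, such as "Custom" tiers, so keep them separately if they matter.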

Is the open source version the same as hosted?

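Same core engine. The MCP server is MIT licensed; the core Firecrawl repository is primarily AGPL-3.0 licensed, so check the license terms before self-hosting. Self-hosting means you manage the infra (browsers, queues, scaling). Hosted is the convenient option for most teams.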
