
The web data layer for every Yalc workflow that pulls from outside our database stack. JS rendering, anti-bot, and structured extraction all handled by Claude tool calls.
```sh
claude mcp add firecrawl --env FIRECRAWL_API_KEY=fc-xxx -- npx -y firecrawl-mcp
```
Get a free API key at firecrawl.dev (500-page free tier). Replace `fc-xxx` with the key, run the command, and restart Claude Code. The hosted version is the default; self-host the open source version if your data sensitivity requires it.
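A quick sanity check after the restart (`claude mcp list` is the standard Claude Code subcommand for inspecting configured servers):

```sh
# firecrawl should appear in the list of configured MCP servers.
claude mcp list
```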
The Firecrawl MCP is the official `firecrawl-mcp` package from Mendable AI. It exposes 8 web data verbs as native Claude tools: scrape, batch_scrape, map, search, crawl, extract, interact, and agent. Output is markdown by default, optimized for LLM context windows, with optional structured JSON via schemas.
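For reference, the scrape verb maps onto the hosted REST endpoint. A minimal sketch, assuming the current v1 API shape (parameter names may differ by version; check the Firecrawl API reference):

```sh
# Hedged sketch: wire-level equivalent of the scrape verb against the hosted
# v1 API. The response is JSON whose data.markdown field holds the page as
# LLM-ready markdown.
curl -s -X POST https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/pricing", "formats": ["markdown"]}'
```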
For Yalc workflows, Firecrawl is the canonical web intake layer. When a prompt says "look up this competitor's pricing", "extract structured data from these 30 vendor pages", "monitor changes on a product changelog", or "search the web for fintech news", Firecrawl handles the wire. Yalc decides what to fetch and what to do with the result.
The Firecrawl MCP sits at the **intake** node for any web-sourced data. It complements Crustdata: Crustdata for structured B2B databases, Firecrawl for everything else (vendor sites, blogs, changelogs, product pages, news).
JS rendering, smart wait, anti-bot handling, and caching all happen inside Firecrawl. Yalc workflows treat each Firecrawl call as a black box that returns clean markdown or structured JSON.
The web intake node. Yalc invokes Firecrawl when the answer lives on a public page rather than in a database. Output flows downstream into Notion, Claude analysis, or comparison reports.
Copy-paste prompts for Claude Code that invoke the Firecrawl MCP.
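A couple of illustrative examples (hypothetical targets; adjust to your workflow):

```text
Scrape https://example.com/pricing with Firecrawl and summarize the plan
tiers and limits as a markdown table.

Use Firecrawl search for "fintech compliance news this week", scrape the
top 5 results, and give me a one-paragraph brief per article.
```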
Works in Claude Code (primary), Claude Desktop, Cursor, Codex, and any MCP-compatible client. Open source on GitHub (mendableai/firecrawl-mcp-server). Self-host option available if data sensitivity prohibits third-party scraping.
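For clients that read MCP servers from a JSON config file (Claude Desktop, Cursor), the equivalent entry typically looks like this; the file location varies by client:

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "fc-xxx" }
    }
  }
}
```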
The MCP and the REST API return functionally equivalent results. The MCP is more convenient inside Claude Code because the verbs become native tool calls Claude composes during conversation; the REST API is better for headless cron jobs and batch pipelines.
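A rough sketch of that headless path, reusing the v1 scrape endpoint shown earlier in a cron-friendly loop (endpoint and response shape are assumptions from the current API; adjust paths to your pipeline):

```sh
#!/usr/bin/env bash
# Batch-pull a URL list into per-page JSON files; suitable for a cron job.
set -euo pipefail
mkdir -p out

while read -r url; do
  curl -s -X POST https://api.firecrawl.dev/v1/scrape \
    -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$url\", \"formats\": [\"markdown\"]}" \
    > "out/$(echo "$url" | tr -c 'A-Za-z0-9' '_').json"
  sleep 2  # pace requests to stay inside free-tier limits
done < urls.txt
```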
JavaScript-heavy sites work out of the box. Firecrawl renders JS by default with smart wait, so SPAs and React apps need no manual configuration.
LinkedIn is out: it aggressively blocks general scrapers, so use the Unipile MCP instead. Reddit technically works, but Apify's Reddit actors are battle-tested for production volume.
The MCP draws from the same 500-page free-tier allotment as the REST API, and it returns the same rate-limit errors when you exceed it. Plan accordingly with cache hints and selective verbs.
Pass a JSON schema to the scrape or extract verb. Firecrawl runs the page through an LLM against the schema and returns structured JSON. It works most of the time; complex schemas may need a few iterations.
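For example, a schema for pulling pricing off vendor pages might look like this (field names are illustrative, not a Firecrawl requirement):

```json
{
  "type": "object",
  "properties": {
    "product_name": { "type": "string" },
    "plans": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "monthly_price": { "type": "string" },
          "limits": { "type": "string" }
        },
        "required": ["name", "monthly_price"]
      }
    }
  },
  "required": ["product_name", "plans"]
}
```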
The open source version and the hosted service share the same core engine. Self-hosting means you manage the infra (browsers, queues, scaling); hosted is the convenient option for most teams.
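A rough self-host sketch, assuming the repo's docker compose setup and the MCP's self-host API URL override; check the repo's self-hosting docs and the firecrawl-mcp README for current env vars, ports, and services:

```sh
# Hedged sketch: run the open source stack locally, then point the MCP at it
# instead of the hosted API. Service names, the default port, and the exact
# override variable may differ by version; verify against the repo docs.
git clone https://github.com/mendableai/firecrawl.git
cd firecrawl
docker compose up -d

# Re-register the MCP against the local instance (FIRECRAWL_API_URL is the
# self-host override documented in firecrawl-mcp; confirm in its README).
claude mcp add firecrawl \
  --env FIRECRAWL_API_KEY=fc-local \
  --env FIRECRAWL_API_URL=http://localhost:3002 \
  -- npx -y firecrawl-mcp
```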
Drop it into Claude Code and orchestrate from your next Yalc prompt.
```sh
claude mcp add firecrawl --env FIRECRAWL_API_KEY=fc-xxx -- npx -y firecrawl-mcp
```