The single source of truth for web scraping in Yalc workflows. Wraps Firecrawl with the right retries, caching, and output conventions.
Natural-language requests like "look up this URL" or "search the web for X" activate the skill inside Claude Code.
The Web Browsing skill is the canonical Yalc wrapper for any web scraping or crawling operation, backed by Firecrawl. Four core verbs: scrape (one URL to clean markdown), crawl (follow links across a site), search (web query with results), and extract (structured JSON via schema). Output is consistent across all verbs and ready for downstream Yalc skills.
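Under the hood these verbs map onto Firecrawl's API. A minimal sketch using the Python SDK (`firecrawl-py`); method names and parameter shapes vary across SDK versions, so treat this as illustrative of what the skill wraps, not its exact internals:

```python
import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ["FIRECRAWL_API_KEY"])

# scrape: one URL to clean markdown
page = app.scrape_url("https://example.com/pricing", params={"formats": ["markdown"]})

# crawl: follow links across a site (credits per page -- see the cost note below)
site = app.crawl_url("https://example.com", params={"limit": 50})

# search: web query with results
hits = app.search("acme vendor pricing changes")

# extract: structured JSON via schema -- see the schema example further down
```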
For Yalc workflows, this skill is the gateway for any non-trivial web read. Direct Firecrawl MCP calls work for simple cases but bypass the playbook (cache hints, retry policy, output normalization). The skill is the abstraction that makes web scraping a one-liner inside any Yalc prompt.
The Web Browsing skill sits at the **intake** node for any web-sourced data. It's the most-used intake skill across Yalc workflows because so many GTM operations start with "look up this URL" or "search the web for X".
The skill complements `apify-reddit-scraping` (platform-specific) by handling the general web case. Where Apify is the right tool for Reddit, LinkedIn engagement, and Twitter, this skill is the right tool for vendor sites, blogs, news pages, and competitive landing pages.
The general web intake. Yalc invokes this skill when the answer lives on a public web page rather than in a database or platform-specific source.
Requires a `FIRECRAWL_API_KEY`. Backed by Firecrawl: the free tier (500 pages/month) is sufficient for most Yalc workflows with cache hints enabled; higher volume requires a paid Firecrawl plan. The skill prefers cached results unless explicitly told to refetch.
The skill adds Yalc-specific conventions (cache hints, output paths, retry policy). The MCP gives raw API access. For one-off scrapes, the MCP is fine. For repeated workflows, the skill's conventions prevent budget waste.
Use scrape for one specific URL; use crawl when you need to follow links across an entire site. Crawl is meaningfully more expensive (credits per page across the whole site). Default to scrape; crawl only when you genuinely need site-wide content.
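When crawl is genuinely needed, bounding it keeps the credit burn predictable. A sketch reusing the client above; `limit`, `includePaths`, and `maxDepth` are Firecrawl v1 crawl options, though the exact parameter shapes depend on the SDK version:

```python
# Bounded crawl: cap pages and restrict scope so a site-wide crawl
# can't silently consume the monthly credit budget.
site = app.crawl_url(
    "https://example.com",
    params={
        "limit": 25,                 # hard cap on pages (credits are per page)
        "includePaths": ["docs/*"],  # only follow links under /docs
        "maxDepth": 2,               # don't chase deep link chains
    },
)
```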
Pass a cache TTL in the input (e.g. "cache for 1 hour" or "use cached if available"). The skill respects the TTL via Firecrawl's cache headers. For repeat scrapes within the cache window, no API credit is consumed.
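Mechanically, a cache hint is a TTL keyed by URL. A hypothetical sketch of the convention (the in-memory dict and `cached_scrape` helper are illustration only; the skill itself defers to Firecrawl's cache headers):

```python
import time

_cache: dict[str, tuple[float, dict]] = {}

def cached_scrape(url: str, ttl_seconds: int = 3600) -> dict:
    """Reuse a result scraped within the TTL window instead of spending a credit."""
    now = time.time()
    hit = _cache.get(url)
    if hit and now - hit[0] < ttl_seconds:
        return hit[1]  # still fresh: no API call, no credit consumed
    result = app.scrape_url(url, params={"formats": ["markdown"]})
    _cache[url] = (now, result)
    return result
```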
Yes, via Firecrawl's JS rendering: SPAs and React apps work without manual configuration, and smart wait detects when content has finished loading.
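For pages where the automatic wait falls short, Firecrawl accepts an explicit wait. A sketch assuming the v1 `waitFor` scrape option (milliseconds):

```python
# SPA whose content hydrates client-side: give the renderer time to settle.
page = app.scrape_url(
    "https://app.example.com/dashboard",
    params={"formats": ["markdown"], "waitFor": 3000},  # allow ~3s for JS to finish
)
```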
Yes, via the extract verb plus a JSON schema: Firecrawl runs the page through an LLM with the schema and returns structured JSON. Most cases work; complex schemas need iteration.
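A sketch of the schema path; the exact call shape varies across `firecrawl-py` versions (some expose extraction as a scrape format rather than a separate method), and the schema fields here are illustrative:

```python
# Describe the fields you want as a JSON schema; Firecrawl returns matching JSON.
pricing_schema = {
    "type": "object",
    "properties": {
        "plan_names": {"type": "array", "items": {"type": "string"}},
        "monthly_prices_usd": {"type": "array", "items": {"type": "number"}},
    },
}

data = app.extract(
    urls=["https://example.com/pricing"],
    params={"prompt": "Extract plan names and monthly prices", "schema": pricing_schema},
)
```

Start with a flat schema like this and iterate; deeply nested schemas are where extraction quality degrades.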
Depends on data volatility. Vendor pricing pages cache for 24 hours. News pages cache for 1 hour. Product changelogs cache for 1 to 4 hours. Static documentation caches for 7 days. Default to 24 hours and tighten only when freshness matters.
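The same guidance as a config sketch (the key names are illustrative, not the skill's actual config):

```python
# Default cache TTLs by content type, in seconds.
CACHE_TTL_SECONDS = {
    "vendor_pricing": 24 * 3600,    # pricing pages: 24 hours
    "news": 1 * 3600,               # news pages: 1 hour
    "changelog": 4 * 3600,          # changelogs: 1-4 hours; upper bound shown
    "static_docs": 7 * 24 * 3600,   # static documentation: 7 days
    "default": 24 * 3600,           # tighten only when freshness matters
}
```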
Clone the Yalc skill set, drop in your env, run from your next Claude Code session.
```sh
gh repo clone Othmane-Khadri/YALC-the-GTM-operating-system && cp -r YALC-the-GTM-operating-system/.claude/skills/web-browsing ./.claude/skills/
```