The single source of truth for web scraping in Yalc workflows. Wraps Firecrawl with the right retries, caching, and output conventions.
Natural-language requests like "look up this URL" or "search the web for X" activate the skill inside Claude Code.
The Web Browsing skill is the canonical Yalc wrapper for any web scraping or crawling operation, backed by Firecrawl. Four core verbs: scrape (one URL to clean markdown), crawl (follow links across a site), search (web query with results), and extract (structured JSON via schema). Output is consistent across all verbs and ready for downstream Yalc skills.
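Under the hood these verbs map onto Firecrawl's API. A minimal sketch using the Python SDK (`firecrawl-py`); method names and parameter shapes vary across SDK versions, so treat this as illustrative of what the skill wraps, not its exact internals:

```python
import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ["FIRECRAWL_API_KEY"])

# scrape: one URL to clean markdown
page = app.scrape_url("https://example.com/pricing", params={"formats": ["markdown"]})

# crawl: follow links across a site (credits per page -- see the cost note below)
site = app.crawl_url("https://example.com", params={"limit": 50})

# search: web query with results
hits = app.search("acme vendor pricing changes")

# extract: structured JSON via schema -- see the schema example further down
```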
For Yalc workflows, this skill is the gateway for any non-trivial web read. Direct Firecrawl MCP calls work for simple cases but bypass the playbook (cache hints, retry policy, output normalization). The skill is the abstraction that makes web scraping a one-liner inside any Yalc prompt.
The Web Browsing skill sits at the **intake** node for any web-sourced data. It's the most-used intake skill across Yalc workflows because so many GTM operations start with "look up this URL" or "search the web for X".
The skill complements `apify-reddit-scraping` (platform-specific) by handling the general web case. Where Apify is the right tool for Reddit, LinkedIn engagement, and Twitter, this skill is the right tool for vendor sites, blogs, news pages, and competitive landing pages.
The general web intake. Yalc invokes this skill when the answer lives on a public web page rather than in a database or platform-specific source.
Requires a `FIRECRAWL_API_KEY`. Backed by Firecrawl: the free tier (500 pages/month) is sufficient for most Yalc workflows with cache hints enabled; higher volume requires a paid Firecrawl plan. The skill prefers cached results unless explicitly told to refetch.
The skill adds Yalc-specific conventions (cache hints, output paths, retry policy). The MCP gives raw API access. For one-off scrapes, the MCP is fine. For repeated workflows, the skill's conventions prevent budget waste.
Use scrape for one specific URL; use crawl when you need to follow links across an entire site. Crawl is meaningfully more expensive (credits per page across the whole site). Default to scrape; crawl only when you genuinely need site-wide content.
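When crawl is genuinely needed, bounding it keeps the credit burn predictable. A sketch reusing the client above; `limit`, `includePaths`, and `maxDepth` are Firecrawl v1 crawl options, though the exact parameter shapes depend on the SDK version:

```python
# Bounded crawl: cap pages and restrict scope so a site-wide crawl
# can't silently consume the monthly credit budget.
site = app.crawl_url(
    "https://example.com",
    params={
        "limit": 25,                 # hard cap on pages (credits are per page)
        "includePaths": ["docs/*"],  # only follow links under /docs
        "maxDepth": 2,               # don't chase deep link chains
    },
)
```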
Pass a cache TTL in the input (e.g. "cache for 1 hour" or "use cached if available"). The skill respects the TTL via Firecrawl's cache headers. For repeat scrapes within the cache window, no API credit is consumed.
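Mechanically, a cache hint is a TTL keyed by URL. A hypothetical sketch of the convention (the in-memory dict and `cached_scrape` helper are illustration only; the skill itself defers to Firecrawl's cache headers):

```python
import time

_cache: dict[str, tuple[float, dict]] = {}

def cached_scrape(url: str, ttl_seconds: int = 3600) -> dict:
    """Reuse a result scraped within the TTL window instead of spending a credit."""
    now = time.time()
    hit = _cache.get(url)
    if hit and now - hit[0] < ttl_seconds:
        return hit[1]  # still fresh: no API call, no credit consumed
    result = app.scrape_url(url, params={"formats": ["markdown"]})
    _cache[url] = (now, result)
    return result
```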
Yes, via Firecrawl's JS rendering: SPAs and React apps work without manual configuration, and smart wait detects when content has finished loading.
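For pages where the automatic wait falls short, Firecrawl accepts an explicit wait. A sketch assuming the v1 `waitFor` scrape option (milliseconds):

```python
# SPA whose content hydrates client-side: give the renderer time to settle.
page = app.scrape_url(
    "https://app.example.com/dashboard",
    params={"formats": ["markdown"], "waitFor": 3000},  # allow ~3s for JS to finish
)
```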
Yes, via the extract verb plus a JSON schema: Firecrawl runs the page through an LLM with the schema and returns structured JSON. Most cases work; complex schemas need iteration.
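A sketch of the schema path; the exact call shape varies across `firecrawl-py` versions (some expose extraction as a scrape format rather than a separate method), and the schema fields here are illustrative:

```python
# Describe the fields you want as a JSON schema; Firecrawl returns matching JSON.
pricing_schema = {
    "type": "object",
    "properties": {
        "plan_names": {"type": "array", "items": {"type": "string"}},
        "monthly_prices_usd": {"type": "array", "items": {"type": "number"}},
    },
}

data = app.extract(
    urls=["https://example.com/pricing"],
    params={"prompt": "Extract plan names and monthly prices", "schema": pricing_schema},
)
```

Start with a flat schema like this and iterate; deeply nested schemas are where extraction quality degrades.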
Depends on data volatility. Vendor pricing pages cache for 24 hours. News pages cache for 1 hour. Product changelogs cache for 1 to 4 hours. Static documentation caches for 7 days. Default to 24 hours and tighten only when freshness matters.
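The same guidance as a config sketch (the key names are illustrative, not the skill's actual config):

```python
# Default cache TTLs by content type, in seconds.
CACHE_TTL_SECONDS = {
    "vendor_pricing": 24 * 3600,    # pricing pages: 24 hours
    "news": 1 * 3600,               # news pages: 1 hour
    "changelog": 4 * 3600,          # changelogs: 1-4 hours; upper bound shown
    "static_docs": 7 * 24 * 3600,   # static documentation: 7 days
    "default": 24 * 3600,           # tighten only when freshness matters
}
```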
Clone the Yalc skill set, drop in your env, run from your next Claude Code session.
```sh
gh repo clone Othmane-Khadri/YALC-the-GTM-operating-system && cp -r YALC-the-GTM-operating-system/.claude/skills/web-browsing ./.claude/skills/
```