Most operators don't want to buy another AI SDR. They want to build one. The recurring question now is the same: what does it actually take to build a GTM agent that runs your own playbook, not a vendor's roadmap?
This is the operator answer. Pick a runtime, define one skill in markdown, wire the MCPs your workflow already depends on, add guardrails, and ship something that scores a lead and routes it. The runtime sits at the center of an agentic GTM operating system, which is the broader frame this piece slots into. Here we go one level deeper into the build.
What a GTM agent actually is in 2026
The marketing version of a GTM agent is a black box that books meetings while you sleep. The operator version is more honest. A GTM agent is a runtime plus a skill plus a set of MCP connections. The runtime hosts the model and the tool calls. The skill is a markdown file that tells the model what job it does and what guardrails apply. The MCPs are the data and action endpoints the agent reaches for during the job.
If you can describe the workflow you would run by hand, you can build a GTM agent that runs it. The bar is not autonomous reasoning. It is reliable orchestration.
If you have already read the AI SDR tools field map, this is the layer underneath any of those categories. Point tools and full SDR replacements both make a bet about how your workflow runs. A self built agent makes no bet. It runs the workflow you wrote.
Three properties separate a real GTM agent from a vendor demo. The agent reads from real data sources, not a static export. It writes to the systems your team already runs on. And every prompt and every connection lives in code you own, version, and rewrite. If you cannot read the prompt, you do not own the agent.
Pick the runtime (Claude Code, Agent SDK, custom)
There are three credible runtimes in 2026 for building a GTM agent. Each one solves a different problem.
Claude Code is the fastest path for operators who think in workflows, not infrastructure. The runtime sits on your laptop, talks to MCPs over the standard protocol, and runs skills written in plain markdown. You get tool calls, file I/O, terminal access, and the model in one process. Most of the workflows in the playbook on using Claude Code for sales ship as a single skill that runs in this exact runtime. Same idea on the demand side, covered in Claude Code for marketing teams. Operator first, no deploy needed.
Anthropic Agent SDK is the right pick when you outgrow the local laptop and need to run the agent as a service. You wrap the same logic in code, host it on a server, expose an endpoint your other systems can call. The cost is more infrastructure and a slower iteration loop. The benefit is you can call the agent from a cron, a webhook, or another agent.
Custom runtime means you write your own loop around the API. You read input, build the prompt, call tools, manage retries, log output. The reason to do this is full control over scheduling, observability, and how you store state. The reason to avoid it is everything else. Most operators never need to go here. The path is laptop first, server next, custom only when neither fits.
Define the skill as markdown
A skill is the contract between the operator and the agent. It is one markdown file that tells the agent the job, the inputs, the steps, the guardrails, and the outputs. No graph, no canvas, no clickable UI.
A working GTM skill has six sections. Purpose says what the agent does in one sentence. Inputs lists what the agent expects (a company domain, a person email, a list of leads). Steps describes the workflow in plain English (look up the company, enrich the people, score the fit, write the brief). Tool calls names the MCPs and the specific calls allowed. Guardrails specifies cost ceilings, rate limits, and what the agent must not do. Outputs defines what the agent returns and where it writes.
The reason markdown wins over a workflow canvas is that markdown compounds. Every run teaches you something about the skill: an edge case missed, a phrasing that backfires, a tool call that times out. You open the file, fix the line, save. Next run runs against the sharper version. A node graph forces a redeploy. A vendor UI does not let you read the underlying instructions at all.
Wire the data MCPs (Crustdata, FullEnrich, PredictLeads)
A GTM agent without data is a chatbot. The first wiring step is the data layer.
Crustdata is the workhorse. People search, company enrichment, signals, all through one API. The MCP exposes those calls to the agent as named tools. You write the skill like this: if the lead has no LinkedIn URL, call the people search with the email; if the company is missing employee count, call the company enrich with the domain. The MCP handles auth, rate limits, retries.
FullEnrich is the waterfall layer for email and phone. When Crustdata gets you the person but the email is empty, the agent calls FullEnrich to fill the gap. The MCP returns verified contact data with confidence scores. The skill decides what to do when confidence is below a threshold (skip, queue for review, retry).
PredictLeads supplies the signal layer. Hiring posts, job openings, technographic changes. The agent watches the feed for triggers that match your ICP definition, then routes a fresh lead into the qualification workflow. That is signal first outbound at the agent level.
The broader pattern of wiring data tools through MCPs is covered in the explainer on MCPs for sales teams, which goes deeper on the protocol itself. The rule here is simple. Pick the data MCPs your workflow actually needs, and stop there.
Wire the action MCPs (Unipile, HubSpot, Notion, Slack)
Data MCPs are read. Action MCPs are write. A GTM agent earns its keep when it writes to the systems your team already runs on.
Unipile is the LinkedIn and email action layer. The agent sends an invite, drops a message, schedules a follow up, all through API calls that respect per account daily caps. Most of the LinkedIn workflows worth shipping as an agent live as one skill called from a Unipile campaign.
HubSpot is the CRM write target. When the agent qualifies a lead, it writes the score, the rationale, and the next action back to the contact record. No CSV exports, no Zap. The agent owns the record update directly.
Notion is where the agent writes its state. Run logs, enrichment caches, signal histories. The MCP exposes Notion as a structured database the agent can read on the next run, which is how middle mile work compounds across executions.
Slack is the human in the loop layer. When the agent finishes a batch or hits an edge case it cannot resolve, it posts a thread to the right channel with the context and the proposed next action. The operator reviews, replies, and the agent reads the reply on the next run.
Add guardrails (cost, rate, eval)
The fastest way to lose trust in a new GTM agent is to skip guardrails. Three are mandatory.
Cost guardrails. Set a hard ceiling on tokens and tool calls per run. If the skill loops or the model gets stuck, the agent halts before the bill ships. Most cost runaway happens on the first production run, not the 100th. Cap it before you ship.
Rate guardrails. Every data MCP has rate limits. Every action MCP has per account caps. The agent must respect both. Unipile has different daily caps for different LinkedIn account types. Crustdata has per minute and per day ceilings. Write the limits into the skill so the agent never runs hot.
Eval guardrails. Before the agent writes to production systems, run it on a sample of real cases and grade the output. Sample 30 leads, score them by hand, score them with the agent, compare. If agreement is below 80 percent, the prompt is wrong, not the leads. Fix and rerun. This is the same discipline that makes the operator playbook for B2B lead generation repeatable. Measure the output before you trust the system.
Worked example: a lead qualification agent
The simplest working agent is a qualifier. Input: a list of inbound leads with an email and a company domain. Output: a score from 1 to 5, a one paragraph rationale, and a routed action (sales call, nurture, disqualify).
The skill reads like this. For each lead, look up the company through Crustdata. Pull industry, headcount, funding stage, and recent signal events. Look up the person through Crustdata. Pull title, seniority, tenure, and recent role moves. If the email is empty, fill through FullEnrich. Match the company shape against the ICP definition in the skill. Match the person shape against the buyer persona in the skill. Score from 1 to 5. Write the rationale. If the score is 4 or 5, post a Slack message to the AE channel with the brief and propose a call. If the score is 1 or 2, mark the contact as disqualified in HubSpot. Anything in the middle goes to a nurture list in Notion.
Five minutes of work per lead, automated. The skill is roughly 150 lines of markdown. The MCP wiring is six lines of config. The first run grades 50 leads in under 10 minutes on a laptop. The 50th run is sharper because every override the operator makes gets written back into the rationale logic of the skill.
From local skill to scheduled production
A skill that runs on demand from your laptop is already useful. A skill that runs on a schedule, ingests fresh signals, and writes back to your CRM is a system.
The path is gradual. First, run the skill manually for two weeks until you trust the output. Watch every run. Override anything that looks off. Edit the skill after every override. Second, point the skill at a real signal feed (PredictLeads or a HubSpot view) and let it process new leads as they appear. Still manual trigger, but now the input is live.
Third, schedule it. The Anthropic Agent SDK lets you wrap the same skill behind an endpoint that runs on a cron. The skill code does not change, only the runtime around it. Fourth, add observability. Log every run, every tool call, every override into Notion. Now you can answer the questions a finance team will ask in month two. How much does the agent cost. How many leads does it qualify. Where does it disagree with the human.
The pattern across all four steps is the same. The skill stays in markdown. The MCPs stay decoupled. The runtime grows around the skill, not on top of it.
Common mistakes when shipping the first agent
Most first agents fail in predictable ways. Avoid these.
Skipping the eval. Operators ship the agent because the first 5 outputs looked right. By output 50, the failure mode is obvious and embarrassing. Always grade a real sample first.
Wiring too many MCPs. A first agent does not need 8 tools. Two data MCPs and two action MCPs is enough for a qualifier. Add more only when the workflow demands it.
Hiding the prompt. Some operators put the prompt in code and stop showing it to the team. The team loses the ability to read and edit. The agent becomes a black box that one person owns. Keep the prompt in markdown, in a repo, where anyone on the team can open and propose a change.
No guardrails. First run runs hot, bills shock the team, agent gets shelved. Cost and rate limits are not optional.
Optimizing the runtime before the skill works. Operators jump from Claude Code to a custom orchestration framework before the skill has proven itself. The skill is the asset. The runtime is the host. Iterate the skill until it converges, then move the runtime.
What to do this week
Pick one workflow you would run by hand. Open a fresh markdown file. Write the Purpose, Inputs, Steps, Tool calls, Guardrails, and Outputs sections. Pick a runtime (Claude Code for the first one is the right call). Wire two data MCPs and two action MCPs. Add a cost cap, a rate cap, and an eval set of 30 real cases. Run the skill manually until the output is right. Then schedule it.
The shortcut is to start from a skill that already runs and rewrite it for your playbook. The Unipile campaign skill is the closest reference for a LinkedIn outreach agent. Clone it, change the inputs, change the rationale prompt, change the routing logic. Ship in a week.
That is what it takes to build a GTM agent in 2026. Not a graph of nodes. Not a vendor canvas. One markdown skill, four MCP connections, and a runtime that runs on the laptop you already have.