MPIsaac Ventures

Claude Code MCP: What Operators Actually Need to Know (2026)

Michael Isaac
Operator. 30 yrs in enterprise AI. 18 min read

I started using Model Context Protocol in Claude Code the week Anthropic shipped the spec in late 2024. By mid-2025 it was my default for any serious agent work. As of 2026-04-28 on macOS 14.6, I personally verified MCP end-to-end in Claude Code 2.0.18 (stdio + streamable HTTP, OAuth 2.1 per modelcontextprotocol.io/specification/2025-11-25/basic/authorization, fetched 2026-04-28) and Cursor 0.45 (stdio, per docs.cursor.com/context/mcp, fetched 2026-04-28); ChatGPT's connector stays beta and plan-gated (help.openai.com/en/articles/12584461, fetched 2026-04-28). Most writing on MCP is still introductory. This page is what I wish I'd had when figuring out which servers to actually run.

TL;DR

Use MCP in Claude Code when the same tools need to work across multiple agents (Claude Code, Cursor, ChatGPT desktop) or when a vendor has shipped an official server worth using rather than wrapping by hand. Skip MCP for one-off scripts: a custom slash command or hook is faster to write, faster to load, and uses zero tokens until invoked.

The operational cost most operators get wrong is context window pressure, but the shape of that cost has shifted. As of the current Claude Code MCP docs (code.claude.com/docs/en/mcp#scale-with-mcp-tool-search, fetched 2026-04-28), Tool Search is the default loading strategy: instead of dumping every connected server's full tool list into the system prompt at session start, the host advertises a search index and the model pulls specific tools on demand. What still costs context: per-server name and short description metadata, any tool the model has already pulled in the current session (it stays loaded), tools marked alwaysLoad, and projects that flip ENABLE_TOOL_SEARCH off. Some non-Anthropic models and proxy setups also do not honor deferred loading.

Anecdotally, on one specific local setup I measured on 2026-04-28 (Claude Code 2.0.18 against claude-opus-4-7, five servers configured: github at v0.6.0 with GITHUB_READ_ONLY=1 and GITHUB_TOOLSETS=repos,pull_requests,issues, filesystem 2025.11.0 scoped to one directory, linear over HTTP+OAuth, sentry over HTTP+OAuth, plus a 4-tool custom Python server), the system prompt captured via claude --debug > debug.log at session start tokenized to roughly 4.1K MCP-attributable tokens with Tool Search on, and roughly 19K with ENABLE_TOOL_SEARCH=0. Token counts came from feeding the extracted MCP block to tiktoken with the cl100k_base encoding. The repro fixture lives in the companion repo at github.com/MPIsaac-Per/agentinfra-examples under mcp/token-overhead/. Treat the figures as a calibration point, not a benchmark. The only honest way to know what a given project pays is to repeat the procedure: run claude --debug, capture the system prompt at session start, and count tokens against the model's tokenizer.
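
The counting step is scriptable. A minimal sketch, assuming you have already copied the MCP-attributable block out of debug.log by hand; it uses tiktoken's cl100k_base when installed and falls back to a crude chars/4 rule of thumb, which is an approximation, not the model's tokenizer:

```python
# Count tokens in a hand-extracted MCP block from a Claude Code debug log.
# The chars/4 fallback is a rough rule of thumb, not the real tokenizer.
def count_mcp_tokens(mcp_block: str) -> int:
    try:
        import tiktoken  # optional: pip install tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(mcp_block))
    except Exception:
        # Not installed, or no cached encoding available: fall back to the
        # common ~4-characters-per-token approximation for English prose.
        return max(1, len(mcp_block) // 4)

block = "mcp__github__list_prs: List open pull requests for a repository."
print(count_mcp_tokens(block))
```

Run it once per project against the real extracted block; the number is only meaningful for your server mix.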

Pin server versions, audit source for any server that touches the filesystem, and prefer stdio transport for local work and HTTP+OAuth for hosted multi-user setups. Per the installation scopes docs (code.claude.com/docs/en/mcp#mcp-installation-scopes, fetched 2026-04-28), local and user scoped server entries live under the relevant project key in ~/.claude.json, project scoped entries live in a committed .mcp.json at the project root, and enterprise managed configuration is loaded separately. There is no ~/.claude/mcp.json file in current Claude Code. Project scope is almost always the right home for servers tied to a specific repo.

The wire-format and authorization layers clear a high bar: JSON-RPC framing and capability negotiation are stable across the hosts I tested, and the 2025-11-25 OAuth 2.1 profile handles client registration with a sensible tier of options rather than mandating Dynamic Client Registration. The ecosystem around those layers does not clear the same bar on four axes: server quality varies wildly between vendor-official, reference, and community servers; observability has no standard trace-context propagation and no structured error taxonomy; client feature parity is per-feature, not blanket; and per-tool versioning is absent, so a server can change a tool's input schema between calls and the client has no way to detect it.

The mental model

Model Context Protocol is a JSON-RPC schema for an LLM client (Claude Code, ChatGPT desktop, Cursor, etc.) to discover and call capabilities exposed by a server. The spec is published at modelcontextprotocol.io and reference implementations live in TypeScript and Python under the modelcontextprotocol GitHub org.

The protocol defines three primitives:

  1. Tools are model-callable functions. The server advertises a name, description, and JSON Schema input. The model decides when to call them.
  2. Resources are addressable read-only data the server can expose: files, database rows, API responses.
  3. Prompts are server-defined templates the user (not the model) can invoke. In Claude Code these surface as slash commands shaped /mcp__servername__promptname. Per the Claude Code MCP docs at code.claude.com/docs/en/mcp#use-mcp-prompts-as-commands (fetched 2026-04-28), arguments are passed positionally, for example /mcp__github__list_prs owner repo.

Transport is either stdio (the client spawns the server as a subprocess) or streamable HTTP with optional OAuth 2.1. The 2025-03-26 revision deprecated the older HTTP+SSE transport in favor of streamable HTTP.

What MCP is not: it is not a runtime, not an agent framework, not a function-calling replacement. The model still decides when to call a tool. MCP just standardizes how the tool gets registered and how the call gets routed.
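
For concreteness, the framing for a single tool call looks like this. The field names follow the spec's tools/call method; the tool name and arguments are invented:

```python
import json

# Illustrative JSON-RPC 2.0 framing for an MCP tool call. The method name,
# params shape, and result content blocks follow the MCP spec; the specific
# tool and its arguments are made up.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_user_status",
        "arguments": {"user_id": "123e4567-e89b-12d3-a456-426614174000"},
    },
}

# A successful result carries content blocks, which is what the model sees.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": '{"status": "active"}'}],
        "isError": False,
    },
}

print(json.dumps(request))
```

The host handles the framing; server authors only ever see the params and produce the result.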

The landscape

As of 2026-04-28, the MCP server ecosystem splits five ways:

  • Anthropic-published reference servers in modelcontextprotocol/servers: filesystem, git, github, postgres, sqlite, slack, sentry, brave-search, fetch, puppeteer, memory. Filesystem and git have the longest release history and highest issue-close rate, and I have relied on them across projects since late 2024 once pinned to a patched version.

  • Vendor-official servers shipped by the underlying SaaS. The subset I have personally connected and exercised at least one tool call against on 2026-04-28:

    Vendor     | Source                                      | Transport tested  | Auth mode
    GitHub     | github.com/github/github-mcp-server         | stdio (container) | PAT, GitHub App
    Linear     | github.com/linear/linear-mcp                | HTTP              | OAuth 2.1
    Cloudflare | github.com/cloudflare/mcp-server-cloudflare | stdio             | API token
    Stripe     | github.com/stripe/agent-toolkit (mcp/)      | stdio             | restricted API key
    Sentry     | github.com/getsentry/sentry-mcp             | HTTP              | OAuth 2.1
    Vercel     | vercel.com/docs/mcp/vercel-mcp              | HTTP              | OAuth 2.1

    "Official" reduces one risk (the integration tracks the vendor's own API and ships current auth flows). It is not a synonym for "trustworthy by default." Concrete criteria before adopting: who maintains it, the auth model (OAuth 2.1 with scoped tokens versus a long-lived API key), whether scopes are granular enough for least-privilege grants, whether a read-only mode exists, the release cadence, server-side audit logs, and data-handling posture.

  • Community servers in awesome-mcp-servers and on npm/PyPI. Quality is wildly variable. Treat as "read the source before running."

  • Hosted MCP gateways like Smithery and Composio that expose a wide catalog behind one HTTP endpoint. Convenient, but you're trusting their gateway with every tool call's payload.

  • Your own servers, written using the MCP SDKs. Per modelcontextprotocol.io/sdk (fetched 2026-04-28), TypeScript and Python are first-party. Kotlin, Java, C#, and Swift ship as partner-maintained official SDKs. Go, Rust, and Ruby are listed with varying maturity.

A "production-grade" bar before depending on any server: tagged releases within 90 days or an explicit maintenance statement, scoped auth, bounded output (pagination plus a documented size cap), structured error categories, audit logging of tool calls, and an automated test suite. Any server missing three is a "wrap it yourself in an afternoon" candidate.

For homegrown servers: write your own when the integration is an internal API or anything touching sensitive data. Skip the homegrown route when the target is commodity SaaS and a vendor-official server clears the production-grade bar.

Claude Code reads MCP server configuration with three scopes: local (private to user/project, in ~/.claude.json, default), project (committed .mcp.json at project root), and user (private, all projects, also in ~/.claude.json). When a server name appears in multiple scopes, local wins over project, and project wins over user. CLI syntax: claude mcp add my-server --scope project -- npx -y some-mcp-package --flag value. HTTP and SSE servers use --transport http or --transport sse with a URL.
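
The precedence rule reduces to a three-way merge. A sketch, assuming the scope entries have already been parsed into plain dicts (the real files carry more structure than this):

```python
# Resolve MCP server entries across Claude Code's three scopes.
# Precedence per the docs: local > project > user.
def resolve_servers(local: dict, project: dict, user: dict) -> dict:
    merged = dict(user)      # lowest precedence first
    merged.update(project)   # project overrides user
    merged.update(local)     # local overrides both
    return merged

resolved = resolve_servers(
    local={"github": {"note": "local override"}},
    project={"github": {"note": "project"}, "fs": {"note": "project"}},
    user={"linear": {"note": "user"}},
)
print(sorted(resolved))  # ['fs', 'github', 'linear']
```

The practical consequence: a local entry can shadow a committed project entry, which is useful for testing a newer server version without touching .mcp.json.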

What actually matters operationally

Integration count is a weak proxy for operational fit. Four dimensions decide whether an MCP setup helps or hurts.

Token weight at startup. Tool Search defers MCP tools by default. What still costs context: server names and short descriptions the search ranker has to surface, any tool already discovered in the current session, tools marked alwaysLoad, and projects with ENABLE_TOOL_SEARCH off. There is also a per-tool description truncation in the index, which means a server that crams critical usage notes into the long tail of its description gets them silently clipped. Per-project .mcp.json scoping beats user-global ~/.claude.json for any server not used everywhere.

Authentication surface. Stdio servers inherit the launching shell's environment, which means anything in ~/.zshrc or a keychain export is fair game. HTTP servers with OAuth 2.1 handle this more cleanly. Per the 2025-11-25 authorization page (modelcontextprotocol.io/specification/2025-11-25/basic/authorization#client-registration-approaches, fetched 2026-04-28), clients support, in order: pre-registered client information, Client ID Metadata Documents, Dynamic Client Registration (RFC 7591), and manually entered client information. DCR is one mechanism, not a mandate. Many community servers sidestep this by reading a .env file in the server's working directory, which is fine for local single-user setups and a footgun the moment .mcp.json gets committed.

Failure modes during a tool call. At the protocol layer, MCP defines no application-level retry, no circuit breaker, no rate-limit backoff: a 429 from upstream surfaces to the model as a free-form error string. At the host layer, current Claude Code does perform automatic reconnection with exponential backoff for HTTP and SSE transport disconnects (code.claude.com/docs/en/mcp#automatic-reconnection, fetched 2026-04-28). Connection-level resilience is the host's job. Application-level retries, idempotency, error categorization, and rate-limit backoff are the server's job.
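
Since application-level retries are the server's job, I wrap upstream calls in something like the following. The backoff parameters are arbitrary and the helper is mine, not anything an SDK ships:

```python
import random
import time

# Retry an upstream call on rate limiting with exponential backoff + jitter.
# The server, not the protocol or the host, owns this behavior.
class RateLimited(Exception):
    pass

def call_with_backoff(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # surface a categorized error instead of looping forever
            # Full jitter: sleep anywhere in [0, base * 2^attempt] seconds.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited("429 from upstream")
    return "ok"

print(call_with_backoff(flaky, sleep=lambda s: None))  # "ok" on the third try
```

Whatever you raise after exhausting retries should be a categorized, machine-readable error, because the free-form string is all the model gets.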

Observability and output sizing. Stdio servers log to stderr by default, which Claude Code captures but does not surface unless invoked with --debug. HTTP servers can emit OpenTelemetry traces, but only if the server author instruments them; there is no standard MCP-level trace context propagation in the 2025-11-25 spec. Claude Code now caps MCP tool output and warns above a threshold (code.claude.com/docs/en/mcp#mcp-output-limits-and-warnings, fetched 2026-04-28), with the cap configurable via MAX_MCP_OUTPUT_TOKENS. Output above the cap is truncated, the model sees a warning rather than the full payload, and the operator-visible symptom is "the agent silently lost half the result." Server authors can mitigate by paginating and by setting the anthropic/maxResultSizeChars annotation. Budget MCP output explicitly per project.
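
The mitigation is mechanical: page the result and hand the model a cursor instead of letting the host clip the payload. A sketch; the helper and the character budget are illustrative, not part of any SDK:

```python
import json

# Keep a tool's result under a character budget by returning a page plus a
# cursor, instead of letting the host truncate silently. The budget value
# and this helper are illustrative only.
def paginate(items, cursor=0, char_budget=2000):
    page, used = [], 0
    for i in range(cursor, len(items)):
        encoded = json.dumps(items[i])
        if used + len(encoded) > char_budget and page:
            return {"items": page, "next_cursor": i}
        page.append(items[i])
        used += len(encoded)
    return {"items": page, "next_cursor": None}

rows = [{"id": n, "body": "x" * 200} for n in range(50)]
first = paginate(rows)
print(len(first["items"]), first["next_cursor"])
```

The model can keep calling with next_cursor until it is null, which turns "silently lost half the result" into an explicit, model-visible continuation.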

Detailed teardowns

The official GitHub MCP server

Source at github.com/github/github-mcp-server (verified 2026-04-28). Exposes the GitHub REST and GraphQL APIs as MCP tools.

Architecture: single Go binary, runs as stdio or HTTP. Authenticates via PAT or GitHub App. GitHub also operates a hosted remote variant at api.githubcopilot.com/mcp, which routes every tool call through GitHub's hosted infrastructure as an additional sub-processor.

Cost and ops: free; cost is GitHub API quota plus Claude tokens. The full surface is large (60+ tools), but the server ships first-class knobs for trimming:

  • GITHUB_TOOLSETS selects logical groupings (repos, issues, pull_requests, actions, etc.). Default loads most.
  • GITHUB_TOOLS is the finer-grained allowlist for individual tools.
  • GITHUB_READ_ONLY=1 strips every mutating tool from the advertised list.
  • Dynamic tool discovery (--dynamic-toolsets) flips the model from eager to "advertise a small index and request specific toolsets on demand." Still labelled experimental.

I run GITHUB_READ_ONLY=1 plus a narrow GITHUB_TOOLSETS list for any project where the agent is reviewing rather than authoring.

When it's the right call: PR review, issue triage, or repo automation regularly from inside Claude Code, especially with the toolset and read-only knobs.

When it's the wrong call: occasional GitHub work. A slash command shelling out to gh uses zero tokens until invoked.

The reference filesystem server

@modelcontextprotocol/server-filesystem exposes read, write, list, search, and edit operations on directories whitelisted at launch.

Architecture: tiny Node.js binary, stdio only. Whitelisted root paths are passed as positional arguments. Newer versions also support the MCP Roots capability for dynamic workspace roots.

Cost and ops: cheapest server in the ecosystem. The interesting question is whether to run it at all in Claude Code, since Claude Code already has Read, Write, Edit, and Glob built in. The answer in 2026 is mostly no for Claude Code, yes for other clients (Cursor, ChatGPT desktop).

Version hygiene is not optional. Versions prior to the patched 2025.7.1 release shipped a path-validation bug where symlink targets and prefix-matching edge cases let the server read or write outside whitelisted roots (CVE-2025-53110, see advisories.gitlab.com/npm/%40modelcontextprotocol/server-filesystem/CVE-2025-53110/, fetched 2026-04-28). Pin to a patched version explicitly, never latest. Any local copy not updated since mid-2025 should be treated as vulnerable and audited if it had access to sensitive paths.

When it's the right call: any non-Claude-Code client that needs filesystem access. Run inside a container with the working directory bind-mounted read-only where possible.

When it's the wrong call: in Claude Code, by default. Native tools are tighter and faster, and they produce cleaner diffs.

Smithery (hosted gateway)

One HTTP endpoint, broad catalog of integrations, OAuth handled centrally.

Architecture: connect Claude Code to Smithery's HTTP endpoint with an API key. Smithery proxies tool calls to upstream services and handles auth.

Cost and ops: tiered pricing with a free tier. Live pricing at smithery.ai/pricing is the only source worth treating as current.

When it's the right call: small teams that want Notion, Linear, Slack, and a handful of others without operating five separate servers.

When it's the wrong call: any tool call payload that includes data you wouldn't paste into a third-party form. Every call routes through Smithery as a sub-processor in addition to the upstream vendor and Anthropic.

A custom internal server

Writing your own MCP server with the Python or TypeScript SDK takes about an afternoon for a useful one.

Architecture: a function decorated with @mcp.tool() (Python) or registered via server.tool() (TS). Run as stdio or HTTP.

Cost and ops: your time to write and maintain. A stdio server has no hosting bill but consumes local CPU, RAM, and a process slot, and tool descriptions and outputs cost Claude context tokens.

When it's the right call: any internal tool, dataset, or workflow your team uses repeatedly. A 200-line MCP server wrapping three internal APIs is more valuable than 10 third-party servers you half-use.

When it's the wrong call: when an official server already exists and is well-maintained.

The standards layer

The MCP spec lives at spec.modelcontextprotocol.io, versioned by date (latest stable as of writing: 2025-11-25). The spec covers JSON-RPC framing, transport, capability negotiation, the tools/resources/prompts primitives, and an OAuth 2.1 authorization profile.

What the spec covers well: wire format, capability discovery, resource subscription, and the recently formalized authorization model. The OAuth 2.1 profile supports several client-registration approaches rather than mandating one. That tiered approach lets a one-person setup register dynamically, lets a hosted server publish metadata documents, and still lets enterprise IDPs that disable DCR ship statically issued credentials.

What the spec does not yet solve: trace context propagation, structured error taxonomy (errors are free-form strings), rate limiting, tool versioning, and sandboxing.

Adoption is broad on paper, but interoperability is per-feature. Verified end-to-end as of 2026-04-28:

Host | Stdio | Streamable HTTP | OAuth 2.1 | Prompts as commands | Resources | Tool Search / deferred load
Claude Code | verified | verified | verified | verified (/mcp__server__prompt) | verified | verified (default-on)
Cursor | verified | verified for SSE/HTTP | partial; some flows require manual client ID | not surfaced as slash commands | limited | host-specific eager load
ChatGPT desktop / connectors | not applicable | beta, plan-gated | provider-managed | not surfaced | not surfaced | not exposed
Gemini CLI | verified | not personally tested | not personally tested | not surfaced | not personally tested | host-specific
VS Code Copilot Chat | verified | verified | verified | partial | partial | host-specific

A stdio server with cleanly-described tools and no resource or prompt surface is the closest thing to write-once-run-anywhere across these hosts. The moment a server leans on prompts-as-commands, resource subscriptions, or assumes the host will defer tool loading, behavior diverges. Plan for the subset, test the combinations a project actually depends on.

Things nobody talks about

Deferred loading is a session-start optimization, not a session-long one. Tool Search shrinks the injected token count at session start, but every tool the model actually pulls stays loaded for the rest of that session. Treat the savings as "cheaper to open a fresh session" rather than "cheaper to leave one running all day," and recycle sessions when the budget gets tight. Scope MCP servers per-project in .mcp.json rather than user-globally, and remove servers nobody actually invokes.

Stdio servers inherit your shell environment. When Claude Code spawns an MCP server as a subprocess, that subprocess gets the full environment: API keys exported in .zshrc, AWS credentials cached by the SDK, every secret in the shell. A malicious or compromised MCP server has the same access to local secrets that the launching user does. "Official" or "reference" is not a synonym for "safe": the reference filesystem server shipped a path-traversal class of vulnerability that was patched in a later release. Pin every server to a specific patched version, not latest. Read the source before adding a community server. For anything that touches the filesystem or network, run it in a container with explicit environment whitelisting.

MCP server tool schemas can break Claude's reasoning if they're sloppy. The model treats tool descriptions as instructions. A server with vague descriptions ("does stuff with the database") gets called wrong. A server with descriptions that contradict each other across tools ("get_user vs fetch_user vs read_user") makes the model thrash. Tool Search makes this worse, not better, because the search ranker is matching on those same descriptions. Good servers have one tool per intent with a sharp description; bad servers expose every API endpoint as a separate tool with copy-paste descriptions.
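
The get_user/fetch_user/read_user pattern is easy to lint for before a server ever reaches a model. A toy check; the synonym list and heuristic are mine:

```python
from collections import defaultdict

# Toy lint: flag tool sets where several near-synonym verbs map to the same
# noun (get_user / fetch_user / read_user), the thrash pattern described
# above. The verb list and verb_noun split are illustrative only.
SYNONYM_VERBS = {"get", "fetch", "read", "retrieve", "load"}

def find_thrashing_tools(tool_names):
    by_noun = defaultdict(set)
    for name in tool_names:
        verb, _, noun = name.partition("_")
        if verb in SYNONYM_VERBS and noun:
            by_noun[noun].add(name)
    return {noun: sorted(names) for noun, names in by_noun.items() if len(names) > 1}

print(find_thrashing_tools(
    ["get_user", "fetch_user", "read_user", "create_user", "get_invoice"]
))  # {'user': ['fetch_user', 'get_user', 'read_user']}
```

Run it against a server's tools/list output before connecting the server; collapsing flagged groups into one tool per intent is usually a small server-side change.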

OAuth 2.1 servers expire tokens and Claude Code does not always re-prompt cleanly. When an HTTP MCP server's access token expires mid-session, the tool call fails with a 401, the model gets a generic error, and Claude Code typically does not auto-trigger a re-auth flow. The fix is claude mcp restart <name> or removing and re-adding the server. Document the re-auth procedure before it is needed.

The wrong move here is reaching for a long-lived API key to dodge re-auth pain. That trades a noisy failure for a persistent credential with broader blast radius. For unattended or shared use: a dedicated service account with the narrowest scopes, short-lived access tokens with refresh-token rotation, workload identity (OIDC federation from CI, SPIFFE/SPIRE for self-hosted), secrets injected at process start from a secret manager rather than committed to .env or .mcp.json, and a written revocation procedure. Long-lived API keys are a last resort.

Lock-in is asymmetric: portable for tool authors, sticky for tool consumers. If you write an MCP server, every major client supports it. If a Claude Code project depends on 10 specific MCP servers, switching to Cursor requires verifying each server still works, that auth flows are identical, and that the new client surfaces tool descriptions the same way. Keep the MCP server list small.

Implementation patterns

Pattern 1: project-scoped .mcp.json with pinned versions

{
  "mcpServers": {
    "github": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server:v0.6.0"
      ],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PAT}"
      }
    },
    "fs": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/[email protected]",
        "${PROJECT_DATA_DIR:-./data}"
      ]
    }
  }
}

Pin versions explicitly; never use latest. Pass the GitHub PAT through an env var rather than inlining it. Claude Code's .mcp.json only documents ${VAR} and ${VAR:-default} expansion (see code.claude.com/docs/en/mcp#environment-variable-expansion-in-mcpjson), so VS Code-style placeholders like ${workspaceFolder} will silently fail. Restrict the filesystem server to a subdirectory rather than the project root.
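
To make the two documented forms concrete, here is a re-implementation of the expansion for illustration only; how Claude Code itself treats an unset ${VAR} with no default is not something I have verified, so this sketch leaves it untouched:

```python
import os
import re

# Illustrative re-implementation of the two expansion forms the Claude Code
# docs list for .mcp.json: ${VAR} and ${VAR:-default}. Not Claude Code's
# actual code; behavior for unset ${VAR} with no default is an assumption.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def expand(value: str, env=os.environ) -> str:
    def repl(m):
        name, default = m.group(1), m.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        return m.group(0)  # leave unknown ${VAR} untouched
    return _PATTERN.sub(repl, value)

print(expand("${PROJECT_DATA_DIR:-./data}", env={}))         # ./data
print(expand("${GITHUB_PAT}", env={"GITHUB_PAT": "ghp_x"}))  # ghp_x
```

The point of the exercise: anything outside these two shapes, like ${workspaceFolder}, never matches and passes through as a literal string, which is exactly the silent failure mode to watch for.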

Pattern 2: minimal custom Python MCP server

import os
import uuid

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-tools")

@mcp.tool()
async def get_user_status(user_id: str) -> dict:
    """Look up a user's current account status by ID.

    Args:
        user_id: The internal user UUID (not email).

    Returns:
        Dict with keys: status (active|suspended|deleted),
        plan (free|pro|enterprise), created_at (ISO 8601).
    """
    try:
        uuid.UUID(user_id)
    except ValueError:
        raise ValueError("user_id must be a valid UUID")

    async with httpx.AsyncClient(timeout=10.0) as client:
        r = await client.get(
            f"https://internal.example.com/users/{user_id}",
            headers={"Authorization": f"Bearer {os.environ['INTERNAL_API_TOKEN']}"},
        )
        r.raise_for_status()
        return r.json()

if __name__ == "__main__":
    mcp.run()

The docstring becomes the tool description the model sees. Sharp, specific docstrings produce better tool selection. The uuid.UUID(user_id) check rejects malformed IDs before they get interpolated into the URL path. Pin mcp and httpx. Working examples at github.com/MPIsaac-Per/agentinfra-examples.

Pattern 3: minimal streamable HTTP skeleton (auth not included)

The TypeScript SDK ships a StreamableHTTPServerTransport. The skeleton, adapted from github.com/modelcontextprotocol/typescript-sdk/blob/main/docs/server.md (verified 2026-04-28):

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";
import { z } from "zod";

const server = new McpServer(
  { name: "team-tools", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

server.tool(
  "search_internal_docs",
  "Search the team knowledge base. Returns top 5 results.",
  { query: z.string() },
  async ({ query }, { authInfo }) => {
    if (!authInfo?.subject) {
      throw new Error("unauthenticated");
    }
    const results = await searchDocs(query, {
      actor: authInfo.subject,
      scopes: authInfo.scopes ?? [],
      tenantId: authInfo.tenantId,
    });
    return { content: [{ type: "text", text: JSON.stringify(results) }] };
  }
);

const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined,
});
await server.connect(transport);

const app = express();
app.use(express.json());

app.post("/mcp", requireBearerToken, enforceOriginAllowlist, rateLimitPerSubject, async (req, res) => {
  await transport.handleRequest(req, res, req.body);
});

app.listen(3000);

The skeleton is stateless. zod is a peer dependency. express.json() is required. Pin the SDK version; the transport API moved between SSE and streamable HTTP revisions.

What is deliberately NOT in this snippet, and why it is not safe to deploy as-is for multi-user use:

  1. OAuth 2.1 token validation. requireBearerToken is a placeholder. A real implementation validates the incoming JWT against the authorization server's JWKS, checks iss, aud, exp, and populates authInfo (subject, scopes, tenant).
  2. Per-user authorization context propagated into the tool. searchDocs takes actor, scopes, and tenantId so the tenant boundary is enforced inside the search call. A tool that ignores authInfo and runs as a single privileged service account is the most common multi-tenant data-leak shape I have seen.
  3. Origin allowlist and CSRF posture. MCP HTTP servers reachable from a developer's machine are reachable from any page that machine loads.
  4. Rate limiting per authenticated subject, not per IP.
  5. Audit logging of tool calls. Subject, tool name, argument hash, latency, outcome.

The skeleton on its own is appropriate only for single-user localhost binding (app.listen(3000, "127.0.0.1")) during development. Anything reachable from another machine needs the full stack.
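
Point 1 in Python for brevity (the skeleton above is TypeScript): signature verification must happen first against the authorization server's JWKS via a JOSE library, which this sketch deliberately omits; it shows only the claim checks and the authInfo shape:

```python
import time

# Claims-check sketch for point 1 above. Real deployments must FIRST verify
# the JWT signature against the authorization server's JWKS (use a JOSE
# library); this only covers the iss/aud/exp checks on decoded claims.
def check_claims(claims: dict, *, issuer: str, audience: str, now=None) -> dict:
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise PermissionError("wrong issuer")
    aud = claims.get("aud")
    auds = aud if isinstance(aud, list) else [aud]
    if audience not in auds:
        raise PermissionError("token not intended for this resource")
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    # Populate the authInfo shape the tool handler expects.
    return {"subject": claims["sub"], "scopes": claims.get("scope", "").split()}

info = check_claims(
    {"iss": "https://as.example.com", "aud": "https://mcp.example.com",
     "exp": 2_000_000_000, "sub": "user-123", "scope": "docs:read"},
    issuer="https://as.example.com",
    audience="https://mcp.example.com",
)
print(info["subject"], info["scopes"])  # user-123 ['docs:read']
```

The audience check is the one most often skipped and the one that stops a token minted for one MCP server from being replayed against another.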

Decision framework

The honest answer to "should I use MCP for this?" depends on three questions in order.

1. Is the work shared across multiple agent clients or just Claude Code?

  • Multiple clients: MCP is the right choice. Portability pays.
  • Just Claude Code: prefer a custom slash command or hook for one-off scripts. Use MCP only when the tool needs to stay loaded across the session and be model-callable.

2. Does an official vendor-published server exist?

  • Yes, well-maintained: use it. Pin the version.
  • Yes, but stale or buggy: write your own.
  • No, but a community server exists: read the source, audit the auth handling, pin a version. Treat it as code you're running with shell privileges.
  • No: write your own.

3. Is the integration cost worth it for the frequency of use?

  • Daily use, central to your workflow: yes.
  • Weekly or occasional: probably not as a global server. Project-scope it or write a slash command.
  • One-off scripts: never.

Where the protocol is going: the 2025-11 spec revision settled OAuth and HTTP transport, which were the biggest gaps. The next round needs tool versioning, trace context, and a real error taxonomy.

What I would bet on: MCP becomes the equivalent of LSP for AI agents. Not glamorous, mostly invisible, the thing every serious tool eventually speaks. The agents that win are the ones with the cleanest MCP host implementations. Claude Code's MCP host is currently among the best, which is the real reason to invest in this stack now.

What I would avoid: building a deep dependency on any single hosted MCP gateway. The portability story is at the protocol level, not the gateway level. Run your own servers, or use vendor-official ones.