In November 2024, Anthropic shipped an open-source spec called the Model Context Protocol. The original announcement framed it as a small thing: a way to standardize how Claude connects to local tools and data sources. Eighteen months later it has become the de facto integration layer between LLMs and the rest of software. Every major lab supports it. Cursor, Replit, VS Code Copilot, and ChatGPT consume from the same server registry. There are over 10,000 public MCP servers. SDK downloads have grown 4,750% to 97 million per month.
If you build LLM applications in 2026 and you have not yet picked up MCP, this article is the catch-up. Below is the architecture, the spec, the server and client features, how to actually build an MCP server in your stack, the production considerations, the security model, and what the protocol does and does not solve.
The Short Version: What MCP Actually Is
MCP is a client-server protocol. The client is whatever wraps the LLM — Claude Desktop, ChatGPT, Cursor, your custom agent. The server exposes tools, resources, or prompts. They communicate over JSON-RPC 2.0 using one of two transports: STDIO for local servers, HTTP (with Server-Sent Events for streaming) for remote ones.
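To make the wire format concrete, here is a minimal sketch of the JSON-RPC 2.0 envelopes behind a single tool call. The `tools/call` method and the `content` array in the result follow the MCP spec; the types are simplified, and the tool name `search_jira` with its `query` argument is made up for illustration.

```typescript
// JSON-RPC 2.0 envelopes as used by MCP. Types are simplified for illustration.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
}

let nextId = 1;

// Build a request with a fresh id so the response can be matched to it.
function makeRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method, ...(params ? { params } : {}) };
}

// A response answers a request when the ids match.
function answers(req: JsonRpcRequest, res: JsonRpcResponse): boolean {
  return res.id === req.id;
}

// Hypothetical tool name and arguments.
const callRequest = makeRequest("tools/call", {
  name: "search_jira",
  arguments: { query: "login bug" },
});

// What a server might send back for that call.
const callResponse: JsonRpcResponse = {
  jsonrpc: "2.0",
  id: callRequest.id,
  result: { content: [{ type: "text", text: "Found 3 matching issues" }] },
};

console.log(JSON.stringify(callRequest));
console.log(answers(callRequest, callResponse));
```

The same envelope carries every MCP method; only `method` and `params` change.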
The thing to understand: MCP is not a new agent framework. It does not replace LangChain, LlamaIndex, or any of the orchestration libraries. It does not run your prompts. It does not host your LLM. What it does is standardize the integration layer — the boundary between an LLM-driven agent and the outside world. Before MCP, every LLM application had to build its own glue code for every external system: GitHub, Slack, Postgres, Linear, your file system. After MCP, you build the server once and any MCP-compatible client can use it.
That's why the adoption curve looks the way it does. It's a protocol-level network effect.
A Brief Timeline (Nov 2024 – May 2026)
- Nov 2024: Anthropic open-sources MCP alongside Claude Desktop. Initial spec covers STDIO transport, tools, resources, and prompts.
- Mar 2025: OpenAI officially adopts MCP across the ChatGPT desktop app and the OpenAI Agents SDK. Cross-vendor support becomes real.
- Apr 2025: Google DeepMind announces MCP support for Gemini. By summer 2025, every major lab is on the protocol.
- Nov 2025: Spec v2025-11-25 ships. By this revision the spec includes Streamable HTTP transport (introduced in the 2025-03-26 revision), elicitation (introduced in 2025-06-18), server-initiated sampling, and richer schema validation.
- Dec 2025: Anthropic donates MCP to the Agentic AI Foundation (a Linux Foundation directed fund co-founded by Anthropic, Block, and OpenAI). Governance moves to a vendor-neutral body.
- Jan 2026: MCP Apps ships. Tools can now return interactive UI components (forms, charts, dashboards, buttons) that render directly inside the conversation. MCP is no longer just a tool-calling protocol — it's a UI protocol for agents.
- May 2026: Over 10K public servers, 97M monthly SDK downloads, and MCP is the default integration layer for nearly every new agent built in production.
Architecture: Clients, Servers, Transports
The protocol has three actors and two transports. Get this picture clear and the rest of the spec is easy.
Hosts are user-facing applications. Claude Desktop. ChatGPT. Cursor. Replit. Your custom agent. The host owns the LLM context and the user-experience layer.
Clients are MCP-aware components inside the host. A host typically runs one client per connected server. The client handles message serialization, lifecycle (initialize, list capabilities, shut down), and routes server features into the host's runtime.
Servers expose capabilities (tools, resources, prompts) to clients. Servers can be local processes (spawned by the host) or remote services accessed over HTTP. A single server can expose any combination of features.
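The client lifecycle described above opens with a fixed handshake. The sketch below lists the first messages a client would send, assuming the method names from the MCP spec (`initialize`, `notifications/initialized`, `tools/list`); the client name, version, and declared capabilities are placeholders.

```typescript
// Sketch of the messages a client sends at the start of a session.
type Message =
  | { jsonrpc: "2.0"; id: number; method: string; params?: object } // request
  | { jsonrpc: "2.0"; method: string; params?: object }; // notification (no id)

function openingMessages(clientName: string, clientVersion: string): Message[] {
  return [
    // 1. Request: propose a protocol version and declare client capabilities.
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2025-11-25",
        capabilities: { roots: { listChanged: true }, sampling: {} },
        clientInfo: { name: clientName, version: clientVersion },
      },
    },
    // 2. Notification: sent once the server's initialize response has arrived.
    { jsonrpc: "2.0", method: "notifications/initialized" },
    // 3. Request: discover what the server offers.
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
}

const messages = openingMessages("example-host", "0.1.0");
for (const m of messages) console.log(m.method);
```

In a real client the second and third messages wait for the server's `initialize` response, which carries the negotiated protocol version and the server's own capabilities.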
Transports are how clients and servers talk:
- STDIO: The server is a child process. The client writes JSON-RPC messages to its stdin and reads responses from stdout. Used for local servers and developer tooling. Lowest-latency, simplest to develop against.
- HTTP (with SSE for streaming): The server is a remote service. The client sends JSON-RPC messages over HTTP POSTs and reads streaming responses over Server-Sent Events. Used for hosted servers, multi-tenant deployments, and any case where you can't run a local process.
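On the STDIO transport, each JSON-RPC message travels as a single line of JSON terminated by a newline. The sketch below shows one way a client might frame outgoing messages and reassemble incoming ones from arbitrary chunks; the `LineDecoder` class and `encodeMessage` helper are illustrative names, not part of any SDK.

```typescript
// Newline-delimited JSON framing, as used on the STDIO transport.
function encodeMessage(message: object): string {
  // JSON.stringify without indentation never emits raw newlines,
  // so one message always occupies exactly one line.
  return JSON.stringify(message) + "\n";
}

class LineDecoder {
  private buffer = "";

  // Feed a chunk read from the server's stdout; returns every message completed by it.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line for later
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  }
}

const decoder = new LineDecoder();
const wire =
  encodeMessage({ jsonrpc: "2.0", id: 1, method: "ping" }) +
  encodeMessage({ jsonrpc: "2.0", id: 2, method: "tools/list" });

// Simulate the stream arriving in two arbitrary chunks.
const first = decoder.push(wire.slice(0, 20));
const second = decoder.push(wire.slice(20));
console.log(first.length, second.length);
```

Because stdout is the message channel, a STDIO server should send its diagnostics to stderr; a stray log line on stdout would be parsed as a protocol message.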
Server Features: Tools, Resources, Prompts
A server exposes three core feature types. Most servers expose one or two of these; only the most complete servers expose all three.
- Tools: functions the LLM invokes. Action-shaped operations such as search_jira, send_slack_message, and create_pr. The LLM decides when to call; the server executes.
- Resources: data the LLM reads. Read-only content the server exposes by URI: documents, database rows, Notion pages. The host decides what to include in context.
- Prompts: reusable templates. Prompt templates the server provides for the user or LLM to invoke. Useful for canonical workflows: "summarize this bug ticket," "draft a PR description."
Each tool declares a JSON Schema for its inputs; resources are addressed by URI, and prompts declare named arguments. The client discovers them at session start with tools/list, resources/list, and prompts/list. The LLM then sees a structured catalog and can call into it.
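As a rough illustration of that catalog, the sketch below models one entry of a `tools/list` result and checks a proposed call against it. The `name`, `description`, and `inputSchema` fields follow the spec; the catalog contents and the `missingArguments` helper are hypothetical, and the check covers only required keys rather than full JSON Schema validation.

```typescript
// Shape of one entry in a tools/list result (simplified).
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: {
    type: "object";
    properties?: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

// Hypothetical catalog, as a server might return it from tools/list.
const catalog: ToolDefinition[] = [
  {
    name: "send_slack_message",
    description: "Post a message to a Slack channel",
    inputSchema: {
      type: "object",
      properties: { channel: { type: "string" }, text: { type: "string" } },
      required: ["channel", "text"],
    },
  },
];

// Returns the names of required arguments that are absent from a proposed call.
// A real client would run a full JSON Schema validator instead.
function missingArguments(tool: ToolDefinition, args: Record<string, unknown>): string[] {
  return (tool.inputSchema.required ?? []).filter((key) => !(key in args));
}

const tool = catalog.find((t) => t.name === "send_slack_message")!;
console.log(missingArguments(tool, { channel: "#eng" }));
```

Checking arguments before sending `tools/call` lets the host give the LLM a precise error without a round trip to the server.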
Client Features: Sampling, Roots, Elicitation
Less talked about, but important for non-trivial integrations. Clients can also expose features back to servers.
- Sampling. The server requests an LLM completion through the client. This lets a server-side workflow include an LLM call without the server needing its own LLM credentials. Useful for "summarize this customer record" inside a CRM server.
- Roots. The client tells the server which directories or URIs the user has authorized access to. A filesystem server can only read paths under approved roots. This is part of the security model — not a server-side decision.
- Elicitation. Added in the 2025-06-18 spec revision. The server prompts the user for additional input mid-conversation ("Which Jira project?", "Confirm deletion?"). The host renders the prompt, gets a response, and sends it back to the server.
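Roots only help if the server actually checks paths against them. Here is a minimal sketch of that check for a filesystem server, assuming roots arrive as `file://` URIs and the server runs on a POSIX system. The `isUnderRoots` helper is a hypothetical name, and it does not resolve symlinks, which a production server would also need to handle.

```typescript
import * as path from "node:path";
import { fileURLToPath } from "node:url";

// A root as the client reports it: a URI plus an optional display name.
interface Root {
  uri: string;
  name?: string;
}

// True when requestedPath resolves to a location inside one of the approved roots.
function isUnderRoots(requestedPath: string, roots: Root[]): boolean {
  const target = path.resolve(requestedPath); // collapses "." and ".." segments
  return roots.some((root) => {
    const base = fileURLToPath(root.uri);
    const rel = path.relative(base, target);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}

// Hypothetical approved root.
const roots: Root[] = [{ uri: "file:///home/alice/project", name: "project" }];

console.log(isUnderRoots("/home/alice/project/src/index.ts", roots));
console.log(isUnderRoots("/home/alice/project/../.ssh/id_rsa", roots));
```

Comparing the relative path, rather than testing for a string prefix, keeps a sibling directory such as `/home/alice/projectile` from passing as if it were inside `/home/alice/project`.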
Building Your First MCP Server (TypeScript)
The simplest path to a working server. Below is a TypeScript example that exposes a single tool, using the official @modelcontextprotocol/sdk.
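A minimal sketch of such a server follows, assuming `@modelcontextprotocol/sdk` and `zod` are installed and the project is set up for ES modules. The server name, the `count_words` tool, and the `countWords` helper are invented for illustration. Newer SDK releases also offer a `registerTool` method, so check the SDK README for the form your version prefers.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical tool logic, kept as a pure function so it is easy to test.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

const server = new McpServer({ name: "word-count-demo", version: "0.1.0" });

// Register one tool: a name, a description, an input schema, and a handler.
server.tool(
  "count_words",
  "Count the words in a piece of text",
  { text: z.string() },
  async ({ text }) => ({
    content: [{ type: "text", text: String(countWords(text)) }],
  }),
);

async function main(): Promise<void> {
  // STDIO transport: the host spawns this process and talks over stdin/stdout.
  await server.connect(new StdioServerTransport());
}

main().catch((err) => {
  // Diagnostics go to stderr; stdout is reserved for protocol messages.
  console.error(err);
  process.exit(1);
});
```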
Wire it into Claude Desktop, Cursor, or any MCP-compatible host by adding an entry to the host's MCP configuration with the command to spawn the server. The host will run it, list its tools, and surface them to the LLM. Total time from npm init to a working tool call: about 15 minutes for a first server.
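The configuration entry itself is small. The sketch below builds one in the `mcpServers` layout that Claude Desktop reads from its `claude_desktop_config.json`; other hosts use similar but not identical files, so treat the shape as an assumption to check against your host's documentation. The server name and path are placeholders.

```typescript
// Shape of an STDIO server entry in a host's MCP configuration.
interface StdioServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface HostMcpConfig {
  mcpServers: Record<string, StdioServerEntry>;
}

// Hypothetical server name and path; replace with your own build output.
const config: HostMcpConfig = {
  mcpServers: {
    "word-count-demo": {
      command: "node",
      args: ["/absolute/path/to/build/index.js"],
    },
  },
};

// Print the JSON to paste into the host's configuration file.
console.log(JSON.stringify(config, null, 2));
```

Absolute paths are the safer choice here, since the host may spawn the server from a working directory you do not control.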
SDK equivalents in Python, Java, C#, Go, and Rust ship from the same official org. The Python SDK is the most popular for production servers; TypeScript is most popular for local developer tooling.
MCP Apps: When Tools Return UI
MCP Apps is the January 2026 spec extension that turned MCP from a tool-calling protocol into a UI protocol. It lets a server return interactive UI components, not just text, that render directly inside the host's conversation surface.
What this looks like in practice: a CRM server can return a structured customer card with action buttons. A scheduling server can return an interactive calendar widget. A finance server can return a chart with a "drill down" button. The LLM sees a tool response; the user sees a UI.
Pre-MCP-Apps, agents were limited to text-based interfaces. Post-MCP-Apps, the conversation becomes a programmable surface. This is the largest UX shift in LLM applications since streaming responses became standard, and it's the reason product teams began taking MCP seriously rather than treating it as Anthropic's internal protocol.
Security: The Things to Get Right
MCP gives servers access to LLM contexts and, often, to user data. The security model is partially specified and partially the host's responsibility.
Prompt injection is the headline risk. Resources returned by an MCP server can contain instructions targeted at the LLM, so an untrusted document loaded as a resource can hijack agent behavior. Validate sources, sanitize untrusted content, and treat resource content as user input, not system input. Beyond that, these are the things you actually need to get right when building or deploying:
- Scope server access narrowly. Use roots to restrict filesystem access. Use API keys with least-privilege scopes for HTTP-backed servers. Audit what tools can actually do — a "delete file" tool that takes an unvalidated path is a footgun.
- Treat resources as untrusted. Same risk model as user-uploaded content. Use a sanitization step before injecting resource content into the LLM context.
- Authenticate remote servers. HTTP-transport servers need real auth. OAuth with PKCE is the standard pattern emerging in 2026 (the spec's authorization section builds on OAuth 2.1). Don't ship a remote server with bearer tokens shared across users.
- Rate-limit at the server layer. LLMs can call tools in tight loops. A reasonable retry policy plus per-session rate limits prevents one runaway agent from exhausting your downstream APIs.
- Log everything. Tool calls are the agent's actions. Log them with the same rigor you'd apply to API calls in a regulated system. This matters even more for agents deployed by Forward Deployed Engineers (FDEs); see our FDE boom piece for why production AI observability is now the differentiator.
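The last two items can share one chokepoint in front of every tool handler. Below is a sketch of a per-session token bucket that also records each decision. The class, the `guardedCall` helper, the in-memory `auditLog`, and the limits are all illustrative choices, and a production server would write to durable storage instead.

```typescript
// Per-session token bucket: a session may burst up to `capacity` calls,
// then earns `refillPerSecond` new calls per second.
class SessionRateLimiter {
  private buckets = new Map<string, { tokens: number; updatedAt: number }>();
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock (milliseconds) so behavior is testable
  }

  tryAcquire(sessionId: string): boolean {
    const t = this.now();
    const bucket = this.buckets.get(sessionId) ?? { tokens: this.capacity, updatedAt: t };
    const elapsedSeconds = (t - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = t;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(sessionId, bucket);
    return allowed;
  }
}

interface AuditEntry {
  sessionId: string;
  tool: string;
  allowed: boolean;
}

// In-memory stand-in for a durable audit log.
const auditLog: AuditEntry[] = [];

// Check the limit and record the decision, whether or not the call proceeds.
function guardedCall(limiter: SessionRateLimiter, sessionId: string, tool: string): boolean {
  const allowed = limiter.tryAcquire(sessionId);
  auditLog.push({ sessionId, tool, allowed });
  return allowed;
}

let clock = 0; // fake clock in milliseconds, to keep the demo deterministic
const limiter = new SessionRateLimiter(2, 1, () => clock);
const results = [1, 2, 3].map(() => guardedCall(limiter, "session-a", "create_pr"));
console.log(results); // [ true, true, false ]
```

Logging the rejected calls as well as the accepted ones is deliberate: a burst of denials is often the first sign of an agent stuck in a loop.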
What MCP Does Not Solve
The protocol is narrow on purpose. Three things it does not address that you still need to design for:
- Orchestration. MCP doesn't decide which tool to call when. That's the agent loop's job — LangChain, LangGraph, the OpenAI Agents SDK, or your own loop. MCP just standardizes the contract.
- Evaluation. MCP doesn't tell you whether your agent is doing the right thing. You still need eval frameworks — see our LLM evaluation guide and AI agent evaluation guide.
- State. The protocol is stateful per session but doesn't define how state persists across sessions. Memory, conversation history, vector-store retrieval — all out of scope. Pair MCP with a memory layer if you need cross-session persistence.
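To show where the orchestration boundary sits, here is a toy agent loop with a stubbed model. Everything in it is hypothetical: `runAgent`, the `ModelAction` type, the `get_time` tool, and the stub that stands in for an LLM. In a real agent the tool handlers would forward to an MCP client's `tools/call`, while the loop itself stays outside the protocol.

```typescript
// What the model can ask for on each turn: a tool call or a final answer.
type ModelAction =
  | { kind: "tool_call"; name: string; args: Record<string, unknown> }
  | { kind: "final"; text: string };

type Model = (transcript: string[]) => ModelAction;
type ToolHandler = (args: Record<string, unknown>) => string;

// The agent loop: ask the model, run the tool it picked, feed the result back.
function runAgent(
  model: Model,
  tools: Map<string, ToolHandler>,
  userInput: string,
  maxSteps = 5,
): string {
  const transcript = [`user: ${userInput}`];
  for (let step = 0; step < maxSteps; step++) {
    const action = model(transcript);
    if (action.kind === "final") return action.text;
    const handler = tools.get(action.name);
    const result = handler ? handler(action.args) : `error: unknown tool ${action.name}`;
    transcript.push(`tool ${action.name}: ${result}`);
  }
  return "stopped: step limit reached";
}

// Stub model: calls one tool, then answers using its result.
const stubModel: Model = (transcript) => {
  const last = transcript[transcript.length - 1];
  return last.startsWith("tool ")
    ? { kind: "final", text: `Done (${last})` }
    : { kind: "tool_call", name: "get_time", args: {} };
};

const tools = new Map<string, ToolHandler>([["get_time", () => "12:00"]]);
console.log(runAgent(stubModel, tools, "What time is it?"));
```

The step limit is the loop's responsibility too; nothing in MCP stops an agent from calling tools indefinitely.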
Career Angle: Why MCP Fluency Matters in 2026 Hiring
If you're an AI engineer reading this, the practical implication is that MCP fluency has become a real interview signal at AI companies in 2026, especially for Forward Deployed Engineer roles, where customer-environment integrations are the work itself. We've seen MCP-specific questions in interview loops at multiple frontier and mid-stage AI companies in our Culture Directory.
If you're a senior engineer who can sketch a clean MCP server design for a customer's data stack in a 60-minute interview, you're meaningfully more hireable than a peer who can't. The skill is also relatively new, which means the bar to "above average" is still low — spending a weekend building a real MCP server (a Postgres reader, a Linear writer, a GitHub PR creator) puts you ahead of most candidates in 2026.
For broader context on the AI skill stack employers are paying for, our AI Engineer Salary Guide by Level and how to become an AI engineer in 2026 articles cover the full landscape.