MCP Architecture Guide: How Model Context Protocol Works

Q: Can MCP servers call back into the client for LLM sampling?

Yes, the `sampling` capability lets MCP servers request LLM completions from the host through `sampling/createMessage`. This bidirectional pattern enables servers to delegate cognitive work—like summarizing search results or analyzing error messages—without embedding their own LLM API keys. Servers declare sampling need during initialization; clients that support it can provide completions from their configured models (Claude, GPT, Gemini, etc.).

What Is Model Context Protocol and Why Does It Matter? #

Model Context Protocol (MCP) is an open standard that lets AI systems access external tools, data sources, and workflows through a unified client-server architecture. Anthropic introduced MCP on November 25, 2024, then donated it to the Linux Foundation's Agentic AI Foundation in December 2025—cementing its role as the interoperability layer for AI agent ecosystems.

MCP solves what Anthropic calls the N×M integration problem: previously, connecting N different LLM applications to M different data sources required N×M custom integrations. Every new data source needed bespoke connectors for every AI client. MCP inverts this—write one server, connect any client.

The Core Innovation #

Before MCP, integrating Claude with your codebase meant building a custom tool layer. Integrating it with Slack, Notion, or your CRM meant separate auth flows, separate API wrappers, separate error handling. MCP standardizes these integrations at the protocol level:

Before MCP	After MCP
Custom API wrappers per integration	Declarative server capabilities
Proprietary tool formats per LLM	Universal `tools/resources/prompts` primitives
Fragmented auth and transport	Standardized JSON-RPC 2.0 transport
Vendor-locked implementations	Open standard governed by Linux Foundation
One-off integration code	Reusable, composable server components

Why 2026 Is the Inflection Point #

Three converging trends make MCP unavoidable for serious AI automation:

Ecosystem maturation: The official MCP server registry now lists 200+ community servers covering everything from PostgreSQL to Shopify to Blender. The long tail of integrations is filling in fast.
Client adoption: Cursor, Claude Desktop, Claude Code, and a growing list of AI applications now ship with built-in MCP client support. The "any client" promise is becoming reality.
Enterprise requirements: Production AI deployments need standardized observability, security boundaries, and multi-agent orchestration. MCP's architecture provides the scaffolding for all three.

What MCP Actually Does #

At its simplest, MCP lets an AI agent:

Call tools: Execute functions with structured inputs/outputs (search a database, send a Slack message, deploy to Vercel)
Access resources: Read contextual data that changes over time (current stock prices, your filesystem, Jira ticket status)
Use prompts: Invoke templated workflows ("analyze this codebase for security issues," "onboard a new hire")

The protocol handles capability negotiation, request routing, error propagation, and lifecycle management—so you build the business logic, not the plumbing.

From My Experience #

I've implemented MCP servers for three production systems in the past quarter: a proprietary CRM connector for a sales automation pipeline, a Notion workspace manager for content operations, and a custom deployment orchestrator for web projects. In each case, the time-to-first-integration dropped by roughly 60% compared to building custom tool layers. The standardization pays for itself immediately when you need to swap or add AI clients.

The Client-Host-Server Architecture Explained #

MCP implements a three-layer architecture where hosts manage clients, clients maintain stateful connections to servers, and servers expose capabilities through a standardized protocol. This design separates connection lifecycle management from capability exposure, enabling complex multi-server deployments without coupling concerns.

The Three Architectural Layers #

┌─────────────────────────────────────────────────────────┐
│                      HOST APPLICATION                    │
│    (Claude Desktop, Cursor, Claude Code, Custom App)   │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
│  │   CLIENT 1  │  │   CLIENT 2  │  │   CLIENT N  │     │
│  │  (stdio)    │  │   (HTTP)    │  │  (stdio)    │     │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘     │
│         │                │                │             │
│         └────────────────┼────────────────┘             │
│                          │                              │
└──────────────────────────┼──────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
    ┌──────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
    │  SERVER 1   │ │  SERVER 2   │ │  SERVER 3   │
    │  Filesystem │ │   SQLite    │ │   Custom    │
    │   Server    │ │   Server    │ │   Server    │
    └─────────────┘ └─────────────┘ └─────────────┘

Hosts #

Hosts are LLM applications that initiate and manage connections. A host:

Instantiates multiple clients (one per server)
Handles user consent and permission flows
Routes requests between the LLM and appropriate clients
Manages server lifecycle (start, stop, restart)

Examples: Claude Desktop, Cursor IDE, Claude Code CLI, or your custom Next.js application.

Clients #

Clients are connection managers that establish stateful 1:1 sessions with servers. Key characteristics:

One client connects to exactly one server
Maintains session state including capability negotiation results
Handles transport-specific concerns (stdio pipes, HTTP connections)
Implements the JSON-RPC protocol layer

Clients are the glue—they don't expose capabilities themselves, but they translate between host expectations and server realities.

Servers #

Servers are capability providers that expose tools, resources, and prompts. Servers:

Declare their capabilities during initialization
Handle JSON-RPC requests from clients
Execute business logic (database queries, API calls, computations)
Return structured responses or errors

Servers are stateless with respect to business logic but participate in the stateful session protocol.

The 1:1 Connection Model #

MCP enforces a strict 1:1 relationship between clients and servers. When a host wants to connect to three different capability sources, it spins up three separate client instances:

Connection	Client Instance	Server	Purpose
1	Client A	Filesystem Server	Read/write local files
2	Client B	SQLite Server	Query embedded database
3	Client C	Custom API Server	Internal business logic

This isolation prevents cross-contamination. If your SQLite server crashes, it doesn't affect your filesystem connection. If one server has elevated permissions, its scope is limited to that connection's purpose.

Stateful Session Management #

Unlike REST's stateless request-response model, MCP connections are stateful sessions:

Initialization: Client and server exchange capability declarations
Operation: Bidirectional JSON-RPC messages flow over the transport
Termination: Graceful shutdown or connection drop handling

During initialization, the server announces:

Which tools it provides (with JSON schemas)
Which resources are available (with URI patterns)
Which prompts it supports (with argument schemas)

The client announces what it can offer back:

sampling: Can request LLM completions from the host
roots: Can provide filesystem root scoping
experimental: Feature flags for protocol extensions

How This Differs from Traditional API Architecture #

Aspect	REST API	MCP Protocol
Connection model	Stateless requests	Stateful sessions
Capability discovery	OpenAPI/static docs	Runtime negotiation
Transport	HTTP/1.1 or HTTP/2	stdio, HTTP, WebSockets
Error handling	HTTP status codes	JSON-RPC error objects
Bidirectional	Webhooks/polling	Native in protocol
Tool exposure	Custom endpoints	Standardized `tools/call`

The stateful session model is crucial for AI agent workflows. When Claude asks your filesystem server to "read the last 5 commits and summarize them," that request happens within an established session where permissions, working directory, and context are already negotiated. No need to pass auth tokens and scope parameters with every call.

JSON-RPC 2.0: The Wire Protocol Beneath MCP #

MCP uses JSON-RPC 2.0 as its wire protocol—a lightweight, bidirectional messaging format that supports requests, responses, and notifications over any transport. This choice prioritizes simplicity, debuggability, and universal parser availability over the complexity of binary protocols.

Why JSON-RPC Over Alternatives #

The protocol selection reveals MCP's design priorities:

Protocol	Why Not Used	MCP's Alternative
REST/HTTP	Stateless, verb/noun mismatch with tool semantics	Stateful JSON-RPC with explicit method names
gRPC	Binary format, schema compilation, stub complexity	Human-readable JSON, dynamic capability negotiation
GraphQL	Query complexity overkill for tool calls	Simple request/response with structured parameters
WebSocket raw	No framing, missing request/response correlation	JSON-RPC over WebSocket transport

JSON-RPC 2.0 provides exactly what MCP needs:

Request-response correlation via id fields
Bidirectional messaging (servers can call client capabilities)
Human-readable debugging (inspect messages in dev tools)
Minimal parser requirements (every language has JSON)

The Four Message Types #

MCP implementations exchange four JSON-RPC message structures:

1. Request (Client → Server or Server → Client) #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": {
      "sql": "SELECT * FROM users WHERE active = true",
      "limit": 10
    }
  }
}

Required fields:

jsonrpc: Always "2.0"
id: String or integer, unique per sender, non-null
method: String naming the operation
params: Object or array of arguments

2. Success Response #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "result": {
    "content": [
      {
        "type": "text",
        "text": "[{\"id\": 1, \"name\": \"Alice\"}, ...]"
      }
    ],
    "isError": false
  }
}

The result structure varies by method. Tool calls return content arrays with typed elements (text, images, resource references).

3. Error Response #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "error": {
    "code": -32602,
    "message": "Invalid params: missing required 'sql' argument",
    "data": {
      "details": "The 'sql' parameter is required for query_database tool"
    }
  }
}

MCP uses standard JSON-RPC error codes with domain-specific extensions:

Code	Meaning	MCP Context
`-32700`	Parse error	Invalid JSON received
`-32600`	Invalid request	Malformed JSON-RPC structure
`-32601`	Method not found	Server doesn't support this method
`-32602`	Invalid params	Arguments fail schema validation
`-32603`	Internal error	Server crashed during execution
`-32000` to `-32099`	Server errors	Implementation-specific issues

4. Notification (One-way, no response expected) #

{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "tok_123",
    "progress": 50,
    "total": 100
  }
}

Notifications omit the id field. MCP uses these for:

Progress updates during long-running tool calls
Resource change notifications ("file was modified")
Logging and telemetry streaming

Method Namespacing Convention #

MCP methods follow a hierarchical namespace pattern:

Namespace	Purpose	Example Methods
`initialize/*`	Session lifecycle	`initialize`, `initialized`
`tools/*`	Tool operations	`tools/list`, `tools/call`
`resources/*`	Resource operations	`resources/list`, `resources/read`
`prompts/*`	Prompt operations	`prompts/list`, `prompts/get`
`notifications/*`	One-way updates	`notifications/progress`, `notifications/message`
`sampling/*`	Client capabilities (server → client)	`sampling/createMessage`
`roots/*`	Filesystem scoping	`roots/list`

This namespacing lets clients route messages to appropriate handlers and enables protocol extensions without collision.

Batching and Transport Considerations #

JSON-RPC supports batch requests—an array of request objects sent together—but MCP implementations typically use single messages for clarity. The protocol doesn't mandate batching, and most MCP SDKs don't implement it, prioritizing simpler streaming semantics instead.

Transport layers handle the actual byte movement:

stdio: Messages delimited by newlines over stdin/stdout
HTTP: POST requests with JSON-RPC bodies, or WebSocket frames
WebSocket: Text frames containing JSON-RPC objects

The transport is swappable; the JSON-RPC protocol remains constant.

Debugging MCP Messages #

One significant advantage of JSON-RPC is inspectability. When an MCP integration fails, you can:

Enable debug logging in your client/host
Capture raw JSON-RPC messages
Validate against the spec using standard JSON tools
Replay requests with curl or similar tools

This transparency accelerates development. When I built my first custom MCP server for a proprietary CMS, being able to pipe stdin/stdout to jq and see exactly where my response structure diverged from the spec saved hours of guesswork.

MCP Server Capabilities: Tools, Resources, and Prompts #

MCP servers expose three primary capability primitives: tools (executable functions), resources (addressable data), and prompts (templated workflows). These primitives map cleanly to LLM needs—agents need to do things (tools), know things (resources), and follow structured procedures (prompts).

Tools: Executable Functions #

Tools are the most commonly used MCP primitive. They're executable functions with typed inputs and outputs that AI agents can invoke. Think of tools as the API surface your agent can call.

A tool declaration includes:

{
  "name": "search_customer_database",
  "description": "Query the customer database by email, name, or ID. Returns matching records with purchase history.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Email, name fragment, or customer ID to search for"
      },
      "includeHistory": {
        "type": "boolean",
        "description": "Whether to include purchase history in results",
        "default": true
      }
    },
    "required": ["query"]
  }
}

Key tool characteristics:

Aspect	Specification	Best Practice
Naming	snake_case or camelCase	Use verb-noun patterns (`send_email`, `create_record`)
Description	1-3 sentences	Be specific about what the tool does and what it returns
Parameters	JSON Schema	Include types, descriptions, defaults, and constraints
Required fields	Explicit array	Mark only truly-required params as required
Returns	Content array	Support text, images, and resource references

Tool execution happens through the tools/call method:

// Request
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_customer_database",
    "arguments": {
      "query": "alice@example.com",
      "includeHistory": true
    }
  }
}

// Response
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Customer found: Alice Smith (ID: 12345). Last purchase: 2026-04-28."
      }
    ],
    "isError": false
  }
}

Resources: Addressable Data #

Resources are contextual data sources that change over time and can be referenced by URI. Unlike tools (which do things), resources are things—files, database records, API endpoints, live streams.

Resources are identified by URIs with semantic schemes:

URI Scheme	Example	Meaning
`file://`	`file:///home/user/project/src/main.ts`	Local filesystem path
`https://`	`https://api.example.com/users/123`	HTTP-accessible data
`db://`	`db://customers/12345`	Database record reference
`git://`	`git://repo/main/src/index.ts`	Git repository content
Custom	`internal://workflow/run-789`	Application-specific resources

Resource declaration includes metadata, not content:

{
  "uri": "file:///home/user/project/README.md",
  "name": "Project README",
  "description": "Main documentation file for the project",
  "mimeType": "text/markdown"
}

Agents access resources via resources/read:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": {
    "uri": "file:///home/user/project/README.md"
  }
}

Resources support subscriptions via resources/subscribe—the server pushes notifications when the resource changes. This enables live-updating contexts like "monitor this log file" or "track this Jira ticket status."

Prompts: Templated Workflows #

Prompts are pre-defined templates that combine instructions, resources, and parameter placeholders into reusable workflows. They're the MCP primitive closest to traditional "system prompts" or "prompt engineering."

A prompt declaration:

{
  "name": "code_review_security",
  "description": "Analyze code files for common security vulnerabilities (SQL injection, XSS, path traversal)",
  "arguments": [
    {
      "name": "file_path",
      "description": "Path to the file to analyze",
      "required": true
    },
    {
      "name": "language",
      "description": "Programming language (auto-detected if not specified)",
      "required": false
    }
  ]
}

When invoked via prompts/get, the server returns the rendered prompt:

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "description": "Security review prompt for /src/auth/login.ts",
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Please analyze the following TypeScript code for security vulnerabilities. Focus on: SQL injection, XSS, path traversal, and insecure authentication patterns.\n\nFile: /src/auth/login.ts\n\n[resource content inserted here]"
        }
      }
    ]
  }
}

Prompts are powerful for:

Standardizing recurring analysis tasks ("review this PR," "summarize this meeting")
Packaging institutional knowledge ("onboard a new hire to our API")
Creating reusable agentic workflows ("debug this error pattern")

Comparing the Three Primitives #

Primitive	Use When	Example
Tool	The agent needs to take an action	Send email, deploy code, create ticket
Resource	The agent needs to read current data	File contents, API response, database row
Prompt	The agent needs a structured procedure	Security review, onboarding guide, analysis template

Real-World Server Examples #

The official MCP reference servers demonstrate these primitives:

Filesystem Server (@modelcontextprotocol/server-filesystem):

Tools: read_file, write_file, list_directory, search_files
Resources: file:// URIs for any accessible file
Prompts: None (pure tool/resource server)

SQLite Server (@modelcontextprotocol/server-sqlite):

Tools: query (execute SQL), execute (DDL statements)
Resources: sqlite:// URIs for table introspection
Prompts: Query-building templates

Brave Search Server (@modelcontextprotocol/server-brave-search):

Tools: brave_web_search, brave_local_search
Resources: None (pure tool server)
Prompts: Research workflow templates

When designing your own MCP servers, start with tools—they're the most flexible and most frequently used. Add resources when your server manages addressable data that agents need to reference repeatedly. Add prompts when you find yourself repeating the same multi-step instruction patterns to your AI.

Client Capabilities: Sampling, Roots, and Elicitation #

MCP is bidirectional: while servers expose tools/resources/prompts, clients can offer sampling (LLM access), roots (filesystem scoping), and elicitation (user input gathering) back to servers. This symmetry enables sophisticated agentic patterns where servers delegate cognitive work to the host's LLM.

Sampling: Server-Initiated LLM Access #

Sampling lets an MCP server request LLM completions from the host. This is the protocol's most powerful feature—it allows servers to extend themselves with AI capabilities without embedding their own API keys or model logic.

A server requests sampling via sampling/createMessage:

{
  "jsonrpc": "2.0",
  "id": "samp_001",
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the following customer feedback into 3 key themes:\n\n[feedback content here]"
        }
      }
    ],
    "modelPreferences": {
      "hints": [{"name": "claude-3-5-sonnet"}],
      "intelligencePriority": 0.7,
      "speedPriority": 0.3
    },
    "systemPrompt": "You are a data analysis assistant. Be concise."
  }
}

The host responds with the LLM output:

{
  "jsonrpc": "2.0",
  "id": "samp_001",
  "result": {
    "model": "claude-3-5-sonnet-20241022",
    "stopReason": "end_turn",
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "1. Shipping delays (45% of mentions)\n2. Product quality concerns (30%)\n3. Positive customer service experiences (25%)"
    }
  }
}

Sampling enables powerful patterns:

Pattern	How It Works	Example Use Case
Delegating analysis	Server fetches raw data, asks LLM to analyze	Database query → natural language summary
Multi-step reasoning	Server performs action, LLM evaluates result	Test execution → failure analysis → fix suggestion
Content transformation	Server provides content, LLM reformats	Markdown → HTML, raw notes → structured report
Error interpretation	Server encounters error, LLM explains	Stack trace → human-readable explanation

The modelPreferences field lets servers express what they need without hardcoding model names. Priorities (intelligencePriority, speedPriority, costPriority) guide the host's model selection when multiple options exist.

Roots: Filesystem Scoping #

Roots let servers discover which filesystem locations they should operate within. When a client declares roots capability, the server can query roots/list to understand its sandbox boundaries.

Request:

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "roots/list"
}

Response:

{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "roots": [
      {
        "uri": "file:///home/user/projects/my-app",
        "name": "Current Project"
      },
      {
        "uri": "file:///home/user/templates",
        "name": "Shared Templates"
      }
    ]
  }
}

This enables scoped filesystem access without hardcoding paths. A filesystem server configured with roots will:

Only show files within root directories
Reject operations outside roots (unless explicitly permitted)
Present root names in UI for clarity

Roots are particularly valuable for:

Multi-project hosts: Cursor working on several repositories simultaneously
Sandboxed environments: Restricting server access to specific directories
Workspace awareness: Letting servers understand project structure

Elicitation: Gathering Additional Information #

Elicitation allows servers to request additional information from the user through the host's UI. When a server needs clarification or missing parameters, it can ask rather than fail.

MCP's elicitation mechanism uses notifications/message with a request for user interaction:

{
  "jsonrpc": "2.0",
  "method": "notifications/message",
  "params": {
    "level": "info",
    "data": "Searching for customer record. If not found, should I create a new one?"
  }
}

While the core protocol supports basic message notifications, sophisticated elicitation (like prompting for form input or confirmation dialogs) is typically handled through:

Tool responses that signal incomplete data
Follow-up tool calls that gather the missing pieces
Host-specific UI integrations

Client Capability Negotiation #

During initialization, clients announce which capabilities they support:

{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "sampling": {},
      "roots": {
        "listChanged": true
      },
      "experimental": {}
    },
    "clientInfo": {
      "name": "cursor-ide",
      "version": "1.96.0"
    }
  }
}

Servers adapt their behavior based on available client capabilities:

If sampling is present, use LLM delegation for complex tasks
If roots is present, query for filesystem scope
If neither, fall back to simpler patterns

This negotiation makes MCP servers portable across different host environments—a server that uses sampling when available degrades gracefully in hosts that don't support it.

The Bidirectional Power #

The bidirectional design separates MCP from simpler tool-calling schemes. When your n8n workflow server needs to classify an incoming webhook payload, it doesn't need an OpenAI API key—it requests sampling from the host. When your database server needs to summarize query results, it delegates to the LLM rather than implementing its own summarization logic.

This architecture centralizes model management at the host level while distributing capability across specialized servers. The result: cleaner separation of concerns, simpler server implementations, and consistent model behavior across all tools.

Building an MCP Server: A Production-Ready Implementation #

Building an MCP server means implementing three core components: initialization handling, capability declaration, and method routing. The official TypeScript and Python SDKs handle protocol boilerplate, letting you focus on business logic.

Server Structure Overview #

Every MCP server follows this lifecycle:

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Transport  │────▶│ Initialize   │────▶│  Capability  │
│  Connected  │     │ Handshake    │     │  Exchange   │
└─────────────┘     └──────────────┘     └─────────────┘
                                                │
                                                ▼
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Cleanup   │◀────│   Request    │◀────│  Operation  │
│  & Exit     │     │   Handling   │     │   Loop      │
└─────────────┘     └──────────────┘     └─────────────┘

TypeScript Implementation #

Here's a production-grade MCP server built with the official TypeScript SDK:

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
  ErrorCode,
  McpError
} from '@modelcontextprotocol/sdk/types.js';
import { z } from 'zod';

// Business logic: A simple CRM connector
class CRMConnector {
  private customers = new Map<string, any>();
  
  async searchCustomers(query: string): Promise<any[]> {
    return Array.from(this.customers.values())
      .filter(c => c.email.includes(query) || c.name.includes(query));
  }
  
  async getCustomer(id: string): Promise<any | null> {
    return this.customers.get(id) || null;
  }
}

// Schema validation with Zod
const SearchCustomersSchema = z.object({
  query: z.string().min(1).describe('Email or name to search for'),
  limit: z.number().optional().default(10)
});

const GetCustomerSchema = z.object({
  id: z.string().describe('Customer ID')
});

// Create server instance
const server = new Server(
  {
    name: 'crm-connector',
    version: '1.0.0'
  },
  {
    capabilities: {
      tools: {}  // We expose tools
    }
  }
);

const crm = new CRMConnector();

// Handler: List available tools
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: 'search_customers',
        description: 'Search customers by email or name. Returns matching records.',
        inputSchema: {
          type: 'object',
          properties: {
            query: { type: 'string', description: 'Email or name to search for' },
            limit: { type: 'number', description: 'Maximum results (default: 10)' }
          },
          required: ['query']
        }
      },
      {
        name: 'get_customer',
        description: 'Get a single customer by ID',
        inputSchema: {
          type: 'object',
          properties: {
            id: { type: 'string', description: 'Customer ID' }
          },
          required: ['id']
        }
      }
    ]
  };
});

// Handler: Execute tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  
  try {
    if (name === 'search_customers') {
      const validated = SearchCustomersSchema.parse(args);
      const results = await crm.searchCustomers(validated.query);
      
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(results.slice(0, validated.limit), null, 2)
          }
        ],
        isError: false
      };
    }
    
    if (name === 'get_customer') {
      const validated = GetCustomerSchema.parse(args);
      const customer = await crm.getCustomer(validated.id);
      
      if (!customer) {
        return {
          content: [{ type: 'text', text: 'Customer not found' }],
          isError: true
        };
      }
      
      return {
        content: [
          { type: 'text', text: JSON.stringify(customer, null, 2) }
        ],
        isError: false
      };
    }
    
    throw new McpError(
      ErrorCode.MethodNotFound,
      `Unknown tool: ${name}`
    );
    
  } catch (error) {
    if (error instanceof z.ZodError) {
      return {
        content: [
          { type: 'text', text: `Validation error: ${error.errors.map(e => e.message).join(', ')}` }
        ],
        isError: true
      };
    }
    throw error;
  }
});

// Start server with stdio transport
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error('CRM MCP server running on stdio');
}

main().catch(console.error);

Key Implementation Patterns #

Pattern	Implementation	Rationale
Schema validation	Zod for TypeScript, Pydantic for Python	Fail fast on invalid inputs, clear error messages
Error handling	Return `isError: true` for expected failures, throw for unexpected	Distinguish user errors from system errors
Content typing	Use `type: 'text'` for JSON, plan for `type: 'image'`	Future-proof for multimodal responses
Async operations	Always return Promises	MCP handles the async/await complexity
Resource cleanup	Implement signal handlers	Graceful shutdown on SIGTERM

Python Implementation #

The Python SDK follows similar patterns:

import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from pydantic import BaseModel, Field

# Business logic
class CRMConnector:
    def __init__(self):
        self.customers = {}
    
    async def search_customers(self, query: str, limit: int = 10):
        results = [
            c for c in self.customers.values()
            if query.lower() in c.get('email', '').lower() 
            or query.lower() in c.get('name', '').lower()
        ]
        return results[:limit]

# Schema validation
class SearchCustomersInput(BaseModel):
    query: str = Field(..., description="Email or name to search for")
    limit: int = Field(default=10, description="Maximum results")

# Server setup
app = Server("crm-connector")

@app.list_tools()
async def list_tools():
    return [
        {
            "name": "search_customers",
            "description": "Search customers by email or name",
            "inputSchema": SearchCustomersInput.schema()
        }
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "search_customers":
        validated = SearchCustomersInput(**arguments)
        crm = CRMConnector()
        results = await crm.search_customers(validated.query, validated.limit)
        return {
            "content": [{"type": "text", "text": str(results)}],
            "isError": False
        }
    
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as streams:
        await app.run(streams[0], streams[1])

if __name__ == "__main__":
    asyncio.run(main())

Deployment Considerations #

Environment	Transport	Configuration
Local dev	stdio	Package as CLI, invoke directly
Docker	stdio or HTTP	Container per server, compose orchestration
Kubernetes	HTTP	Sidecar pattern or dedicated pods
Serverless	HTTP	Lambda/Function with API Gateway
n8n	HTTP	Self-hosted n8n with MCP community node

For production deployments, I recommend:

Health check endpoint (for HTTP transports): GET /health returning 200 when ready
Graceful shutdown: Handle SIGTERM, finish in-flight requests
Structured logging: JSON logs with correlation IDs
Metrics exposure: Request counts, latency percentiles, error rates
Secret management: API keys via environment variables, never committed

Testing Your Server #

The MCP Inspector tool validates server compliance:

npx @modelcontextprotocol/inspector node dist/server.js

This opens a web UI where you can:

Browse available tools/resources/prompts
Invoke tools with custom parameters
Inspect raw JSON-RPC messages
Validate response formats

Always run the inspector before deploying—it's caught schema mismatches and missing error handling in every server I've built.

From My Workflow #

I keep a template repo with this exact structure for new MCP servers. The boilerplate (SDK imports, transport setup, error handling) stays constant. What changes: the business logic in the connector class, the Zod/Pydantic schemas, and the tool definitions. This separation lets me spin up new integrations in under an hour—whether it's connecting to a new API, wrapping an internal service, or exposing a database.

MCP Transport Layer: stdio vs HTTP vs WebSockets #

MCP abstracts the transport layer—servers and clients communicate via JSON-RPC regardless of whether bytes flow over stdio pipes, HTTP connections, or WebSocket frames. Each transport has distinct tradeoffs for security, latency, and deployment flexibility.

stdio: The Local Default #

stdio (standard input/output) is the default transport for local MCP servers. The host spawns the server process and communicates via stdin/stdout, with stderr reserved for logging.

┌─────────────┐         stdin          ┌─────────────┐
│    HOST     │ ◀───────────────────── │   SERVER    │
│  (Cursor)   │ ─────────────────────▶ │   Process   │
│             │         stdout         │             │
│             │ ─────────────────────▶ │             │
│             │         stderr (logs)  │             │
└─────────────┘                        └─────────────┘

stdio characteristics:

Aspect	Behavior	Implication
Lifecycle	Host manages process start/stop	Servers must start quickly and exit cleanly
Security	OS process isolation	Server runs with host's permissions
State	Connection dies with process	No reconnection, session is ephemeral
Debugging	Pipe through `jq`, inspect stderr	Easy to capture and replay messages
Deployment	Local-only	Cannot expose over network

stdio works best for:

Local filesystem servers
Development tools and linters
Single-user desktop applications
Servers that need OS-level sandboxing

HTTP: Remote and Scalable #

HTTP transport enables network-accessible MCP servers, supporting stateless request-response and SSE (Server-Sent Events) for server-to-client streaming.

HTTP transport supports two connection patterns:

Synchronous mode (simple request-response):

Client                    Server
  │    POST /message      │
  │ ───────────────────▶  │
  │    {jsonrpc: "2.0"...}│
  │                       │
  │    200 OK             │
  │ ◀───────────────────  │
  │    {result: {...}}     │

Streaming mode (with SSE for server→client):

Client                    Server
  │    GET /sse           │
  │ ───────────────────▶  │
  │    (establishes SSE)  │
  │ ◀───────────────────  │
  │    event: endpoint    │
  │    data: /message?session=123
  │                       │
  │    POST /message?...  │
  │ ───────────────────▶  │
  │    (client requests)   │
  │                       │
  │    SSE: server pushes │
  │ ◀───────────────────  │

HTTP characteristics:

Aspect	Behavior	Implication
Network	Accessible remotely	Enables SaaS-style MCP services
Load balancing	HTTP-native	Works with existing infrastructure
Security	TLS + auth headers	Standard web security applies
State	Server maintains session state	Can reconnect if session preserved
Scaling	Horizontal scaling	Multiple server instances behind LB

HTTP works best for:

Multi-user SaaS integrations
Cloud-hosted AI tools
Microservices architecture
Scenarios requiring load balancing

WebSocket: Real-Time Bidirectional #

WebSocket transport provides full-duplex communication over a single persistent connection, ideal for high-frequency bidirectional messaging.

WebSocket mode is essentially HTTP transport with WebSocket upgrade—once established, both directions stream freely without polling overhead.

When to prefer WebSocket:

High-frequency tool invocations (many calls per second)
Real-time resource subscriptions (live dashboards)
Low-latency requirements (sub-50ms response times)
Chat-like interfaces with streaming responses

Transport Comparison Matrix #

Factor	stdio	HTTP	WebSocket
Local deployment	Excellent	Good	Good
Remote/network	Impossible	Excellent	Excellent
Latency	Minimal (~1ms)	Moderate (~50ms)	Low (~10ms)
Debugging	Very easy	Moderate	Moderate
Security model	OS process	TLS + auth	TLS + auth
Scalability	Single machine	Excellent	Good
Firewall friendly	N/A	Yes	Sometimes
Reconnect handling	None	Session restore	Automatic

Security Considerations by Transport #

stdio security:

Servers inherit the host process permissions
Use OS-level sandboxing (chroot, containers, permission dropping)
Audit server code—it's running with full user privileges
Environment variables are shared; be careful with secrets

HTTP/WebSocket security:

Always use TLS (wss://, https://) in production
Implement authentication: API keys, OAuth, or mTLS
Validate origin headers to prevent unwanted cross-origin access
Rate limiting and request size limits prevent abuse
Session timeout and cleanup prevent resource exhaustion

Example HTTP server with basic auth:

import express from 'express';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';

const app = express();

// Simple API key auth
const API_KEY = process.env.MCP_API_KEY;
app.use((req, res, next) => {
  const key = req.headers['x-api-key'];
  if (key !== API_KEY) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  next();
});

// SSE endpoint
app.get('/sse', async (req, res) => {
  const transport = new SSEServerTransport('/message', res);
  const server = new Server({ name: 'http-server', version: '1.0.0' }, 
    { capabilities: { tools: {} } });
  
  // ... set up handlers ...
  
  await server.connect(transport);
});

app.listen(3000);

Choosing Your Transport #

Use stdio when:

Building local development tools
Need OS-level process isolation
Debugging and iterating rapidly
Targeting Claude Desktop, Cursor, or similar local hosts

Use HTTP when:

Deploying to production SaaS
Serving multiple clients/users
Need load balancing and horizontal scaling
Integrating with existing web infrastructure

Use WebSocket when:

Real-time bidirectional streaming is required
Minimizing latency overhead matters
Building chat-like or live-update interfaces

In practice, most of my production servers use HTTP with SSE for server→client pushes. stdio is reserved for local development and single-user tools. I've yet to need WebSocket specifically—HTTP SSE provides sufficient real-time capability for most agentic workflows.

n8n and Remote MCP #

The n8n MCP community node currently supports HTTP transport, letting you wire remote MCP servers into workflows. This enables patterns like:

Calling external AI services through MCP protocol
Chaining multiple MCP servers in a single workflow
Using MCP tools alongside native n8n nodes

When self-hosting n8n with MCP support, configure your servers with HTTP transport and appropriate authentication. The node handles the JSON-RPC layer, exposing MCP tools as standard n8n workflow operations.

Tool Design Patterns for AI Agents #

Well-designed MCP tools follow consistent patterns: descriptive naming, explicit schemas, graceful error handling, and output formatting optimized for LLM consumption. The goal is making tools discoverable, predictable, and composable for AI agents.

Schema Design Best Practices #

Tool schemas are contracts between human developers and AI systems. Follow these patterns:

1. Use verb-noun naming consistently

Poor	Better	Best
`data`	`get_data`	`get_customer_record`
`user`	`user_search`	`search_users_by_email`
`process`	`process_file`	`extract_text_from_pdf`

Specific verbs (search, create, update, delete, send, extract) communicate intent. Specific nouns clarify domain scope.

2. Structure parameters for LLM understanding

{
  "properties": {
    "customer_id": {
      "type": "string",
      "description": "Unique customer identifier (e.g., 'cust_12345'). Found in customer record or previous search results."
    },
    "include_inactive": {
      "type": "boolean",
      "description": "If true, includes deactivated customer accounts. Defaults to false (only active customers)."
    }
  },
  "required": ["customer_id"]
}

Each description should answer:

What is this parameter? (type, format)
Where does the value come from? (context for the LLM)
What's the default if omitted?

3. Group related parameters with objects

Instead of flat parameters:

{
  "street": "...",
  "city": "...",
  "zip": "..."
}

Use nested structures:

{
  "address": {
    "street": "...",
    "city": "...",
    "zip": "..."
  }
}

This improves schema readability and lets you reuse object definitions across tools.

4. Enumerate when values are constrained

{
  "priority": {
    "type": "string",
    "enum": ["low", "medium", "high", "urgent"],
    "description": "Ticket priority level"
  }
}

Explicit enums prevent hallucinated values and make tool behavior predictable.

Error Handling Strategies #

MCP distinguishes two error categories: user errors (returned with isError: true) and system errors (thrown as JSON-RPC errors).

User errors (invalid input, business rule violations):

return {
  content: [
    {
      type: 'text',
      text: `Customer not found: No customer exists with ID "${customerId}". Use search_customers to find valid IDs.`
    }
  ],
  isError: true
};

System errors (bugs, network failures, crashes):

throw new McpError(
  ErrorCode.InternalError,
  'Database connection failed',
  { retryable: true, details: error.message }
);

Error message best practices:

Scenario	Approach	Example
Invalid input	Explain what's wrong, suggest fix	`"Invalid date format. Use YYYY-MM-DD (e.g., 2026-05-05)"`
Not found	Confirm the search term, suggest alternatives	`"No user found with email 'alice@test.com'. Did you mean 'alice@example.com'?"`
Permission denied	Explain required permissions	`"Access denied. Need 'admin' role to delete records."`
Rate limited	Include retry timing	`"Rate limit exceeded. Retry after 60 seconds."`
Ambiguous request	List options, ask for clarification	`"Found 3 projects named 'Website'. Specify: Website-2025, Website-2026, or Website-Redesign?"`

Making Tools LLM-Friendly #

LLMs work best with tools that follow predictable patterns. Design for agentic workflows:

1. Prefer small, composable tools over monolithic ones

Instead of one massive manage_customer tool:

// Composable, chainable tools:
search_customers       // Find candidates
create_customer        // Create new
get_customer           // Retrieve by ID
update_customer        // Modify existing
delete_customer        // Remove

This lets agents reason step-by-step: search → verify → act.

2. Return structured data in consistent formats

{
  "content": [
    {
      "type": "text",
      "text": "{\n  \"found\": 3,\n  \"customers\": [\n    {\"id\": \"c1\", \"name\": \"Alice\", \"email\": \"a@example.com\"},\n    {\"id\": \"c2\", \"name\": \"Bob\", \"email\": \"b@example.com\"},\n    {\"id\": \"c3\", \"name\": \"Carol\", \"email\": \"c@example.com\"}\n  ]\n}"
    }
  ]
}

Valid JSON in text content lets the LLM parse and reference results in subsequent calls.

3. Include pagination for list operations

interface ListInput {
  limit?: number;      // Default: 20, Max: 100
  offset?: number;     // For pagination
  cursor?: string;     // Alternative to offset
}

// Return metadata with results
{
  "data": [...],
  "pagination": {
    "total": 156,
    "returned": 20,
    "next_cursor": "eyJpZCI6IDIxfQ=="
  }
}

4. Support idempotency where possible

interface CreateInput {
  name: string;
  idempotency_key?: string;  // Client-generated UUID
}

Idempotency keys let agents retry failed calls safely.

Advanced Patterns #

Conditional tool availability:

server.setRequestHandler(ListToolsRequestSchema, async () => {
  const tools = [baseTools];
  
  // Add admin tools only for privileged sessions
  if (session.isAdmin) {
    tools.push(adminTools);
  }
  
  // Add beta tools for early access users
  if (session.features.includes('beta')) {
    tools.push(experimentalTools);
  }
  
  return { tools };
});

Tool chaining with dependencies:

// Tool A returns data that Tool B requires
interface ToolAOutput {
  result_id: string;  // Pass this to Tool B
  preview: string;     // LLM sees this, decides next step
}

interface ToolBInput {
  result_id: string;  // Reference to Tool A's output
  action: 'confirm' | 'modify';
}

Context-aware tools:

interface SearchInput {
  query: string;
  context?: string;  // Additional context from conversation
}

// Usage: agent passes conversation context to improve results
// query: "users who purchased"
// context: "The user is asking about customers from last month's campaign"

Testing Tools for LLM Compatibility #

Before shipping a tool, test with these prompts:

"What tools are available and what do they do?" — Verifies description clarity
"Search for X using the search tool" — Tests parameter understanding
"After finding X, perform Y on it" — Tests chaining capability
"Try to do Z with invalid input" — Tests error handling
"Do the same operation twice" — Tests idempotency

If the LLM struggles with any of these, your tool design needs refinement.

From Production Experience #

The best MCP tools I've built share a common trait: they're boringly predictable. Same parameter names across tools (id, limit, offset). Same response structure ({success: boolean, data: any, error?: string}). Same error message format. This consistency lets agents build reliable mental models—when they learn one tool, they can infer how others work.

The worst tools? Overly clever parameter names, inconsistent return formats, and ambiguous error messages that leave the agent guessing what went wrong. Save the creativity for your business logic, not your interface design.

MCP vs Traditional Function Calling: A Technical Comparison #

MCP represents a fundamental architectural shift from function calling—stateful bidirectional sessions replace stateless tool lists, runtime capability negotiation replaces static schemas, and server-hosted logic replaces client-side tool definitions. Understanding these differences helps you choose the right integration pattern.

OpenAI Function Calling #

OpenAI's function calling (introduced June 2023) embeds tool definitions in the chat completion API:

# OpenAI pattern: Tools defined client-side
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }],
    tool_choice="auto"
)

Key characteristics:

Aspect	OpenAI Functions	MCP Equivalent
Tool location	Defined in API request	Discovered from server
Transport	HTTP POST to OpenAI	JSON-RPC over configurable transport
Execution	Client receives tool_call, executes locally	Server receives request, executes remotely
Tool count	Limited by token budget (~128 functions)	Unlimited (servers paginate)
Bidirectional	No—one-way LLM→client	Yes—server can call sampling
State	Stateless per request	Stateful session

When OpenAI functions win:

Simple integrations with few tools
No need for persistent connections
Immediate deployment without server infrastructure
Working within OpenAI's ecosystem exclusively

When MCP wins:

Many tools or dynamically changing tool sets
Need for server-side logic and state
Multi-LLM support (same server, different clients)
Bidirectional capabilities (sampling, roots)

Claude Function Tools (Legacy) #

Claude's pre-MCP tool use followed patterns similar to OpenAI:

// Legacy Claude tool use
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_stock_price',
      description: 'Get current stock price',
      input_schema: {
        type: 'object',
        properties: {
          symbol: { type: 'string' }
        }
      }
    }
  ],
  messages
});

Claude's shift to MCP:

As of late 2024, Claude Desktop and Claude Code transitioned to MCP as their native integration protocol. The legacy tool use API still works for direct API calls, but MCP is now the recommended architecture for:

Claude Desktop extensions
Local tool integrations
Multi-tool environments

The key difference: MCP servers can live anywhere (local process, remote HTTP service, container), while legacy tool definitions were always embedded in API requests.

LangChain Tools #

LangChain provides a Python/TypeScript framework for tool orchestration:

from langchain.tools import BaseTool
from pydantic import BaseModel, Field

class SearchCustomersInput(BaseModel):
    query: str = Field(description="Search query")

class SearchCustomersTool(BaseTool):
    name = "search_customers"
    description = "Search customers"
    args_schema = SearchCustomersInput
    
    def _run(self, query: str):
        return db.search(query)

LangChain vs MCP comparison:

Factor	LangChain	MCP
Type	Framework/library	Protocol/specification
Scope	End-to-end orchestration	Server/client protocol
Tool definition	Python/TS classes	JSON-RPC over transport
Interoperability	LangChain ecosystem	Any MCP-compatible client
Deployment	Library in application	Standalone server process
Language	Python/TypeScript	Any (SDKs available)

They're complementary, not competing:

LangChain can wrap MCP servers:

from langchain_mcp import MCPTool

# Use MCP server as LangChain tool
mcp_tool = MCPTool.from_server(
    command="python",
    args=["my_mcp_server.py"],
    tool_name="search_customers"
)

# Use in LangChain agent
agent = initialize_agent([mcp_tool, other_tools], llm)

This lets you:

Keep existing LangChain orchestration
Add MCP servers for standardized integrations
Gradually migrate from custom tools to MCP

Protocol-Level Architectural Differences #

Stateful vs Stateless:

OpenAI:    Request → Tools defined → Execute → End
           (Stateless, per-request)

MCP:       Connect → Handshake → Operation → Operation → ... → Disconnect
           (Stateful, persistent session)

Stateful sessions enable:

Runtime capability changes (server adds tools dynamically)
Resource subscriptions (live updates)
Bidirectional sampling (server asks LLM questions)
Connection-scoped permissions and roots

Transport flexibility:

Approach	Transport Options
OpenAI	HTTP only (their API)
Claude Legacy	HTTP only (their API)
LangChain	In-process, HTTP, custom
MCP	stdio, HTTP, WebSocket, custom

Discovery mechanism:

OpenAI:   Developer hardcodes tool definitions in code
          (tools = [...])

MCP:      Client queries server at runtime
          tools/list → Server returns current capabilities

Runtime discovery matters when:

Tool availability varies by user permissions
Tools are plugins loaded dynamically
Same server serves different capabilities to different clients

Migration Path: When to Switch #

Keep using traditional function calling when:

Building prototypes or MVPs
Only 2-3 simple tools needed
Single LLM provider (OpenAI only)
No need for persistent connections

Adopt MCP when:

Tool count exceeds 10-20
Same tools needed across multiple LLM clients
Tools require server-side state or logic
Building reusable integrations for distribution
Need bidirectional capabilities (sampling, roots)
Building SaaS products that expose AI-accessible APIs

Real-World Hybrid Approach #

Most production systems I work on use both:

MCP servers: Standard integrations (database, CRM, search) that multiple clients need
Client-side functions: Application-specific logic tied to UI state, ephemeral operations

Example architecture:

┌─────────────────────────────────────────┐
│           Application (Cursor)           │
│                                          │
│  ┌─────────────────┐  ┌──────────────┐   │
│  │ UI State Tools  │  │ MCP Client   │───┼──▶ Database MCP Server
│  │ (client-side)   │  │              │   │
│  │ - render_preview│  │              │───┼──▶ Search MCP Server
│  │ - validate_input│  │              │   │
│  └─────────────────┘  └──────────────┘   │
│                                          │
└─────────────────────────────────────────┘

Client-side tools handle UI-specific concerns. MCP servers handle reusable business logic. The boundary moves based on how broadly useful a capability is.

The Strategic View #

MCP isn't replacing function calling—it's standardizing the protocol layer beneath it. Think of it like HTTP: you can still build custom protocols, but HTTP wins for interoperability. MCP aims to be the HTTP of AI tool integration.

Anthropic's donation to the Linux Foundation signals this intent. The goal isn't Anthropic-specific tooling—it's a universal standard that OpenAI, Google, Meta, and others can adopt. When that happens, building an MCP server means your tools work everywhere. That's worth the architectural shift.

Real-World MCP Implementations and Server Registry #

The MCP ecosystem includes 200+ servers ranging from official Anthropic reference implementations to community-built integrations for databases, APIs, developer tools, and business applications. Understanding the available servers accelerates your own MCP adoption.

Official Anthropic Reference Servers #

Anthropic maintains a set of reference implementations demonstrating best practices:

Package	Capabilities	Use Case
`@modelcontextprotocol/server-filesystem`	Tools, Resources	Read/write local files, directory traversal
`@modelcontextprotocol/server-sqlite`	Tools, Resources	Query SQLite databases, schema introspection
`@modelcontextprotocol/server-brave-search`	Tools	Web and local search via Brave API
`@modelcontextprotocol/server-fetch`	Tools	HTTP request proxy with content extraction
`@modelcontextprotocol/server-github`	Tools	Repository operations, issues, PRs
`@modelcontextprotocol/server-postgres`	Tools, Resources	PostgreSQL queries and exploration
`@modelcontextprotocol/server-slack`	Tools	Send messages, search channels
`@modelcontextprotocol/server-puppeteer`	Tools	Browser automation, screenshots

Installation via npx (stdio transport):

# Filesystem server
npx -y @modelcontextprotocol/server-filesystem /path/to/allowed/dir

# SQLite server
npx -y @modelcontextprotocol/server-sqlite /path/to/database.db

# Brave search (requires API key)
BRAVE_API_KEY=xxx npx -y @modelcontextprotocol/server-brave-search

These servers use stdio transport by default, making them ideal for local development with Claude Desktop or Cursor.

Community Server Ecosystem #

The community registry (accessible via mcp.run and GitHub topics) includes servers for:

Databases & Storage:

MongoDB, Redis, DynamoDB
Supabase, Firebase
ClickHouse, DuckDB

Developer Tools:

Docker, Kubernetes
Git operations (beyond GitHub)
Testing frameworks (Jest, pytest)
Code quality (ESLint, Prettier, Ruff)

Business Applications:

Salesforce, HubSpot, Pipedrive
Stripe, PayPal
Notion, Asana, Linear, Monday
Shopify, WooCommerce

AI & Data:

Vector databases (Pinecone, Weaviate, Chroma)
LLM API proxies (OpenAI, Anthropic, Gemini)
Data processing (Pandas, Polars transformations)

Search & Knowledge:

Elasticsearch, Meilisearch
Confluence, Notion wikis
Custom knowledge bases

n8n MCP Integration #

n8n (the workflow automation platform) supports MCP through its MCP community node, enabling AI agents to orchestrate complex multi-step automations.

Architecture with n8n:

┌─────────────┐      MCP HTTP       ┌─────────────┐
│   Claude    │ ◀─────────────────▶ │    n8n      │
│   Desktop   │    JSON-RPC         │   MCP Node  │
└─────────────┘                     └──────┬──────┘
                                           │
                          ┌────────────────┼────────────────┐
                          │                │                │
                          ▼                ▼                ▼
                    ┌──────────┐     ┌──────────┐     ┌──────────┐
                    │ HTTP API │     │ Database │     │  Custom  │
                    │ Request  │     │  Query   │     │   Logic  │
                    └──────────┘     └──────────┘     └──────────┘

Use cases for n8n + MCP:

Multi-step API orchestration: Claude requests "onboard a new customer," n8n executes: create CRM record → send welcome email → provision trial account → notify Slack
Conditional logic: MCP tools can include n8n workflow decisions without hardcoding them in the prompt
Integration hub: Single MCP server endpoint exposing dozens of integrated services
Audit and logging: n8n captures every tool invocation with full traceability

Configuration example in n8n:

{
  "mcpServers": {
    "n8n-workflows": {
      "url": "https://n8n.your-domain.com/mcp",
      "headers": {
        "Authorization": "Bearer xxx"
      }
    }
  }
}

Self-Hosted Server Deployment Patterns #

For production use, you'll typically self-host MCP servers rather than relying on npx:

Docker Compose stack:

version: '3.8'
services:
  mcp-filesystem:
    image: mcp/filesystem:latest
    volumes:
      - ./data:/data:ro
    environment:
      - ALLOWED_DIRS=/data
    # Exposes HTTP transport on port 3001
    ports:
      - "3001:3000"
  
  mcp-database:
    image: mcp/postgres:latest
    environment:
      - DATABASE_URL=postgresql://db:5432/prod
    ports:
      - "3002:3000"
  
  # n8n with MCP node
  n8n:
    image: n8nio/n8n:latest
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
    volumes:
      - n8n_data:/home/node/.n8n
    ports:
      - "5678:5678"

Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-customer-api
  template:
    metadata:
      labels:
        app: mcp-customer-api
    spec:
      containers:
      - name: server
        image: your-registry/mcp-customer-api:v1.2.0
        env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: api-key
        ports:
        - containerPort: 3000
        readinessProbe:
          httpGet:
            path: /health
            port: 3000

Evaluating Community Servers #

Before adopting a community MCP server, evaluate:

Criteria	What to Check	Red Flags
Maintenance	Last commit, issue response time	>6 months inactive
Security	Auth implementation, secret handling	Hardcoded credentials, no auth
Completeness	Which MCP primitives implemented	Missing error handling
Documentation	README, examples, parameter docs	No usage examples
Test coverage	Unit tests, integration tests	No test suite
Transport support	stdio, HTTP, or both	Only stdio (limits deployment)

For production systems, I prefer to fork and audit community servers rather than depend on them directly. The protocol is simple enough that maintaining a fork is lower risk than trusting unaudited code with production data access.

Building Your Own Server Registry #

For organizations with multiple internal MCP servers, consider maintaining a private registry:

{
  "servers": [
    {
      "name": "internal-crm",
      "transport": "http",
      "url": "https://mcp-crm.internal.company.com",
      "auth": "api_key_header",
      "environments": ["prod", "staging"]
    },
    {
      "name": "deploy-orchestrator",
      "transport": "stdio",
      "command": "python -m mcp.deploy",
      "env": { "KUBECONFIG": "/etc/kube/config" }
    }
  ]
}

This registry document becomes your organization's MCP service catalog—what's available, how to connect, and which environments are supported.

From the Field #

In my current client work, the MCP servers getting the most use are:

A custom CRM connector (built on the pattern from Section 6) that bridges a proprietary customer database with Claude Desktop for sales analysis
An n8n-hosted orchestration server that wraps 15+ SaaS integrations into a unified tool set
A filesystem + git combo for codebase operations—let Claude read files, make changes, commit with conventional messages, and push branches

The pattern that works: official servers for standard needs (filesystem, SQLite, Brave), custom servers for proprietary systems, and n8n for complex multi-step workflows. This covers 90% of integration scenarios without reinventing wheels.

Security Model and Permission Architecture #

MCP's security model combines capability-based permissions, transport-layer isolation, and host-mediated user consent to create defense in depth for AI agent integrations. Understanding these layers is critical for production deployments.

Capability-Based Security #

MCP uses capability-based security where servers declare what they can do, clients declare what they can offer, and users approve the intersection. Nothing happens without explicit capability exchange.

The capability negotiation flow:

1. Client sends:    I support sampling, roots
                     I need tools, resources

2. Server responds:   I provide tools, resources
                     I need sampling (optional)

3. User confirms:   Approve this server to use tools?
                     (Host presents UI)

4. Operations:      Both sides respect agreed boundaries

Server-side capabilities (what the server offers):

tools: Can execute functions on request
resources: Can provide addressable data
prompts: Can offer templated workflows
logging: Can emit structured log events

Client-side capabilities (what the client offers back):

sampling: Can request LLM completions
roots: Can provide filesystem scoping
experimental: Feature flags for preview features

This negotiation means a server can't call sampling/createMessage unless the client announced sampling support in initialization. The protocol enforces boundaries at the wire level.

Server Isolation Patterns #

Each MCP connection runs in an isolated trust boundary. A compromised server cannot affect other servers or the host beyond its declared capabilities.

Isolation mechanisms by transport:

Transport	Isolation Mechanism	Scope
stdio	OS process boundaries	Server can only access process permissions
HTTP	Network boundaries + auth	Server can only access network-granted resources
WebSocket	Same as HTTP	Same as HTTP

stdio isolation best practices:

// Drop permissions after starting (Node.js)
import { userInfo } from 'os';

if (process.getuid && process.getuid() === 0) {
  const targetUser = userInfo();
  process.setgid(targetUser.gid);
  process.setuid(targetUser.uid);
}

# Container security (Docker)
FROM node:20-alpine
RUN addgroup -g 1001 -S mcp && adduser -u 1001 -S mcp -G mcp
USER mcp
# Server now runs as non-root

HTTP isolation best practices:

Run servers in separate containers/VMs
Use network policies to restrict egress
Implement request timeouts and size limits
Rotate authentication credentials regularly

The host mediates all capability access through user consent interfaces. MCP deliberately doesn't automate consent—every server connection requires explicit user approval.

Typical consent UI patterns:

┌─────────────────────────────────────────┐
│  New Server Connection Request          │
│                                         │
│  Server: "Company CRM MCP"              │
│  Origin: https://mcp.company.com        │
│                                         │
│  This server wants to:                  │
│    ✓ Execute tools (5 available)        │
│    ✓ Access resources (database)        │
│    ✓ Request AI assistance              │
│                                         │
│  [View Details]  [Deny]  [Allow]        │
└─────────────────────────────────────────┘

Granular permissions (implemented by quality hosts):

Level	Control	Example
Connection	Allow/deny server entirely	Block untrusted origins
Capability	Enable/disable primitives	Allow tools but not sampling
Tool	Per-tool allowlist	Allow `search_customers`, deny `delete_customer`
Resource	URI pattern filtering	Allow `file://docs/`, deny `file://secrets/`

Claude Desktop implements progressive consent—first connection requires approval, subsequent reconnections may be automatic for trusted servers, and sensitive operations trigger re-confirmation.

Production Security Checklist #

Before deploying MCP servers to production, verify:

Authentication & Authorization:

HTTP servers require authentication (API key, OAuth, mTLS)
API keys rotate regularly and store in secrets managers
Servers validate caller identity before executing sensitive tools
No hardcoded credentials in server code or configs

Transport Security:

TLS 1.3 for all HTTP/WebSocket connections
Certificate pinning for internal services
stdio servers run with minimal OS permissions
Network policies restrict server egress

Input Validation:

JSON Schema validation on all inputs
Parameter sanitization (prevent injection attacks)
Size limits on request/response payloads
Timeout guards on long-running operations

Audit & Monitoring:

Log all tool invocations with caller identity
Alert on error rate spikes
Track access patterns for anomaly detection
Retain audit logs for compliance requirements

Error Handling:

Sanitize error messages (don't leak internal paths)
Distinguish user errors from system errors
Fail closed (deny on uncertainty)
Rate limiting on expensive operations

Threat Model #

Primary threats and mitigations:

Threat	Impact	Mitigation
Server compromise	Attacker gains server capabilities	Process isolation, minimal permissions, input validation
Man-in-the-middle	Intercept/modify JSON-RPC	TLS 1.3, certificate pinning
Credential theft	Unauthorized server access	Secrets management, rotation, short-lived tokens
Prompt injection	Attacker controls tool via crafted input	Input validation, output encoding, context limits
Resource exhaustion	DoS via expensive operations	Rate limiting, timeouts, quotas
Data exfiltration	Sensitive data via tool responses	Data classification, output filtering, audit logging

Prompt injection deserves special attention in MCP contexts. Since LLMs generate tool arguments, an attacker who controls the LLM's context (via a malicious webpage, PDF, or conversation) might craft inputs that abuse available tools.

Defense patterns:

// Validate against expected patterns
const SAFE_EMAIL = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!SAFE_EMAIL.test(args.email)) {
  throw new Error('Invalid email format');
}

// Escape outputs that will be rendered
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Limit context window for sensitive tools
const MAX_CONTEXT = 4000;
if (args.context.length > MAX_CONTEXT) {
  args.context = args.context.slice(0, MAX_CONTEXT) + '...[truncated]';
}

Security-First Deployment Example #

Here's a hardened production MCP server configuration:

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  standardHeaders: true,
  legacyHeaders: false,
});

// API key validation
function validateApiKey(req: Request, res: Response, next: NextFunction) {
  const key = req.headers['x-api-key'];
  const validKey = process.env.MCP_API_KEY;
  
  if (!validKey) {
    console.error('MCP_API_KEY not configured');
    return res.status(500).json({ error: 'Server misconfigured' });
  }
  
  if (key !== validKey) {
    return res.status(401).json({ error: 'Invalid API key' });
  }
  
  next();
}

// Input size limits
app.use(express.json({ limit: '1mb' }));

// Security headers
app.use(helmet());

// Apply to all MCP endpoints
app.use('/mcp', limiter, validateApiKey);

This layering—rate limits, auth, size constraints, security headers—provides defense in depth against common attack vectors.

From Experience #

The security incidents I've seen in MCP deployments aren't exotic protocol exploits—they're basic oversights: servers running as root, API keys committed to GitHub, missing input validation that lets users read arbitrary files. MCP's protocol security is sound; implementation discipline matters more.

My rule: treat MCP servers with the same security rigor as any production API. They have the same blast radius—often more, since they're designed for automated access.

Deploying MCP at Scale: Production Patterns #

Production MCP deployments require orchestrating multiple servers, managing connection lifecycles, implementing health monitoring, and designing for failure modes that don't exist in single-server local development. These patterns bridge from "it works on my machine" to "it works at enterprise scale."

Multi-Server Orchestration #

Enterprise deployments typically involve 10-50+ MCP servers serving different teams, systems, and use cases. Managing this fleet requires a registry pattern and connection management layer.

Server registry architecture:

┌──────────────────────────────────────────────────────────┐
│                      HOST LAYER                           │
│  ┌─────────────────┐                                    │
│  │  MCP Gateway    │◀── Connection management            │
│  │  (Registry)     │◀── Health monitoring               │
│  └────────┬────────┘◀── Request routing                  │
│           │                                              │
│  ┌────────┴────────┐  ┌─────────────┐  ┌─────────────┐ │
│  │ Client Pool A   │  │ Client Pool B│  │ Client Pool C│ │
│  │ (CRM servers)   │  │ (Dev tools)  │  │ (Analytics)  │ │
│  └────────┬────────┘  └──────┬───────┘  └──────┬────────┘ │
└───────────┼──────────────────┼──────────────────┼──────────┘
            │                  │                  │
     ┌──────┴──────┐    ┌──────┴──────┐   ┌──────┴──────┐
     │ CRM Server 1│    │ Git Server  │   │ BI Server   │
     │ CRM Server 2│    │ Docker Svr  │   │ Metrics Svr │
     │ CRM Server 3│    │ K8s Server  │   │ Logs Server │
     └─────────────┘    └─────────────┘   └─────────────┘

Registry implementation (simplified):

interface ServerRegistration {
  id: string;
  name: string;
  transport: 'stdio' | 'http' | 'websocket';
  connection: ConnectionConfig;
  capabilities: ServerCapabilities;
  healthStatus: 'healthy' | 'degraded' | 'unhealthy';
  lastHealthCheck: Date;
  metadata: {
    team: string;
    environment: 'prod' | 'staging' | 'dev';
    priority: number;
  };
}

class MCPGateway {
  private servers = new Map<string, ServerRegistration>();
  private clients = new Map<string, MCPClient>();
  
  async registerServer(config: ServerRegistration): Promise<void> {
    // Create connection
    const client = await this.createClient(config);
    this.clients.set(config.id, client);
    this.servers.set(config.id, config);
    
    // Start health monitoring
    this.startHealthChecks(config.id);
  }
  
  async routeRequest(serverId: string, request: JSONRPCRequest): Promise<JSONRPCResponse> {
    const client = this.clients.get(serverId);
    if (!client || !client.isConnected()) {
      throw new Error(`Server ${serverId} unavailable`);
    }
    return client.sendRequest(request);
  }
}

Connection Pooling and Lifecycle #

Stateful MCP connections benefit from pooling, reconnection logic, and graceful degradation.

Connection states:

DISCONNECTED → CONNECTING → INITIALIZING → READY → OPERATIONAL
     ↑           │              │            │         │
     └───────────┴──────────────┴────────────┴─────────┘
                    (reconnection cycle)

Connection pool manager:

interface PooledConnection {
  id: string;
  client: MCPClient;
  state: ConnectionState;
  createdAt: Date;
  lastUsed: Date;
  requestCount: number;
  errorCount: number;
}

class ConnectionPool {
  private pool: Map<string, PooledConnection> = new Map();
  private maxConnections = 10;
  private maxRequestsPerConnection = 1000;
  private idleTimeoutMs = 5 * 60 * 1000; // 5 minutes
  
  async acquire(serverId: string): Promise<PooledConnection> {
    // Find healthy existing connection
    const existing = this.findHealthyConnection(serverId);
    if (existing) {
      existing.lastUsed = new Date();
      return existing;
    }
    
    // Create new if under limit
    if (this.pool.size < this.maxConnections) {
      return this.createConnection(serverId);
    }
    
    // Evict oldest and create new
    await this.evictOldest();
    return this.createConnection(serverId);
  }
  
  async release(conn: PooledConnection): Promise<void> {
    // Return to pool, schedule idle cleanup
    setTimeout(() => this.checkIdle(conn), this.idleTimeoutMs);
  }
  
  private async checkIdle(conn: PooledConnection): Promise<void> {
    const idleTime = Date.now() - conn.lastUsed.getTime();
    if (idleTime > this.idleTimeoutMs) {
      await this.closeConnection(conn);
    }
  }
}

Reconnection strategy with exponential backoff:

class ResilientClient {
  private reconnectAttempts = 0;
  private maxReconnectAttempts = 5;
  private baseDelayMs = 1000;
  private maxDelayMs = 30000;
  
  async connect(): Promise<void> {
    try {
      await this.doConnect();
      this.reconnectAttempts = 0;
    } catch (error) {
      if (this.reconnectAttempts < this.maxReconnectAttempts) {
        const delay = Math.min(
          this.baseDelayMs * Math.pow(2, this.reconnectAttempts),
          this.maxDelayMs
        );
        this.reconnectAttempts++;
        
        console.log(`Reconnect attempt ${this.reconnectAttempts} in ${delay}ms`);
        await sleep(delay);
        return this.connect();
      }
      throw error;
    }
  }
}

Health Checks and Monitoring #

Production MCP requires health checking at multiple levels:

Check Level	What to Verify	Frequency
Transport	TCP/WebSocket connected	Every 30s
Protocol	JSON-RPC round-trip works	Every 60s
Capability	Tools/resources respond	Every 5 min
Business	End-to-end tool execution	Every 5 min

Health check implementation:

async function healthCheck(client: MCPClient): Promise<HealthStatus> {
  try {
    // Protocol-level ping
    const ping = await client.sendRequest({
      jsonrpc: '2.0',
      id: 'health-ping',
      method: 'ping',
      params: {}
    });
    
    if (ping.error) {
      return { status: 'unhealthy', reason: 'ping_failed' };
    }
    
    // Capability check: list tools
    const tools = await client.sendRequest({
      jsonrpc: '2.0',
      id: 'health-tools',
      method: 'tools/list',
      params: {}
    });
    
    if (tools.error) {
      return { status: 'degraded', reason: 'tools_unavailable' };
    }
    
    return { status: 'healthy', toolCount: tools.result?.tools?.length };
  } catch (error) {
    return { status: 'unhealthy', reason: error.message };
  }
}

Metrics to track:

interface MCPMetrics {
  // Connection metrics
  connectionsActive: Gauge;
  connectionsTotal: Counter;
  connectionErrors: Counter;
  reconnectionAttempts: Counter;
  
  // Request metrics
  requestsTotal: Counter;
  requestDuration: Histogram;
  requestErrors: Counter;
  
  // Tool-specific metrics
  toolInvocations: Counter; // labeled by tool name
  toolDuration: Histogram;  // labeled by tool name
  toolErrors: Counter;      // labeled by tool name, error type
}

Prometheus/Grafana integration example:

import { register, Counter, Histogram } from 'prom-client';

const requestDuration = new Histogram({
  name: 'mcp_request_duration_seconds',
  help: 'Duration of MCP requests',
  labelNames: ['method', 'status'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5]
});

// In request handler:
const end = requestDuration.startTimer();
try {
  const result = await handleRequest(request);
  end({ method: request.method, status: 'success' });
  return result;
} catch (error) {
  end({ method: request.method, status: 'error' });
  throw error;
}

Load Balancing Strategies #

For HTTP transport, standard load balancing applies with MCP-specific considerations:

Sticky sessions vs. stateless:

Approach	When to Use	Implementation
Sticky sessions	Stateful servers that track session data	Cookie or IP-based routing
Stateless	Servers that store no session state	Round-robin, least-connections
Session affinity	Mixed state (some tools stateful, others not)	Route by tool category

Nginx upstream configuration:

upstream mcp_servers {
    least_conn;
    server mcp-server-1:3000 max_fails=3 fail_timeout=30s;
    server mcp-server-2:3000 max_fails=3 fail_timeout=30s;
    server mcp-server-3:3000 max_fails=3 fail_timeout=30s;
    
    keepalive 32;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

server {
    location /mcp {
        proxy_pass http://mcp_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        
        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

Graceful Shutdown Patterns #

Production servers must handle shutdown signals gracefully, completing in-flight requests before exiting:

class GracefulServer {
  private shuttingDown = false;
  private activeRequests = 0;
  
  async start(): Promise<void> {
    // Set up signal handlers
    process.on('SIGTERM', () => this.shutdown('SIGTERM'));
    process.on('SIGINT', () => this.shutdown('SIGINT'));
    
    // Start server
    await this.server.listen(3000);
  }
  
  private async shutdown(signal: string): Promise<void> {
    console.log(`Received ${signal}, starting graceful shutdown...`);
    this.shuttingDown = true;
    
    // Stop accepting new connections
    this.server.close();
    
    // Wait for active requests to complete (with timeout)
    const timeout = setTimeout(() => {
      console.error('Shutdown timeout exceeded, forcing exit');
      process.exit(1);
    }, 30000);
    
    while (this.activeRequests > 0) {
      console.log(`Waiting for ${this.activeRequests} active requests...`);
      await sleep(1000);
    }
    
    clearTimeout(timeout);
    console.log('Graceful shutdown complete');
    process.exit(0);
  }
  
  private async handleRequest(req: Request, res: Response): Promise<void> {
    if (this.shuttingDown) {
      res.status(503).json({ error: 'Server shutting down' });
      return;
    }
    
    this.activeRequests++;
    try {
      await this.processRequest(req, res);
    } finally {
      this.activeRequests--;
    }
  }
}

Kubernetes Deployment Pattern #

Full production-ready Kubernetes manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-customer-api
  labels:
    app: mcp-customer-api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: mcp-customer-api
  template:
    metadata:
      labels:
        app: mcp-customer-api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3000"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: server
        image: registry/mcp-customer-api:v1.2.0
        ports:
        - containerPort: 3000
          name: http
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: database-url
        - name: MCP_API_KEY
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: api-key
        livenessProbe:
          httpGet:
            path: /health/live
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
      terminationGracePeriodSeconds: 60
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-customer-api
spec:
  selector:
    app: mcp-customer-api
  ports:
  - port: 80
    targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-customer-api
spec:
  podSelector:
    matchLabels:
      app: mcp-customer-api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: gateway
    ports:
    - protocol: TCP
      port: 3000
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: database
    ports:
    - protocol: TCP
      port: 5432

Scaling Considerations #

Horizontal vs. vertical scaling for MCP:

Factor	Horizontal Scaling	Vertical Scaling
Tool execution	Distribute load across instances	More CPU/memory per heavy operation
Stateful connections	Requires session affinity	Single instance handles all state
Tool diversity	Shard by tool category	All tools on all instances
Cost efficiency	Better for bursty traffic	Better for steady-state
Complexity	Load balancing, session mgmt	Simpler deployment

My recommendation: Start vertical, add horizontal when:

Single instance CPU > 70% sustained
Memory pressure from concurrent operations
Need multi-region deployment
Different teams need isolated capacity

From Production Experience #

The failures I've seen at scale aren't in the MCP protocol—they're in the surrounding infrastructure: connection leaks, missing health checks, unbounded request queues, secrets not rotating. The protocol is stable; treat the operational layers with the same rigor you apply to any production service.

The patterns above (pools, health checks, graceful shutdown, observability) are table stakes for production MCP. Skip them and you'll learn why the hard way.

The Future of MCP: Standardization and Ecosystem Growth #

MCP's trajectory points toward universal adoption as the interoperability layer for AI agents, with the Linux Foundation's Agentic AI Foundation governing its evolution toward a true industry standard. Understanding this roadmap helps you place strategic bets on your architecture.

Linux Foundation Governance #

Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation in December 2025. This move mirrors the governance model that made Kubernetes, GraphQL, and Node.js successful—vendor-neutral stewardship with multi-party contribution.

Governance implications:

Aspect	Before (Anthropic-controlled)	After (Linux Foundation)
Specification	Anthropic published	Open RFC process with working groups
SDKs	Official TypeScript/Python	Multi-language, community-contributed
Roadmap	Anthropic internal	Public, community-driven
Contribution	CLA required	Standard open-source contribution model
Certification	None	Potential compliance/certification programs

What this means for adopters:

Protocol stability guarantees (breaking changes require committee approval)
Multi-vendor support commitment (not just Anthropic/Claude)
Long-term maintenance assurance (foundation backing)
Legal clarity (standard open-source licensing)

Expanding Client Ecosystem #

MCP client support is broadening beyond Anthropic's applications to become a standard capability in AI tools:

Client	MCP Support Status	Transport	Notes
Claude Desktop	Native	stdio	Reference implementation
Claude Code	Native	stdio	Full server support
Cursor	Native	stdio	Settings UI for server management
Continue	Native	stdio	Open-source IDE extension
Zed	In development	stdio	Rust-based editor
Cody (Sourcegraph)	Planned	HTTP	Enterprise code AI
Custom apps	Via SDKs	All	TypeScript/Python SDKs available

The tipping point: When 3+ major AI clients support MCP natively, server developers achieve true "write once, run anywhere" capability. We're approaching that threshold in mid-2026.

Protocol Evolution Roadmap #

The MCP specification continues evolving through working group proposals:

Released (2024-2025):

Core protocol (tools, resources, prompts)
stdio and HTTP transports
Sampling and roots capabilities
TypeScript and Python SDKs

In development (2026):

Streaming responses: Native support for tool output streaming (partial results, progress updates)
Batch operations: Multi-tool invocation in single request
Enhanced auth: OAuth 2.0 and OIDC integration patterns
Resource templating: Dynamic resource URI patterns with parameters

Proposed (2026-2027):

Server discovery: DNS-SD/mDNS for local server auto-discovery
Federation: Server-to-server delegation and capability proxying
Workflow primitives: Built-in support for multi-step tool orchestration
Evaluation framework: Standardized agent capability testing

Experimental:

Multi-modal content: Native image/audio/video in tool responses
Persistent sessions: Server-side session resumption across reconnections
Capability negotiation 2.0: Fine-grained permission grants per-tool

Strategic Implications for Builders #

Bet on MCP when:

Building tool/platform integrations: If you're exposing an API that AI agents should use, MCP is becoming the standard interface. Building REST + GraphQL + MCP is redundant—MCP covers the AI use case.
Creating AI agent platforms: Your platform needs to speak MCP to participate in the ecosystem. Clients without MCP support will face integration friction.
Standardizing internal AI tools: Enterprises with 50+ internal tools should unify behind MCP rather than maintaining custom integration code per tool.
Building SaaS products: MCP-enabled products can be discovered and used by any MCP-compatible client—distribution advantage similar to being in an app store.

Hold off when:

Simple prototypes with 2-3 tools: Traditional function calling is faster for MVPs.
Deeply embedded single-LLM systems: If you're all-in on OpenAI with no multi-client plans, their native tool calling may suffice.
Real-time latency-critical systems: MCP's JSON-RPC overhead (vs. binary protocols) matters for sub-10ms requirements.

Ecosystem Growth Metrics #

Tracking MCP adoption through observable signals:

Metric	Mid-2024	Early 2026	Trend
Official servers	5	15+	Growing
Community servers	~20	200+	10x growth
SDK languages	2	4 (TS, Python, Go, Rust)	Expanding
Client implementations	2	6+	Accelerating
GitHub repos tagged MCP	~100	1,500+	Viral growth
n8n MCP workflows	~50	500+	Enterprise adoption

The curve is exponential. Early 2026 marks the transition from "early adopter" to "early majority" in the technology adoption lifecycle.

The Multi-Agent Future #

MCP's architecture enables a future where specialized AI agents collaborate through standardized tool interfaces:

┌─────────────────────────────────────────────────────────┐
│              ORCHESTRATOR AGENT                          │
│  "Coordinate the customer onboarding workflow"           │
└────────────┬──────────────┬──────────────┬─────────────┘
             │              │              │
     ┌───────▼─────┐ ┌──────▼──────┐ ┌──────▼──────┐
     │ CRM Agent   │ │ Email Agent │ │ Billing     │
     │ (MCP client)│ │ (MCP client)│ │ Agent       │
     └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
            │               │               │
     ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
     │ CRM Server  │ │ SendGrid    │ │ Stripe      │
     │             │ │ Server      │ │ Server      │
     └─────────────┘ └─────────────┘ └─────────────┘

Each agent specializes:

CRM Agent understands customer data models
Email Agent handles deliverability and templates
Billing Agent manages payment flows and invoicing

The orchestrator doesn't need to understand Stripe's API—it delegates to the Billing Agent via MCP. This composability is MCP's long-term promise: not just tool access, but agent collaboration.

The Standardization Bet #

I'm betting on MCP because protocols win. The history of technology is littered with superior proprietary solutions that lost to good-enough open standards:

REST/HTTP beat CORBA, SOAP, and countless enterprise protocols
SQL beat dozens of database query languages
TCP/IP beat OSI and proprietary networking stacks

MCP is positioned to be the REST of AI agent integration—the good-enough, widely-adopted standard that becomes the default choice by virtue of ubiquity.

The Linux Foundation governance cements this trajectory. Anthropic surrendered control to accelerate adoption—a classic "commoditize your complement" strategy that benefits the entire ecosystem.

Preparing for the Future #

Action items for MCP adopters:

Audit your integrations: Map current custom tool code to potential MCP servers. What's the migration path?
Experiment with official servers: Get hands-on with filesystem, SQLite, and Brave servers to understand the developer experience.
Build one custom server: Nothing teaches MCP like implementing the protocol. Start with a simple internal tool.
Monitor the specification: Follow the Agentic AI Foundation working groups for roadmap visibility.
Contribute to the ecosystem: Open-source your servers, report SDK bugs, write tutorials—protocols thrive on community investment.

The builders who understand MCP deeply in 2026 will have architectural advantages as the ecosystem matures. This isn't just about tooling—it's about positioning for the multi-agent, protocol-driven future of AI systems.

FAQ: Model Context Protocol #

What is Model Context Protocol (MCP)? #

MCP is an open standard protocol that enables AI systems to access external tools, data sources, and workflows through a unified client-server architecture. Anthropic introduced MCP in November 2024 and donated it to the Linux Foundation's Agentic AI Foundation in December 2025. It uses JSON-RPC 2.0 for message exchange and supports stdio, HTTP, and WebSocket transports.

How does MCP differ from traditional REST APIs? #

MCP uses stateful bidirectional sessions with runtime capability negotiation, while REST uses stateless request-response with static endpoint definitions. Key differences: MCP servers declare capabilities during initialization (tools, resources, prompts), support bidirectional messaging (servers can request LLM sampling), and maintain persistent connections. REST requires clients to know endpoints in advance and cannot easily support server-initiated requests or dynamic capability discovery.

What are the three types of MCP server capabilities? #

The three MCP primitives are Tools (executable functions), Resources (addressable data), and Prompts (templated workflows). Tools let agents take actions like querying databases or sending emails. Resources provide read-only data access with URI addressing like file:///path or db://table/row. Prompts package reusable instruction templates for common tasks like "analyze this code for security issues."

Is MCP only for Claude, or does it work with other LLMs? #

MCP is an open standard that works with any AI client implementing the protocol. Claude Desktop, Claude Code, and Cursor support MCP natively as of early 2026. The Linux Foundation governance ensures multi-client support is a priority—OpenAI, Google, and other providers can adopt MCP without licensing concerns. Custom clients can use the TypeScript or Python SDKs to add MCP support.

What transport protocols does MCP support? #

MCP supports three transports: stdio (standard input/output) for local processes, HTTP for remote services, and WebSocket for real-time bidirectional streaming. stdio is the default for local development with Claude Desktop and Cursor. HTTP enables SaaS-style deployments with load balancing and authentication. WebSocket provides lower-latency streaming for high-frequency operations.

How do I build my own MCP server? #

Build an MCP server by implementing the JSON-RPC protocol using the official TypeScript or Python SDKs. The minimal implementation requires: capability declaration during initialization, request handlers for tools/list and tools/call, and transport setup (stdio or HTTP). The SDKs handle protocol boilerplate—you implement business logic in handlers. Test with the MCP Inspector (npx @modelcontextprotocol/inspector) before deploying.

What is the JSON-RPC 2.0 message format in MCP? #

MCP messages follow JSON-RPC 2.0 with four types: Requests (method + params + id), Success Responses (result + id), Error Responses (error code + message + id), and Notifications (method + params, no id). Example request: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"query":"test"}}}. The protocol uses standard error codes (-32700 to -32099) for protocol-level issues and returns business logic errors in the result with isError: true.

How does MCP handle security and permissions? #

MCP uses capability-based security with host-mediated user consent—servers declare what they offer, clients declare what they can provide back, and users approve the intersection. Key security features: 1:1 connection isolation (each server gets its own client), OS-level process boundaries for stdio transport, TLS + authentication for HTTP transport, and granular permission controls (per-tool, per-resource allowlisting). Production deployments should implement API key auth, rate limiting, and input validation.

Can MCP servers call back into the client for LLM sampling? #

Yes, the sampling capability lets MCP servers request LLM completions from the host through sampling/createMessage. This bidirectional pattern enables servers to delegate cognitive work—like summarizing search results or analyzing error messages—without embedding their own LLM API keys. Servers declare sampling need during initialization; clients that support it can provide completions from their configured models (Claude, GPT, Gemini, etc.).

What is the MCP ecosystem and server registry? #

The MCP ecosystem includes 200+ servers ranging from official Anthropic implementations to community-built integrations. Official servers include filesystem, SQLite, Brave Search, GitHub, PostgreSQL, and Slack connectors. The community registry covers databases (MongoDB, Redis), business apps (Salesforce, Stripe), developer tools (Docker, Kubernetes), and AI infrastructure (vector databases, LLM proxies). Access via npm (npx @modelcontextprotocol/server-*) or the mcp.run registry.

How does MCP compare to OpenAI's function calling? #

MCP replaces static function definitions with dynamic server-hosted capabilities. OpenAI function calling embeds tool schemas in API requests with stateless execution—clients receive tool calls and execute locally. MCP uses stateful sessions where servers host tools remotely, enabling runtime capability changes, resource subscriptions, and bidirectional sampling. MCP supports unlimited tools (servers paginate), while OpenAI's function calling is limited by token budget constraints (~128 functions).

Is MCP production-ready for enterprise use? #

Yes, with proper operational patterns: connection pooling, health checks, graceful shutdown, and security hardening. The protocol itself is stable (version 2024-11-05, governed by Linux Foundation). Production deployments should use HTTP transport with TLS 1.3, implement authentication (API keys or OAuth), add rate limiting and request timeouts, and monitor with Prometheus/Grafana metrics. The official SDKs include enterprise features like structured logging and graceful error handling.

Next Steps #

MCP represents the standardization of AI agent integration—the moment when "connecting AI to tools" shifts from custom engineering to protocol-based configuration. The builders who master this protocol now will architect the multi-agent systems of the next decade.

If you're evaluating AI automation for your team or building agent-powered products, the MCP decision is foundational. Protocol choices made today constrain or enable capability for years. I've walked teams through this evaluation for CRM integrations, n8n orchestration layers, and custom agent platforms—each time, MCP's bidirectional design and growing ecosystem proved decisive.

Book an AI automation strategy call to discuss your specific architecture. We'll map your tool requirements to MCP server patterns, evaluate transport and security models for your deployment context, and build a migration path from existing integrations.

The Complete AI Coding Assistant Showdown — Deep comparison of Cursor, Claude Code, and Google Antigravity, with MCP integration patterns for each
Claude Opus 4.7 Release Guide — Coverage of Claude's latest capabilities and how they integrate with MCP workflows

Resources #

MCP Specification — Official protocol documentation
MCP TypeScript SDK — @modelcontextprotocol/sdk on GitHub
MCP Python SDK — mcp package on PyPI
n8n MCP Community Node — Workflow automation integration

The MCP Architecture Guide: How Model Context Protocol Actually Works

Table of Contents

What Is Model Context Protocol and Why Does It Matter? #

The Core Innovation #

Why 2026 Is the Inflection Point #

What MCP Actually Does #

From My Experience #

The Client-Host-Server Architecture Explained #

The Three Architectural Layers #

Hosts #

Clients #

Servers #

The 1:1 Connection Model #

Stateful Session Management #

How This Differs from Traditional API Architecture #

JSON-RPC 2.0: The Wire Protocol Beneath MCP #

Why JSON-RPC Over Alternatives #

The Four Message Types #

1. Request (Client → Server or Server → Client) #

2. Success Response #

3. Error Response #

4. Notification (One-way, no response expected) #

Method Namespacing Convention #

Batching and Transport Considerations #

Debugging MCP Messages #

MCP Server Capabilities: Tools, Resources, and Prompts #

Tools: Executable Functions #

Resources: Addressable Data #

Prompts: Templated Workflows #

Comparing the Three Primitives #

Real-World Server Examples #

Client Capabilities: Sampling, Roots, and Elicitation #

Sampling: Server-Initiated LLM Access #

Roots: Filesystem Scoping #

Elicitation: Gathering Additional Information #

Client Capability Negotiation #

The Bidirectional Power #

Building an MCP Server: A Production-Ready Implementation #

Server Structure Overview #

TypeScript Implementation #

Key Implementation Patterns #

Python Implementation #

Deployment Considerations #

Testing Your Server #

From My Workflow #

MCP Transport Layer: stdio vs HTTP vs WebSockets #

stdio: The Local Default #

HTTP: Remote and Scalable #

WebSocket: Real-Time Bidirectional #

Transport Comparison Matrix #

Security Considerations by Transport #

Choosing Your Transport #

n8n and Remote MCP #

Tool Design Patterns for AI Agents #

Schema Design Best Practices #

Error Handling Strategies #

Making Tools LLM-Friendly #

Advanced Patterns #

Testing Tools for LLM Compatibility #

From Production Experience #

MCP vs Traditional Function Calling: A Technical Comparison #

OpenAI Function Calling #

Claude Function Tools (Legacy) #

LangChain Tools #

Protocol-Level Architectural Differences #

Migration Path: When to Switch #

Real-World Hybrid Approach #

The Strategic View #

Real-World MCP Implementations and Server Registry #

Official Anthropic Reference Servers #

Community Server Ecosystem #

n8n MCP Integration #

Self-Hosted Server Deployment Patterns #

Evaluating Community Servers #

Building Your Own Server Registry #

From the Field #

Security Model and Permission Architecture #

Capability-Based Security #

Server Isolation Patterns #

User Consent Flows #