✨ Offering No-Obligation (FREE) AI Consulting Meetings Daily! Book a MeetingHere (click me)
The MCP Architecture Guide: How Model Context Protocol Actually Works

The MCP Architecture Guide: How Model Context Protocol Actually Works

May 5, 2026(Updated: May 5, 2026)
45 min read
0 comments
William Spurlock
William Spurlock
AI Solutions Architect

Table of Contents

What Is Model Context Protocol and Why Does It Matter? #

Model Context Protocol (MCP) is an open standard that lets AI systems access external tools, data sources, and workflows through a unified client-server architecture. Anthropic introduced MCP on November 25, 2024, then donated it to the Linux Foundation's Agentic AI Foundation in December 2025—cementing its role as the interoperability layer for AI agent ecosystems.

MCP solves what Anthropic calls the N×M integration problem: previously, connecting N different LLM applications to M different data sources required N×M custom integrations. Every new data source needed bespoke connectors for every AI client. MCP inverts this—write one server, connect any client.

The Core Innovation #

Before MCP, integrating Claude with your codebase meant building a custom tool layer. Integrating it with Slack, Notion, or your CRM meant separate auth flows, separate API wrappers, separate error handling. MCP standardizes these integrations at the protocol level:

Before MCP After MCP
Custom API wrappers per integration Declarative server capabilities
Proprietary tool formats per LLM Universal tools/resources/prompts primitives
Fragmented auth and transport Standardized JSON-RPC 2.0 transport
Vendor-locked implementations Open standard governed by Linux Foundation
One-off integration code Reusable, composable server components

Why 2026 Is the Inflection Point #

Three converging trends make MCP unavoidable for serious AI automation:

  1. Ecosystem maturation: The official MCP server registry now lists 200+ community servers covering everything from PostgreSQL to Shopify to Blender. The long tail of integrations is filling in fast.

  2. Client adoption: Cursor, Claude Desktop, Claude Code, and a growing list of AI applications now ship with built-in MCP client support. The "any client" promise is becoming reality.

  3. Enterprise requirements: Production AI deployments need standardized observability, security boundaries, and multi-agent orchestration. MCP's architecture provides the scaffolding for all three.

What MCP Actually Does #

At its simplest, MCP lets an AI agent:

  • Call tools: Execute functions with structured inputs/outputs (search a database, send a Slack message, deploy to Vercel)
  • Access resources: Read contextual data that changes over time (current stock prices, your filesystem, Jira ticket status)
  • Use prompts: Invoke templated workflows ("analyze this codebase for security issues," "onboard a new hire")

The protocol handles capability negotiation, request routing, error propagation, and lifecycle management—so you build the business logic, not the plumbing.

From My Experience #

I've implemented MCP servers for three production systems in the past quarter: a proprietary CRM connector for a sales automation pipeline, a Notion workspace manager for content operations, and a custom deployment orchestrator for web projects. In each case, the time-to-first-integration dropped by roughly 60% compared to building custom tool layers. The standardization pays for itself immediately when you need to swap or add AI clients.

The Client-Host-Server Architecture Explained #

MCP implements a three-layer architecture where hosts manage clients, clients maintain stateful connections to servers, and servers expose capabilities through a standardized protocol. This design separates connection lifecycle management from capability exposure, enabling complex multi-server deployments without coupling concerns.

The Three Architectural Layers #

┌─────────────────────────────────────────────────────────┐
│                      HOST APPLICATION                    │
│    (Claude Desktop, Cursor, Claude Code, Custom App)   │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
│  │   CLIENT 1  │  │   CLIENT 2  │  │   CLIENT N  │     │
│  │  (stdio)    │  │   (HTTP)    │  │  (stdio)    │     │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘     │
│         │                │                │             │
│         └────────────────┼────────────────┘             │
│                          │                              │
└──────────────────────────┼──────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
    ┌──────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
    │  SERVER 1   │ │  SERVER 2   │ │  SERVER 3   │
    │  Filesystem │ │   SQLite    │ │   Custom    │
    │   Server    │ │   Server    │ │   Server    │
    └─────────────┘ └─────────────┘ └─────────────┘

Hosts #

Hosts are LLM applications that initiate and manage connections. A host:

  • Instantiates multiple clients (one per server)
  • Handles user consent and permission flows
  • Routes requests between the LLM and appropriate clients
  • Manages server lifecycle (start, stop, restart)

Examples: Claude Desktop, Cursor IDE, Claude Code CLI, or your custom Next.js application.

Clients #

Clients are connection managers that establish stateful 1:1 sessions with servers. Key characteristics:

  • One client connects to exactly one server
  • Maintains session state including capability negotiation results
  • Handles transport-specific concerns (stdio pipes, HTTP connections)
  • Implements the JSON-RPC protocol layer

Clients are the glue—they don't expose capabilities themselves, but they translate between host expectations and server realities.

Servers #

Servers are capability providers that expose tools, resources, and prompts. Servers:

  • Declare their capabilities during initialization
  • Handle JSON-RPC requests from clients
  • Execute business logic (database queries, API calls, computations)
  • Return structured responses or errors

Servers are stateless with respect to business logic but participate in the stateful session protocol.

The 1:1 Connection Model #

MCP enforces a strict 1:1 relationship between clients and servers. When a host wants to connect to three different capability sources, it spins up three separate client instances:

Connection Client Instance Server Purpose
1 Client A Filesystem Server Read/write local files
2 Client B SQLite Server Query embedded database
3 Client C Custom API Server Internal business logic

This isolation prevents cross-contamination. If your SQLite server crashes, it doesn't affect your filesystem connection. If one server has elevated permissions, its scope is limited to that connection's purpose.

Stateful Session Management #

Unlike REST's stateless request-response model, MCP connections are stateful sessions:

  1. Initialization: Client and server exchange capability declarations
  2. Operation: Bidirectional JSON-RPC messages flow over the transport
  3. Termination: Graceful shutdown or connection drop handling

During initialization, the server announces:

  • Which tools it provides (with JSON schemas)
  • Which resources are available (with URI patterns)
  • Which prompts it supports (with argument schemas)

The client announces what it can offer back:

  • sampling: Can request LLM completions from the host
  • roots: Can provide filesystem root scoping
  • experimental: Feature flags for protocol extensions

How This Differs from Traditional API Architecture #

Aspect REST API MCP Protocol
Connection model Stateless requests Stateful sessions
Capability discovery OpenAPI/static docs Runtime negotiation
Transport HTTP/1.1 or HTTP/2 stdio, HTTP, WebSockets
Error handling HTTP status codes JSON-RPC error objects
Bidirectional Webhooks/polling Native in protocol
Tool exposure Custom endpoints Standardized tools/call

The stateful session model is crucial for AI agent workflows. When Claude asks your filesystem server to "read the last 5 commits and summarize them," that request happens within an established session where permissions, working directory, and context are already negotiated. No need to pass auth tokens and scope parameters with every call.

JSON-RPC 2.0: The Wire Protocol Beneath MCP #

MCP uses JSON-RPC 2.0 as its wire protocol—a lightweight, bidirectional messaging format that supports requests, responses, and notifications over any transport. This choice prioritizes simplicity, debuggability, and universal parser availability over the complexity of binary protocols.

Why JSON-RPC Over Alternatives #

The protocol selection reveals MCP's design priorities:

Protocol Why Not Used MCP's Alternative
REST/HTTP Stateless, verb/noun mismatch with tool semantics Stateful JSON-RPC with explicit method names
gRPC Binary format, schema compilation, stub complexity Human-readable JSON, dynamic capability negotiation
GraphQL Query complexity overkill for tool calls Simple request/response with structured parameters
WebSocket raw No framing, missing request/response correlation JSON-RPC over WebSocket transport

JSON-RPC 2.0 provides exactly what MCP needs:

  • Request-response correlation via id fields
  • Bidirectional messaging (servers can call client capabilities)
  • Human-readable debugging (inspect messages in dev tools)
  • Minimal parser requirements (every language has JSON)

The Four Message Types #

MCP implementations exchange four JSON-RPC message structures:

1. Request (Client → Server or Server → Client) #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": {
      "sql": "SELECT * FROM users WHERE active = true",
      "limit": 10
    }
  }
}

Required fields:

  • jsonrpc: Always "2.0"
  • id: String or integer, unique per sender, non-null
  • method: String naming the operation
  • params: Object or array of arguments

2. Success Response #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "result": {
    "content": [
      {
        "type": "text",
        "text": "[{\"id\": 1, \"name\": \"Alice\"}, ...]"
      }
    ],
    "isError": false
  }
}

The result structure varies by method. Tool calls return content arrays with typed elements (text, images, resource references).

3. Error Response #

{
  "jsonrpc": "2.0",
  "id": "req_001",
  "error": {
    "code": -32602,
    "message": "Invalid params: missing required 'sql' argument",
    "data": {
      "details": "The 'sql' parameter is required for query_database tool"
    }
  }
}

MCP uses standard JSON-RPC error codes with domain-specific extensions:

Code Meaning MCP Context
-32700 Parse error Invalid JSON received
-32600 Invalid request Malformed JSON-RPC structure
-32601 Method not found Server doesn't support this method
-32602 Invalid params Arguments fail schema validation
-32603 Internal error Server crashed during execution
-32000 to -32099 Server errors Implementation-specific issues

4. Notification (One-way, no response expected) #

{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "tok_123",
    "progress": 50,
    "total": 100
  }
}

Notifications omit the id field. MCP uses these for:

  • Progress updates during long-running tool calls
  • Resource change notifications ("file was modified")
  • Logging and telemetry streaming

Method Namespacing Convention #

MCP methods follow a hierarchical namespace pattern:

Namespace Purpose Example Methods
initialize/* Session lifecycle initialize, initialized
tools/* Tool operations tools/list, tools/call
resources/* Resource operations resources/list, resources/read
prompts/* Prompt operations prompts/list, prompts/get
notifications/* One-way updates notifications/progress, notifications/message
sampling/* Client capabilities (server → client) sampling/createMessage
roots/* Filesystem scoping roots/list

This namespacing lets clients route messages to appropriate handlers and enables protocol extensions without collision.

Batching and Transport Considerations #

JSON-RPC supports batch requests—an array of request objects sent together—but MCP implementations typically use single messages for clarity. The protocol doesn't mandate batching, and most MCP SDKs don't implement it, prioritizing simpler streaming semantics instead.

Transport layers handle the actual byte movement:

  • stdio: Messages delimited by newlines over stdin/stdout
  • HTTP: POST requests with JSON-RPC bodies, or WebSocket frames
  • WebSocket: Text frames containing JSON-RPC objects

The transport is swappable; the JSON-RPC protocol remains constant.

Debugging MCP Messages #

One significant advantage of JSON-RPC is inspectability. When an MCP integration fails, you can:

  1. Enable debug logging in your client/host
  2. Capture raw JSON-RPC messages
  3. Validate against the spec using standard JSON tools
  4. Replay requests with curl or similar tools

This transparency accelerates development. When I built my first custom MCP server for a proprietary CMS, being able to pipe stdin/stdout to jq and see exactly where my response structure diverged from the spec saved hours of guesswork.

MCP Server Capabilities: Tools, Resources, and Prompts #

MCP servers expose three primary capability primitives: tools (executable functions), resources (addressable data), and prompts (templated workflows). These primitives map cleanly to LLM needs—agents need to do things (tools), know things (resources), and follow structured procedures (prompts).

Tools: Executable Functions #

Tools are the most commonly used MCP primitive. They're executable functions with typed inputs and outputs that AI agents can invoke. Think of tools as the API surface your agent can call.

A tool declaration includes:

{
  "name": "search_customer_database",
  "description": "Query the customer database by email, name, or ID. Returns matching records with purchase history.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Email, name fragment, or customer ID to search for"
      },
      "includeHistory": {
        "type": "boolean",
        "description": "Whether to include purchase history in results",
        "default": true
      }
    },
    "required": ["query"]
  }
}

Key tool characteristics:

Aspect Specification Best Practice
Naming snake_case or camelCase Use verb-noun patterns (send_email, create_record)
Description 1-3 sentences Be specific about what the tool does and what it returns
Parameters JSON Schema Include types, descriptions, defaults, and constraints
Required fields Explicit array Mark only truly-required params as required
Returns Content array Support text, images, and resource references

Tool execution happens through the tools/call method:

// Request
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_customer_database",
    "arguments": {
      "query": "alice@example.com",
      "includeHistory": true
    }
  }
}

// Response
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Customer found: Alice Smith (ID: 12345). Last purchase: 2026-04-28."
      }
    ],
    "isError": false
  }
}

Resources: Addressable Data #

Resources are contextual data sources that change over time and can be referenced by URI. Unlike tools (which do things), resources are things—files, database records, API endpoints, live streams.

Resources are identified by URIs with semantic schemes:

URI Scheme Example Meaning
file:// file:///home/user/project/src/main.ts Local filesystem path
https:// https://api.example.com/users/123 HTTP-accessible data
db:// db://customers/12345 Database record reference
git:// git://repo/main/src/index.ts Git repository content
Custom internal://workflow/run-789 Application-specific resources

Resource declaration includes metadata, not content:

{
  "uri": "file:///home/user/project/README.md",
  "name": "Project README",
  "description": "Main documentation file for the project",
  "mimeType": "text/markdown"
}

Agents access resources via resources/read:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": {
    "uri": "file:///home/user/project/README.md"
  }
}

Resources support subscriptions via resources/subscribe—the server pushes notifications when the resource changes. This enables live-updating contexts like "monitor this log file" or "track this Jira ticket status."

Prompts: Templated Workflows #

Prompts are pre-defined templates that combine instructions, resources, and parameter placeholders into reusable workflows. They're the MCP primitive closest to traditional "system prompts" or "prompt engineering."

A prompt declaration:

{
  "name": "code_review_security",
  "description": "Analyze code files for common security vulnerabilities (SQL injection, XSS, path traversal)",
  "arguments": [
    {
      "name": "file_path",
      "description": "Path to the file to analyze",
      "required": true
    },
    {
      "name": "language",
      "description": "Programming language (auto-detected if not specified)",
      "required": false
    }
  ]
}

When invoked via prompts/get, the server returns the rendered prompt:

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "description": "Security review prompt for /src/auth/login.ts",
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Please analyze the following TypeScript code for security vulnerabilities. Focus on: SQL injection, XSS, path traversal, and insecure authentication patterns.\n\nFile: /src/auth/login.ts\n\n[resource content inserted here]"
        }
      }
    ]
  }
}

Prompts are powerful for:

  • Standardizing recurring analysis tasks ("review this PR," "summarize this meeting")
  • Packaging institutional knowledge ("onboard a new hire to our API")
  • Creating reusable agentic workflows ("debug this error pattern")

Comparing the Three Primitives #

Primitive Use When Example
Tool The agent needs to take an action Send email, deploy code, create ticket
Resource The agent needs to read current data File contents, API response, database row
Prompt The agent needs a structured procedure Security review, onboarding guide, analysis template

Real-World Server Examples #

The official MCP reference servers demonstrate these primitives:

Filesystem Server (@modelcontextprotocol/server-filesystem):

  • Tools: read_file, write_file, list_directory, search_files
  • Resources: file:// URIs for any accessible file
  • Prompts: None (pure tool/resource server)

SQLite Server (@modelcontextprotocol/server-sqlite):

  • Tools: query (execute SQL), execute (DDL statements)
  • Resources: sqlite:// URIs for table introspection
  • Prompts: Query-building templates

Brave Search Server (@modelcontextprotocol/server-brave-search):

  • Tools: brave_web_search, brave_local_search
  • Resources: None (pure tool server)
  • Prompts: Research workflow templates

When designing your own MCP servers, start with tools—they're the most flexible and most frequently used. Add resources when your server manages addressable data that agents need to reference repeatedly. Add prompts when you find yourself repeating the same multi-step instruction patterns to your AI.

Client Capabilities: Sampling, Roots, and Elicitation #

MCP is bidirectional: while servers expose tools/resources/prompts, clients can offer sampling (LLM access), roots (filesystem scoping), and elicitation (user input gathering) back to servers. This symmetry enables sophisticated agentic patterns where servers delegate cognitive work to the host's LLM.

Sampling: Server-Initiated LLM Access #

Sampling lets an MCP server request LLM completions from the host. This is the protocol's most powerful feature—it allows servers to extend themselves with AI capabilities without embedding their own API keys or model logic.

A server requests sampling via sampling/createMessage:

{
  "jsonrpc": "2.0",
  "id": "samp_001",
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the following customer feedback into 3 key themes:\n\n[feedback content here]"
        }
      }
    ],
    "modelPreferences": {
      "hints": [{"name": "claude-3-5-sonnet"}],
      "intelligencePriority": 0.7,
      "speedPriority": 0.3
    },
    "systemPrompt": "You are a data analysis assistant. Be concise."
  }
}

The host responds with the LLM output:

{
  "jsonrpc": "2.0",
  "id": "samp_001",
  "result": {
    "model": "claude-3-5-sonnet-20241022",
    "stopReason": "end_turn",
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "1. Shipping delays (45% of mentions)\n2. Product quality concerns (30%)\n3. Positive customer service experiences (25%)"
    }
  }
}

Sampling enables powerful patterns:

Pattern How It Works Example Use Case
Delegating analysis Server fetches raw data, asks LLM to analyze Database query → natural language summary
Multi-step reasoning Server performs action, LLM evaluates result Test execution → failure analysis → fix suggestion
Content transformation Server provides content, LLM reformats Markdown → HTML, raw notes → structured report
Error interpretation Server encounters error, LLM explains Stack trace → human-readable explanation

The modelPreferences field lets servers express what they need without hardcoding model names. Priorities (intelligencePriority, speedPriority, costPriority) guide the host's model selection when multiple options exist.

Roots: Filesystem Scoping #

Roots let servers discover which filesystem locations they should operate within. When a client declares roots capability, the server can query roots/list to understand its sandbox boundaries.

Request:

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "roots/list"
}

Response:

{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "roots": [
      {
        "uri": "file:///home/user/projects/my-app",
        "name": "Current Project"
      },
      {
        "uri": "file:///home/user/templates",
        "name": "Shared Templates"
      }
    ]
  }
}

This enables scoped filesystem access without hardcoding paths. A filesystem server configured with roots will:

  • Only show files within root directories
  • Reject operations outside roots (unless explicitly permitted)
  • Present root names in UI for clarity

Roots are particularly valuable for:

  • Multi-project hosts: Cursor working on several repositories simultaneously
  • Sandboxed environments: Restricting server access to specific directories
  • Workspace awareness: Letting servers understand project structure

Elicitation: Gathering Additional Information #

Elicitation allows servers to request additional information from the user through the host's UI. When a server needs clarification or missing parameters, it can ask rather than fail.

MCP's elicitation mechanism uses notifications/message with a request for user interaction:

{
  "jsonrpc": "2.0",
  "method": "notifications/message",
  "params": {
    "level": "info",
    "data": "Searching for customer record. If not found, should I create a new one?"
  }
}

While the core protocol supports basic message notifications, sophisticated elicitation (like prompting for form input or confirmation dialogs) is typically handled through:

  1. Tool responses that signal incomplete data
  2. Follow-up tool calls that gather the missing pieces
  3. Host-specific UI integrations

Client Capability Negotiation #

During initialization, clients announce which capabilities they support:

{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "sampling": {},
      "roots": {
        "listChanged": true
      },
      "experimental": {}
    },
    "clientInfo": {
      "name": "cursor-ide",
      "version": "1.96.0"
    }
  }
}

Servers adapt their behavior based on available client capabilities:

  • If sampling is present, use LLM delegation for complex tasks
  • If roots is present, query for filesystem scope
  • If neither, fall back to simpler patterns

This negotiation makes MCP servers portable across different host environments—a server that uses sampling when available degrades gracefully in hosts that don't support it.

The Bidirectional Power #

The bidirectional design separates MCP from simpler tool-calling schemes. When your n8n workflow server needs to classify an incoming webhook payload, it doesn't need an OpenAI API key—it requests sampling from the host. When your database server needs to summarize query results, it delegates to the LLM rather than implementing its own summarization logic.

This architecture centralizes model management at the host level while distributing capability across specialized servers. The result: cleaner separation of concerns, simpler server implementations, and consistent model behavior across all tools.

Building an MCP Server: A Production-Ready Implementation #

Building an MCP server means implementing three core components: initialization handling, capability declaration, and method routing. The official TypeScript and Python SDKs handle protocol boilerplate, letting you focus on business logic.

Server Structure Overview #

Every MCP server follows this lifecycle:

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Transport  │────▶│ Initialize   │────▶│  Capability  │
│  Connected  │     │ Handshake    │     │  Exchange   │
└─────────────┘     └──────────────┘     └─────────────┘
                                                │
                                                ▼
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Cleanup   │◀────│   Request    │◀────│  Operation  │
│  & Exit     │     │   Handling   │     │   Loop      │
└─────────────┘     └──────────────┘     └─────────────┘

TypeScript Implementation #

Here's a production-grade MCP server built with the official TypeScript SDK:

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
  ErrorCode,
  McpError
} from '@modelcontextprotocol/sdk/types.js';
import { z } from 'zod';

// Business logic: A simple CRM connector
class CRMConnector {
  private customers = new Map<string, any>();
  
  async searchCustomers(query: string): Promise<any[]> {
    return Array.from(this.customers.values())
      .filter(c => c.email.includes(query) || c.name.includes(query));
  }
  
  async getCustomer(id: string): Promise<any | null> {
    return this.customers.get(id) || null;
  }
}

// Schema validation with Zod
const SearchCustomersSchema = z.object({
  query: z.string().min(1).describe('Email or name to search for'),
  limit: z.number().optional().default(10)
});

const GetCustomerSchema = z.object({
  id: z.string().describe('Customer ID')
});

// Create server instance
const server = new Server(
  {
    name: 'crm-connector',
    version: '1.0.0'
  },
  {
    capabilities: {
      tools: {}  // We expose tools
    }
  }
);

const crm = new CRMConnector();

// Handler: List available tools
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: 'search_customers',
        description: 'Search customers by email or name. Returns matching records.',
        inputSchema: {
          type: 'object',
          properties: {
            query: { type: 'string', description: 'Email or name to search for' },
            limit: { type: 'number', description: 'Maximum results (default: 10)' }
          },
          required: ['query']
        }
      },
      {
        name: 'get_customer',
        description: 'Get a single customer by ID',
        inputSchema: {
          type: 'object',
          properties: {
            id: { type: 'string', description: 'Customer ID' }
          },
          required: ['id']
        }
      }
    ]
  };
});

// Handler: Execute tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  
  try {
    if (name === 'search_customers') {
      const validated = SearchCustomersSchema.parse(args);
      const results = await crm.searchCustomers(validated.query);
      
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(results.slice(0, validated.limit), null, 2)
          }
        ],
        isError: false
      };
    }
    
    if (name === 'get_customer') {
      const validated = GetCustomerSchema.parse(args);
      const customer = await crm.getCustomer(validated.id);
      
      if (!customer) {
        return {
          content: [{ type: 'text', text: 'Customer not found' }],
          isError: true
        };
      }
      
      return {
        content: [
          { type: 'text', text: JSON.stringify(customer, null, 2) }
        ],
        isError: false
      };
    }
    
    throw new McpError(
      ErrorCode.MethodNotFound,
      `Unknown tool: ${name}`
    );
    
  } catch (error) {
    if (error instanceof z.ZodError) {
      return {
        content: [
          { type: 'text', text: `Validation error: ${error.errors.map(e => e.message).join(', ')}` }
        ],
        isError: true
      };
    }
    throw error;
  }
});

// Start server with stdio transport
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error('CRM MCP server running on stdio');
}

main().catch(console.error);

Key Implementation Patterns #

Pattern Implementation Rationale
Schema validation Zod for TypeScript, Pydantic for Python Fail fast on invalid inputs, clear error messages
Error handling Return isError: true for expected failures, throw for unexpected Distinguish user errors from system errors
Content typing Use type: 'text' for JSON, plan for type: 'image' Future-proof for multimodal responses
Async operations Always return Promises MCP handles the async/await complexity
Resource cleanup Implement signal handlers Graceful shutdown on SIGTERM

Python Implementation #

The Python SDK follows similar patterns:

import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from pydantic import BaseModel, Field

# Business logic
class CRMConnector:
    def __init__(self):
        self.customers = {}
    
    async def search_customers(self, query: str, limit: int = 10):
        results = [
            c for c in self.customers.values()
            if query.lower() in c.get('email', '').lower() 
            or query.lower() in c.get('name', '').lower()
        ]
        return results[:limit]

# Schema validation
class SearchCustomersInput(BaseModel):
    query: str = Field(..., description="Email or name to search for")
    limit: int = Field(default=10, description="Maximum results")

# Server setup
app = Server("crm-connector")

@app.list_tools()
async def list_tools():
    return [
        {
            "name": "search_customers",
            "description": "Search customers by email or name",
            "inputSchema": SearchCustomersInput.schema()
        }
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "search_customers":
        validated = SearchCustomersInput(**arguments)
        crm = CRMConnector()
        results = await crm.search_customers(validated.query, validated.limit)
        return {
            "content": [{"type": "text", "text": str(results)}],
            "isError": False
        }
    
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as streams:
        await app.run(streams[0], streams[1])

if __name__ == "__main__":
    asyncio.run(main())

Deployment Considerations #

Environment Transport Configuration
Local dev stdio Package as CLI, invoke directly
Docker stdio or HTTP Container per server, compose orchestration
Kubernetes HTTP Sidecar pattern or dedicated pods
Serverless HTTP Lambda/Function with API Gateway
n8n HTTP Self-hosted n8n with MCP community node

For production deployments, I recommend:

  1. Health check endpoint (for HTTP transports): GET /health returning 200 when ready
  2. Graceful shutdown: Handle SIGTERM, finish in-flight requests
  3. Structured logging: JSON logs with correlation IDs
  4. Metrics exposure: Request counts, latency percentiles, error rates
  5. Secret management: API keys via environment variables, never committed

Testing Your Server #

The MCP Inspector tool validates server compliance:

npx @modelcontextprotocol/inspector node dist/server.js

This opens a web UI where you can:

  • Browse available tools/resources/prompts
  • Invoke tools with custom parameters
  • Inspect raw JSON-RPC messages
  • Validate response formats

Always run the inspector before deploying—it's caught schema mismatches and missing error handling in every server I've built.

From My Workflow #

I keep a template repo with this exact structure for new MCP servers. The boilerplate (SDK imports, transport setup, error handling) stays constant. What changes: the business logic in the connector class, the Zod/Pydantic schemas, and the tool definitions. This separation lets me spin up new integrations in under an hour—whether it's connecting to a new API, wrapping an internal service, or exposing a database.

MCP Transport Layer: stdio vs HTTP vs WebSockets #

MCP abstracts the transport layer—servers and clients communicate via JSON-RPC regardless of whether bytes flow over stdio pipes, HTTP connections, or WebSocket frames. Each transport has distinct tradeoffs for security, latency, and deployment flexibility.

stdio: The Local Default #

stdio (standard input/output) is the default transport for local MCP servers. The host spawns the server process and communicates via stdin/stdout, with stderr reserved for logging.

┌─────────────┐         stdin          ┌─────────────┐
│    HOST     │ ◀───────────────────── │   SERVER    │
│  (Cursor)   │ ─────────────────────▶ │   Process   │
│             │         stdout         │             │
│             │ ─────────────────────▶ │             │
│             │         stderr (logs)  │             │
└─────────────┘                        └─────────────┘

stdio characteristics:

Aspect Behavior Implication
Lifecycle Host manages process start/stop Servers must start quickly and exit cleanly
Security OS process isolation Server runs with host's permissions
State Connection dies with process No reconnection, session is ephemeral
Debugging Pipe through jq, inspect stderr Easy to capture and replay messages
Deployment Local-only Cannot expose over network

stdio works best for:

  • Local filesystem servers
  • Development tools and linters
  • Single-user desktop applications
  • Servers that need OS-level sandboxing

HTTP: Remote and Scalable #

HTTP transport enables network-accessible MCP servers, supporting stateless request-response and SSE (Server-Sent Events) for server-to-client streaming.

HTTP transport supports two connection patterns:

Synchronous mode (simple request-response):

Client                    Server
  │    POST /message      │
  │ ───────────────────▶  │
  │    {jsonrpc: "2.0"...}│
  │                       │
  │    200 OK             │
  │ ◀───────────────────  │
  │    {result: {...}}     │

Streaming mode (with SSE for server→client):

Client                    Server
  │    GET /sse           │
  │ ───────────────────▶  │
  │    (establishes SSE)  │
  │ ◀───────────────────  │
  │    event: endpoint    │
  │    data: /message?session=123
  │                       │
  │    POST /message?...  │
  │ ───────────────────▶  │
  │    (client requests)   │
  │                       │
  │    SSE: server pushes │
  │ ◀───────────────────  │

HTTP characteristics:

Aspect Behavior Implication
Network Accessible remotely Enables SaaS-style MCP services
Load balancing HTTP-native Works with existing infrastructure
Security TLS + auth headers Standard web security applies
State Server maintains session state Can reconnect if session preserved
Scaling Horizontal scaling Multiple server instances behind LB

HTTP works best for:

  • Multi-user SaaS integrations
  • Cloud-hosted AI tools
  • Microservices architecture
  • Scenarios requiring load balancing

WebSocket: Real-Time Bidirectional #

WebSocket transport provides full-duplex communication over a single persistent connection, ideal for high-frequency bidirectional messaging.

WebSocket mode is essentially HTTP transport with WebSocket upgrade—once established, both directions stream freely without polling overhead.

When to prefer WebSocket:

  • High-frequency tool invocations (many calls per second)
  • Real-time resource subscriptions (live dashboards)
  • Low-latency requirements (sub-50ms response times)
  • Chat-like interfaces with streaming responses

Transport Comparison Matrix #

Factor stdio HTTP WebSocket
Local deployment Excellent Good Good
Remote/network Impossible Excellent Excellent
Latency Minimal (~1ms) Moderate (~50ms) Low (~10ms)
Debugging Very easy Moderate Moderate
Security model OS process TLS + auth TLS + auth
Scalability Single machine Excellent Good
Firewall friendly N/A Yes Sometimes
Reconnect handling None Session restore Automatic

Security Considerations by Transport #

stdio security:

  • Servers inherit the host process permissions
  • Use OS-level sandboxing (chroot, containers, permission dropping)
  • Audit server code—it's running with full user privileges
  • Environment variables are shared; be careful with secrets

HTTP/WebSocket security:

  • Always use TLS (wss://, https://) in production
  • Implement authentication: API keys, OAuth, or mTLS
  • Validate origin headers to prevent unwanted cross-origin access
  • Rate limiting and request size limits prevent abuse
  • Session timeout and cleanup prevent resource exhaustion

Example HTTP server with basic auth:

import express from 'express';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';

const app = express();

// Simple API key auth
const API_KEY = process.env.MCP_API_KEY;
app.use((req, res, next) => {
  const key = req.headers['x-api-key'];
  if (key !== API_KEY) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  next();
});

// SSE endpoint
app.get('/sse', async (req, res) => {
  const transport = new SSEServerTransport('/message', res);
  const server = new Server({ name: 'http-server', version: '1.0.0' }, 
    { capabilities: { tools: {} } });
  
  // ... set up handlers ...
  
  await server.connect(transport);
});

app.listen(3000);

Choosing Your Transport #

Use stdio when:

  • Building local development tools
  • Need OS-level process isolation
  • Debugging and iterating rapidly
  • Targeting Claude Desktop, Cursor, or similar local hosts

Use HTTP when:

  • Deploying to production SaaS
  • Serving multiple clients/users
  • Need load balancing and horizontal scaling
  • Integrating with existing web infrastructure

Use WebSocket when:

  • Real-time bidirectional streaming is required
  • Minimizing latency overhead matters
  • Building chat-like or live-update interfaces

In practice, most of my production servers use HTTP with SSE for server→client pushes. stdio is reserved for local development and single-user tools. I've yet to need WebSocket specifically—HTTP SSE provides sufficient real-time capability for most agentic workflows.

n8n and Remote MCP #

The n8n MCP community node currently supports HTTP transport, letting you wire remote MCP servers into workflows. This enables patterns like:

  • Calling external AI services through MCP protocol
  • Chaining multiple MCP servers in a single workflow
  • Using MCP tools alongside native n8n nodes

When self-hosting n8n with MCP support, configure your servers with HTTP transport and appropriate authentication. The node handles the JSON-RPC layer, exposing MCP tools as standard n8n workflow operations.

Tool Design Patterns for AI Agents #

Well-designed MCP tools follow consistent patterns: descriptive naming, explicit schemas, graceful error handling, and output formatting optimized for LLM consumption. The goal is making tools discoverable, predictable, and composable for AI agents.

Schema Design Best Practices #

Tool schemas are contracts between human developers and AI systems. Follow these patterns:

1. Use verb-noun naming consistently

Poor Better Best
data get_data get_customer_record
user user_search search_users_by_email
process process_file extract_text_from_pdf

Specific verbs (search, create, update, delete, send, extract) communicate intent. Specific nouns clarify domain scope.

2. Structure parameters for LLM understanding

{
  "properties": {
    "customer_id": {
      "type": "string",
      "description": "Unique customer identifier (e.g., 'cust_12345'). Found in customer record or previous search results."
    },
    "include_inactive": {
      "type": "boolean",
      "description": "If true, includes deactivated customer accounts. Defaults to false (only active customers)."
    }
  },
  "required": ["customer_id"]
}

Each description should answer:

  • What is this parameter? (type, format)
  • Where does the value come from? (context for the LLM)
  • What's the default if omitted?

3. Group related parameters with objects

Instead of flat parameters:

{
  "street": "...",
  "city": "...",
  "zip": "..."
}

Use nested structures:

{
  "address": {
    "street": "...",
    "city": "...",
    "zip": "..."
  }
}

This improves schema readability and lets you reuse object definitions across tools.

4. Enumerate when values are constrained

{
  "priority": {
    "type": "string",
    "enum": ["low", "medium", "high", "urgent"],
    "description": "Ticket priority level"
  }
}

Explicit enums prevent hallucinated values and make tool behavior predictable.

Error Handling Strategies #

MCP distinguishes two error categories: user errors (returned with isError: true) and system errors (thrown as JSON-RPC errors).

User errors (invalid input, business rule violations):

return {
  content: [
    {
      type: 'text',
      text: `Customer not found: No customer exists with ID "${customerId}". Use search_customers to find valid IDs.`
    }
  ],
  isError: true
};

System errors (bugs, network failures, crashes):

throw new McpError(
  ErrorCode.InternalError,
  'Database connection failed',
  { retryable: true, details: error.message }
);

Error message best practices:

Scenario Approach Example
Invalid input Explain what's wrong, suggest fix "Invalid date format. Use YYYY-MM-DD (e.g., 2026-05-05)"
Not found Confirm the search term, suggest alternatives "No user found with email 'alice@test.com'. Did you mean 'alice@example.com'?"
Permission denied Explain required permissions "Access denied. Need 'admin' role to delete records."
Rate limited Include retry timing "Rate limit exceeded. Retry after 60 seconds."
Ambiguous request List options, ask for clarification "Found 3 projects named 'Website'. Specify: Website-2025, Website-2026, or Website-Redesign?"

Making Tools LLM-Friendly #

LLMs work best with tools that follow predictable patterns. Design for agentic workflows:

1. Prefer small, composable tools over monolithic ones

Instead of one massive manage_customer tool:

// Composable, chainable tools:
search_customers       // Find candidates
create_customer        // Create new
get_customer           // Retrieve by ID
update_customer        // Modify existing
delete_customer        // Remove

This lets agents reason step-by-step: search → verify → act.

2. Return structured data in consistent formats

{
  "content": [
    {
      "type": "text",
      "text": "{\n  \"found\": 3,\n  \"customers\": [\n    {\"id\": \"c1\", \"name\": \"Alice\", \"email\": \"a@example.com\"},\n    {\"id\": \"c2\", \"name\": \"Bob\", \"email\": \"b@example.com\"},\n    {\"id\": \"c3\", \"name\": \"Carol\", \"email\": \"c@example.com\"}\n  ]\n}"
    }
  ]
}

Valid JSON in text content lets the LLM parse and reference results in subsequent calls.

3. Include pagination for list operations

interface ListInput {
  limit?: number;      // Default: 20, Max: 100
  offset?: number;     // For pagination
  cursor?: string;     // Alternative to offset
}

// Return metadata with results
{
  "data": [...],
  "pagination": {
    "total": 156,
    "returned": 20,
    "next_cursor": "eyJpZCI6IDIxfQ=="
  }
}

4. Support idempotency where possible

interface CreateInput {
  name: string;
  idempotency_key?: string;  // Client-generated UUID
}

Idempotency keys let agents retry failed calls safely.

Advanced Patterns #

Conditional tool availability:

server.setRequestHandler(ListToolsRequestSchema, async () => {
  const tools = [baseTools];
  
  // Add admin tools only for privileged sessions
  if (session.isAdmin) {
    tools.push(adminTools);
  }
  
  // Add beta tools for early access users
  if (session.features.includes('beta')) {
    tools.push(experimentalTools);
  }
  
  return { tools };
});

Tool chaining with dependencies:

// Tool A returns data that Tool B requires
interface ToolAOutput {
  result_id: string;  // Pass this to Tool B
  preview: string;     // LLM sees this, decides next step
}

interface ToolBInput {
  result_id: string;  // Reference to Tool A's output
  action: 'confirm' | 'modify';
}

Context-aware tools:

interface SearchInput {
  query: string;
  context?: string;  // Additional context from conversation
}

// Usage: agent passes conversation context to improve results
// query: "users who purchased"
// context: "The user is asking about customers from last month's campaign"

Testing Tools for LLM Compatibility #

Before shipping a tool, test with these prompts:

  1. "What tools are available and what do they do?" — Verifies description clarity
  2. "Search for X using the search tool" — Tests parameter understanding
  3. "After finding X, perform Y on it" — Tests chaining capability
  4. "Try to do Z with invalid input" — Tests error handling
  5. "Do the same operation twice" — Tests idempotency

If the LLM struggles with any of these, your tool design needs refinement.

From Production Experience #

The best MCP tools I've built share a common trait: they're boringly predictable. Same parameter names across tools (id, limit, offset). Same response structure ({success: boolean, data: any, error?: string}). Same error message format. This consistency lets agents build reliable mental models—when they learn one tool, they can infer how others work.

The worst tools? Overly clever parameter names, inconsistent return formats, and ambiguous error messages that leave the agent guessing what went wrong. Save the creativity for your business logic, not your interface design.

MCP vs Traditional Function Calling: A Technical Comparison #

MCP represents a fundamental architectural shift from function calling—stateful bidirectional sessions replace stateless tool lists, runtime capability negotiation replaces static schemas, and server-hosted logic replaces client-side tool definitions. Understanding these differences helps you choose the right integration pattern.

OpenAI Function Calling #

OpenAI's function calling (introduced June 2023) embeds tool definitions in the chat completion API:

# OpenAI pattern: Tools defined client-side
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }],
    tool_choice="auto"
)

Key characteristics:

Aspect OpenAI Functions MCP Equivalent
Tool location Defined in API request Discovered from server
Transport HTTP POST to OpenAI JSON-RPC over configurable transport
Execution Client receives tool_call, executes locally Server receives request, executes remotely
Tool count Limited by token budget (~128 functions) Unlimited (servers paginate)
Bidirectional No—one-way LLM→client Yes—server can call sampling
State Stateless per request Stateful session

When OpenAI functions win:

  • Simple integrations with few tools
  • No need for persistent connections
  • Immediate deployment without server infrastructure
  • Working within OpenAI's ecosystem exclusively

When MCP wins:

  • Many tools or dynamically changing tool sets
  • Need for server-side logic and state
  • Multi-LLM support (same server, different clients)
  • Bidirectional capabilities (sampling, roots)

Claude Function Tools (Legacy) #

Claude's pre-MCP tool use followed patterns similar to OpenAI:

// Legacy Claude tool use
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_stock_price',
      description: 'Get current stock price',
      input_schema: {
        type: 'object',
        properties: {
          symbol: { type: 'string' }
        }
      }
    }
  ],
  messages
});

Claude's shift to MCP:

As of late 2024, Claude Desktop and Claude Code transitioned to MCP as their native integration protocol. The legacy tool use API still works for direct API calls, but MCP is now the recommended architecture for:

  • Claude Desktop extensions
  • Local tool integrations
  • Multi-tool environments

The key difference: MCP servers can live anywhere (local process, remote HTTP service, container), while legacy tool definitions were always embedded in API requests.

LangChain Tools #

LangChain provides a Python/TypeScript framework for tool orchestration:

from langchain.tools import BaseTool
from pydantic import BaseModel, Field

class SearchCustomersInput(BaseModel):
    query: str = Field(description="Search query")

class SearchCustomersTool(BaseTool):
    name = "search_customers"
    description = "Search customers"
    args_schema = SearchCustomersInput
    
    def _run(self, query: str):
        return db.search(query)

LangChain vs MCP comparison:

Factor LangChain MCP
Type Framework/library Protocol/specification
Scope End-to-end orchestration Server/client protocol
Tool definition Python/TS classes JSON-RPC over transport
Interoperability LangChain ecosystem Any MCP-compatible client
Deployment Library in application Standalone server process
Language Python/TypeScript Any (SDKs available)

They're complementary, not competing:

LangChain can wrap MCP servers:

from langchain_mcp import MCPTool

# Use MCP server as LangChain tool
mcp_tool = MCPTool.from_server(
    command="python",
    args=["my_mcp_server.py"],
    tool_name="search_customers"
)

# Use in LangChain agent
agent = initialize_agent([mcp_tool, other_tools], llm)

This lets you:

  • Keep existing LangChain orchestration
  • Add MCP servers for standardized integrations
  • Gradually migrate from custom tools to MCP

Protocol-Level Architectural Differences #

Stateful vs Stateless:

OpenAI:    Request → Tools defined → Execute → End
           (Stateless, per-request)

MCP:       Connect → Handshake → Operation → Operation → ... → Disconnect
           (Stateful, persistent session)

Stateful sessions enable:

  • Runtime capability changes (server adds tools dynamically)
  • Resource subscriptions (live updates)
  • Bidirectional sampling (server asks LLM questions)
  • Connection-scoped permissions and roots

Transport flexibility:

Approach Transport Options
OpenAI HTTP only (their API)
Claude Legacy HTTP only (their API)
LangChain In-process, HTTP, custom
MCP stdio, HTTP, WebSocket, custom

Discovery mechanism:

OpenAI:   Developer hardcodes tool definitions in code
          (tools = [...])

MCP:      Client queries server at runtime
          tools/list → Server returns current capabilities

Runtime discovery matters when:

  • Tool availability varies by user permissions
  • Tools are plugins loaded dynamically
  • Same server serves different capabilities to different clients

Migration Path: When to Switch #

Keep using traditional function calling when:

  • Building prototypes or MVPs
  • Only 2-3 simple tools needed
  • Single LLM provider (OpenAI only)
  • No need for persistent connections

Adopt MCP when:

  • Tool count exceeds 10-20
  • Same tools needed across multiple LLM clients
  • Tools require server-side state or logic
  • Building reusable integrations for distribution
  • Need bidirectional capabilities (sampling, roots)
  • Building SaaS products that expose AI-accessible APIs

Real-World Hybrid Approach #

Most production systems I work on use both:

  • MCP servers: Standard integrations (database, CRM, search) that multiple clients need
  • Client-side functions: Application-specific logic tied to UI state, ephemeral operations

Example architecture:

┌─────────────────────────────────────────┐
│           Application (Cursor)           │
│                                          │
│  ┌─────────────────┐  ┌──────────────┐   │
│  │ UI State Tools  │  │ MCP Client   │───┼──▶ Database MCP Server
│  │ (client-side)   │  │              │   │
│  │ - render_preview│  │              │───┼──▶ Search MCP Server
│  │ - validate_input│  │              │   │
│  └─────────────────┘  └──────────────┘   │
│                                          │
└─────────────────────────────────────────┘

Client-side tools handle UI-specific concerns. MCP servers handle reusable business logic. The boundary moves based on how broadly useful a capability is.

The Strategic View #

MCP isn't replacing function calling—it's standardizing the protocol layer beneath it. Think of it like HTTP: you can still build custom protocols, but HTTP wins for interoperability. MCP aims to be the HTTP of AI tool integration.

Anthropic's donation to the Linux Foundation signals this intent. The goal isn't Anthropic-specific tooling—it's a universal standard that OpenAI, Google, Meta, and others can adopt. When that happens, building an MCP server means your tools work everywhere. That's worth the architectural shift.

Real-World MCP Implementations and Server Registry #

The MCP ecosystem includes 200+ servers ranging from official Anthropic reference implementations to community-built integrations for databases, APIs, developer tools, and business applications. Understanding the available servers accelerates your own MCP adoption.

Official Anthropic Reference Servers #

Anthropic maintains a set of reference implementations demonstrating best practices:

Package Capabilities Use Case
@modelcontextprotocol/server-filesystem Tools, Resources Read/write local files, directory traversal
@modelcontextprotocol/server-sqlite Tools, Resources Query SQLite databases, schema introspection
@modelcontextprotocol/server-brave-search Tools Web and local search via Brave API
@modelcontextprotocol/server-fetch Tools HTTP request proxy with content extraction
@modelcontextprotocol/server-github Tools Repository operations, issues, PRs
@modelcontextprotocol/server-postgres Tools, Resources PostgreSQL queries and exploration
@modelcontextprotocol/server-slack Tools Send messages, search channels
@modelcontextprotocol/server-puppeteer Tools Browser automation, screenshots

Installation via npx (stdio transport):

# Filesystem server
npx -y @modelcontextprotocol/server-filesystem /path/to/allowed/dir

# SQLite server
npx -y @modelcontextprotocol/server-sqlite /path/to/database.db

# Brave search (requires API key)
BRAVE_API_KEY=xxx npx -y @modelcontextprotocol/server-brave-search

These servers use stdio transport by default, making them ideal for local development with Claude Desktop or Cursor.

Community Server Ecosystem #

The community registry (accessible via mcp.run and GitHub topics) includes servers for:

Databases & Storage:

  • MongoDB, Redis, DynamoDB
  • Supabase, Firebase
  • ClickHouse, DuckDB

Developer Tools:

  • Docker, Kubernetes
  • Git operations (beyond GitHub)
  • Testing frameworks (Jest, pytest)
  • Code quality (ESLint, Prettier, Ruff)

Business Applications:

  • Salesforce, HubSpot, Pipedrive
  • Stripe, PayPal
  • Notion, Asana, Linear, Monday
  • Shopify, WooCommerce

AI & Data:

  • Vector databases (Pinecone, Weaviate, Chroma)
  • LLM API proxies (OpenAI, Anthropic, Gemini)
  • Data processing (Pandas, Polars transformations)

Search & Knowledge:

  • Elasticsearch, Meilisearch
  • Confluence, Notion wikis
  • Custom knowledge bases

n8n MCP Integration #

n8n (the workflow automation platform) supports MCP through its MCP community node, enabling AI agents to orchestrate complex multi-step automations.

Architecture with n8n:

┌─────────────┐      MCP HTTP       ┌─────────────┐
│   Claude    │ ◀─────────────────▶ │    n8n      │
│   Desktop   │    JSON-RPC         │   MCP Node  │
└─────────────┘                     └──────┬──────┘
                                           │
                          ┌────────────────┼────────────────┐
                          │                │                │
                          ▼                ▼                ▼
                    ┌──────────┐     ┌──────────┐     ┌──────────┐
                    │ HTTP API │     │ Database │     │  Custom  │
                    │ Request  │     │  Query   │     │   Logic  │
                    └──────────┘     └──────────┘     └──────────┘

Use cases for n8n + MCP:

  1. Multi-step API orchestration: Claude requests "onboard a new customer," n8n executes: create CRM record → send welcome email → provision trial account → notify Slack

  2. Conditional logic: MCP tools can include n8n workflow decisions without hardcoding them in the prompt

  3. Integration hub: Single MCP server endpoint exposing dozens of integrated services

  4. Audit and logging: n8n captures every tool invocation with full traceability

Configuration example in n8n:

{
  "mcpServers": {
    "n8n-workflows": {
      "url": "https://n8n.your-domain.com/mcp",
      "headers": {
        "Authorization": "Bearer xxx"
      }
    }
  }
}

Self-Hosted Server Deployment Patterns #

For production use, you'll typically self-host MCP servers rather than relying on npx:

Docker Compose stack:

version: '3.8'
services:
  mcp-filesystem:
    image: mcp/filesystem:latest
    volumes:
      - ./data:/data:ro
    environment:
      - ALLOWED_DIRS=/data
    # Exposes HTTP transport on port 3001
    ports:
      - "3001:3000"
  
  mcp-database:
    image: mcp/postgres:latest
    environment:
      - DATABASE_URL=postgresql://db:5432/prod
    ports:
      - "3002:3000"
  
  # n8n with MCP node
  n8n:
    image: n8nio/n8n:latest
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
    volumes:
      - n8n_data:/home/node/.n8n
    ports:
      - "5678:5678"

Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-customer-api
  template:
    metadata:
      labels:
        app: mcp-customer-api
    spec:
      containers:
      - name: server
        image: your-registry/mcp-customer-api:v1.2.0
        env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: api-key
        ports:
        - containerPort: 3000
        readinessProbe:
          httpGet:
            path: /health
            port: 3000

Evaluating Community Servers #

Before adopting a community MCP server, evaluate:

Criteria What to Check Red Flags
Maintenance Last commit, issue response time >6 months inactive
Security Auth implementation, secret handling Hardcoded credentials, no auth
Completeness Which MCP primitives implemented Missing error handling
Documentation README, examples, parameter docs No usage examples
Test coverage Unit tests, integration tests No test suite
Transport support stdio, HTTP, or both Only stdio (limits deployment)

For production systems, I prefer to fork and audit community servers rather than depend on them directly. The protocol is simple enough that maintaining a fork is lower risk than trusting unaudited code with production data access.

Building Your Own Server Registry #

For organizations with multiple internal MCP servers, consider maintaining a private registry:

{
  "servers": [
    {
      "name": "internal-crm",
      "transport": "http",
      "url": "https://mcp-crm.internal.company.com",
      "auth": "api_key_header",
      "environments": ["prod", "staging"]
    },
    {
      "name": "deploy-orchestrator",
      "transport": "stdio",
      "command": "python -m mcp.deploy",
      "env": { "KUBECONFIG": "/etc/kube/config" }
    }
  ]
}

This registry document becomes your organization's MCP service catalog—what's available, how to connect, and which environments are supported.

From the Field #

In my current client work, the MCP servers getting the most use are:

  1. A custom CRM connector (built on the pattern from Section 6) that bridges a proprietary customer database with Claude Desktop for sales analysis

  2. An n8n-hosted orchestration server that wraps 15+ SaaS integrations into a unified tool set

  3. A filesystem + git combo for codebase operations—let Claude read files, make changes, commit with conventional messages, and push branches

The pattern that works: official servers for standard needs (filesystem, SQLite, Brave), custom servers for proprietary systems, and n8n for complex multi-step workflows. This covers 90% of integration scenarios without reinventing wheels.

Security Model and Permission Architecture #

MCP's security model combines capability-based permissions, transport-layer isolation, and host-mediated user consent to create defense in depth for AI agent integrations. Understanding these layers is critical for production deployments.

Capability-Based Security #

MCP uses capability-based security where servers declare what they can do, clients declare what they can offer, and users approve the intersection. Nothing happens without explicit capability exchange.

The capability negotiation flow:

1. Client sends:    I support sampling, roots
                     I need tools, resources

2. Server responds:   I provide tools, resources
                     I need sampling (optional)

3. User confirms:   Approve this server to use tools?
                     (Host presents UI)

4. Operations:      Both sides respect agreed boundaries

Server-side capabilities (what the server offers):

  • tools: Can execute functions on request
  • resources: Can provide addressable data
  • prompts: Can offer templated workflows
  • logging: Can emit structured log events

Client-side capabilities (what the client offers back):

  • sampling: Can request LLM completions
  • roots: Can provide filesystem scoping
  • experimental: Feature flags for preview features

This negotiation means a server can't call sampling/createMessage unless the client announced sampling support in initialization. The protocol enforces boundaries at the wire level.

Server Isolation Patterns #

Each MCP connection runs in an isolated trust boundary. A compromised server cannot affect other servers or the host beyond its declared capabilities.

Isolation mechanisms by transport:

Transport Isolation Mechanism Scope
stdio OS process boundaries Server can only access process permissions
HTTP Network boundaries + auth Server can only access network-granted resources
WebSocket Same as HTTP Same as HTTP

stdio isolation best practices:

// Drop permissions after starting (Node.js)
import { userInfo } from 'os';

if (process.getuid && process.getuid() === 0) {
  const targetUser = userInfo();
  process.setgid(targetUser.gid);
  process.setuid(targetUser.uid);
}
# Container security (Docker)
FROM node:20-alpine
RUN addgroup -g 1001 -S mcp && adduser -u 1001 -S mcp -G mcp
USER mcp
# Server now runs as non-root

HTTP isolation best practices:

  • Run servers in separate containers/VMs
  • Use network policies to restrict egress
  • Implement request timeouts and size limits
  • Rotate authentication credentials regularly

The host mediates all capability access through user consent interfaces. MCP deliberately doesn't automate consent—every server connection requires explicit user approval.

Typical consent UI patterns:

┌─────────────────────────────────────────┐
│  New Server Connection Request          │
│                                         │
│  Server: "Company CRM MCP"              │
│  Origin: https://mcp.company.com        │
│                                         │
│  This server wants to:                  │
│    ✓ Execute tools (5 available)        │
│    ✓ Access resources (database)        │
│    ✓ Request AI assistance              │
│                                         │
│  [View Details]  [Deny]  [Allow]        │
└─────────────────────────────────────────┘

Granular permissions (implemented by quality hosts):

Level Control Example
Connection Allow/deny server entirely Block untrusted origins
Capability Enable/disable primitives Allow tools but not sampling
Tool Per-tool allowlist Allow search_customers, deny delete_customer
Resource URI pattern filtering Allow file://docs/*, deny file://secrets/*

Claude Desktop implements progressive consent—first connection requires approval, subsequent reconnections may be automatic for trusted servers, and sensitive operations trigger re-confirmation.

Production Security Checklist #

Before deploying MCP servers to production, verify:

Authentication & Authorization:

  • HTTP servers require authentication (API key, OAuth, mTLS)
  • API keys rotate regularly and store in secrets managers
  • Servers validate caller identity before executing sensitive tools
  • No hardcoded credentials in server code or configs

Transport Security:

  • TLS 1.3 for all HTTP/WebSocket connections
  • Certificate pinning for internal services
  • stdio servers run with minimal OS permissions
  • Network policies restrict server egress

Input Validation:

  • JSON Schema validation on all inputs
  • Parameter sanitization (prevent injection attacks)
  • Size limits on request/response payloads
  • Timeout guards on long-running operations

Audit & Monitoring:

  • Log all tool invocations with caller identity
  • Alert on error rate spikes
  • Track access patterns for anomaly detection
  • Retain audit logs for compliance requirements

Error Handling:

  • Sanitize error messages (don't leak internal paths)
  • Distinguish user errors from system errors
  • Fail closed (deny on uncertainty)
  • Rate limiting on expensive operations

Threat Model #

Primary threats and mitigations:

Threat Impact Mitigation
Server compromise Attacker gains server capabilities Process isolation, minimal permissions, input validation
Man-in-the-middle Intercept/modify JSON-RPC TLS 1.3, certificate pinning
Credential theft Unauthorized server access Secrets management, rotation, short-lived tokens
Prompt injection Attacker controls tool via crafted input Input validation, output encoding, context limits
Resource exhaustion DoS via expensive operations Rate limiting, timeouts, quotas
Data exfiltration Sensitive data via tool responses Data classification, output filtering, audit logging

Prompt injection deserves special attention in MCP contexts. Since LLMs generate tool arguments, an attacker who controls the LLM's context (via a malicious webpage, PDF, or conversation) might craft inputs that abuse available tools.

Defense patterns:

// Validate against expected patterns
const SAFE_EMAIL = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!SAFE_EMAIL.test(args.email)) {
  throw new Error('Invalid email format');
}

// Escape outputs that will be rendered
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Limit context window for sensitive tools
const MAX_CONTEXT = 4000;
if (args.context.length > MAX_CONTEXT) {
  args.context = args.context.slice(0, MAX_CONTEXT) + '...[truncated]';
}

Security-First Deployment Example #

Here's a hardened production MCP server configuration:

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  standardHeaders: true,
  legacyHeaders: false,
});

// API key validation
function validateApiKey(req: Request, res: Response, next: NextFunction) {
  const key = req.headers['x-api-key'];
  const validKey = process.env.MCP_API_KEY;
  
  if (!validKey) {
    console.error('MCP_API_KEY not configured');
    return res.status(500).json({ error: 'Server misconfigured' });
  }
  
  if (key !== validKey) {
    return res.status(401).json({ error: 'Invalid API key' });
  }
  
  next();
}

// Input size limits
app.use(express.json({ limit: '1mb' }));

// Security headers
app.use(helmet());

// Apply to all MCP endpoints
app.use('/mcp', limiter, validateApiKey);

This layering—rate limits, auth, size constraints, security headers—provides defense in depth against common attack vectors.

From Experience #

The security incidents I've seen in MCP deployments aren't exotic protocol exploits—they're basic oversights: servers running as root, API keys committed to GitHub, missing input validation that lets users read arbitrary files. MCP's protocol security is sound; implementation discipline matters more.

My rule: treat MCP servers with the same security rigor as any production API. They have the same blast radius—often more, since they're designed for automated access.

Deploying MCP at Scale: Production Patterns #

Production MCP deployments require orchestrating multiple servers, managing connection lifecycles, implementing health monitoring, and designing for failure modes that don't exist in single-server local development. These patterns bridge from "it works on my machine" to "it works at enterprise scale."

Multi-Server Orchestration #

Enterprise deployments typically involve 10-50+ MCP servers serving different teams, systems, and use cases. Managing this fleet requires a registry pattern and connection management layer.

Server registry architecture:

┌──────────────────────────────────────────────────────────┐
│                      HOST LAYER                           │
│  ┌─────────────────┐                                    │
│  │  MCP Gateway    │◀── Connection management            │
│  │  (Registry)     │◀── Health monitoring               │
│  └────────┬────────┘◀── Request routing                  │
│           │                                              │
│  ┌────────┴────────┐  ┌─────────────┐  ┌─────────────┐ │
│  │ Client Pool A   │  │ Client Pool B│  │ Client Pool C│ │
│  │ (CRM servers)   │  │ (Dev tools)  │  │ (Analytics)  │ │
│  └────────┬────────┘  └──────┬───────┘  └──────┬────────┘ │
└───────────┼──────────────────┼──────────────────┼──────────┘
            │                  │                  │
     ┌──────┴──────┐    ┌──────┴──────┐   ┌──────┴──────┐
     │ CRM Server 1│    │ Git Server  │   │ BI Server   │
     │ CRM Server 2│    │ Docker Svr  │   │ Metrics Svr │
     │ CRM Server 3│    │ K8s Server  │   │ Logs Server │
     └─────────────┘    └─────────────┘   └─────────────┘

Registry implementation (simplified):

interface ServerRegistration {
  id: string;
  name: string;
  transport: 'stdio' | 'http' | 'websocket';
  connection: ConnectionConfig;
  capabilities: ServerCapabilities;
  healthStatus: 'healthy' | 'degraded' | 'unhealthy';
  lastHealthCheck: Date;
  metadata: {
    team: string;
    environment: 'prod' | 'staging' | 'dev';
    priority: number;
  };
}

class MCPGateway {
  private servers = new Map<string, ServerRegistration>();
  private clients = new Map<string, MCPClient>();
  
  async registerServer(config: ServerRegistration): Promise<void> {
    // Create connection
    const client = await this.createClient(config);
    this.clients.set(config.id, client);
    this.servers.set(config.id, config);
    
    // Start health monitoring
    this.startHealthChecks(config.id);
  }
  
  async routeRequest(serverId: string, request: JSONRPCRequest): Promise<JSONRPCResponse> {
    const client = this.clients.get(serverId);
    if (!client || !client.isConnected()) {
      throw new Error(`Server ${serverId} unavailable`);
    }
    return client.sendRequest(request);
  }
}

Connection Pooling and Lifecycle #

Stateful MCP connections benefit from pooling, reconnection logic, and graceful degradation.

Connection states:

DISCONNECTED → CONNECTING → INITIALIZING → READY → OPERATIONAL
     ↑           │              │            │         │
     └───────────┴──────────────┴────────────┴─────────┘
                    (reconnection cycle)

Connection pool manager:

interface PooledConnection {
  id: string;
  client: MCPClient;
  state: ConnectionState;
  createdAt: Date;
  lastUsed: Date;
  requestCount: number;
  errorCount: number;
}

class ConnectionPool {
  private pool: Map<string, PooledConnection> = new Map();
  private maxConnections = 10;
  private maxRequestsPerConnection = 1000;
  private idleTimeoutMs = 5 * 60 * 1000; // 5 minutes
  
  async acquire(serverId: string): Promise<PooledConnection> {
    // Find healthy existing connection
    const existing = this.findHealthyConnection(serverId);
    if (existing) {
      existing.lastUsed = new Date();
      return existing;
    }
    
    // Create new if under limit
    if (this.pool.size < this.maxConnections) {
      return this.createConnection(serverId);
    }
    
    // Evict oldest and create new
    await this.evictOldest();
    return this.createConnection(serverId);
  }
  
  async release(conn: PooledConnection): Promise<void> {
    // Return to pool, schedule idle cleanup
    setTimeout(() => this.checkIdle(conn), this.idleTimeoutMs);
  }
  
  private async checkIdle(conn: PooledConnection): Promise<void> {
    const idleTime = Date.now() - conn.lastUsed.getTime();
    if (idleTime > this.idleTimeoutMs) {
      await this.closeConnection(conn);
    }
  }
}

Reconnection strategy with exponential backoff:

class ResilientClient {
  private reconnectAttempts = 0;
  private maxReconnectAttempts = 5;
  private baseDelayMs = 1000;
  private maxDelayMs = 30000;
  
  async connect(): Promise<void> {
    try {
      await this.doConnect();
      this.reconnectAttempts = 0;
    } catch (error) {
      if (this.reconnectAttempts < this.maxReconnectAttempts) {
        const delay = Math.min(
          this.baseDelayMs * Math.pow(2, this.reconnectAttempts),
          this.maxDelayMs
        );
        this.reconnectAttempts++;
        
        console.log(`Reconnect attempt ${this.reconnectAttempts} in ${delay}ms`);
        await sleep(delay);
        return this.connect();
      }
      throw error;
    }
  }
}

Health Checks and Monitoring #

Production MCP requires health checking at multiple levels:

Check Level What to Verify Frequency
Transport TCP/WebSocket connected Every 30s
Protocol JSON-RPC round-trip works Every 60s
Capability Tools/resources respond Every 5 min
Business End-to-end tool execution Every 5 min

Health check implementation:

async function healthCheck(client: MCPClient): Promise<HealthStatus> {
  try {
    // Protocol-level ping
    const ping = await client.sendRequest({
      jsonrpc: '2.0',
      id: 'health-ping',
      method: 'ping',
      params: {}
    });
    
    if (ping.error) {
      return { status: 'unhealthy', reason: 'ping_failed' };
    }
    
    // Capability check: list tools
    const tools = await client.sendRequest({
      jsonrpc: '2.0',
      id: 'health-tools',
      method: 'tools/list',
      params: {}
    });
    
    if (tools.error) {
      return { status: 'degraded', reason: 'tools_unavailable' };
    }
    
    return { status: 'healthy', toolCount: tools.result?.tools?.length };
  } catch (error) {
    return { status: 'unhealthy', reason: error.message };
  }
}

Metrics to track:

interface MCPMetrics {
  // Connection metrics
  connectionsActive: Gauge;
  connectionsTotal: Counter;
  connectionErrors: Counter;
  reconnectionAttempts: Counter;
  
  // Request metrics
  requestsTotal: Counter;
  requestDuration: Histogram;
  requestErrors: Counter;
  
  // Tool-specific metrics
  toolInvocations: Counter; // labeled by tool name
  toolDuration: Histogram;  // labeled by tool name
  toolErrors: Counter;      // labeled by tool name, error type
}

Prometheus/Grafana integration example:

import { register, Counter, Histogram } from 'prom-client';

const requestDuration = new Histogram({
  name: 'mcp_request_duration_seconds',
  help: 'Duration of MCP requests',
  labelNames: ['method', 'status'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5]
});

// In request handler:
const end = requestDuration.startTimer();
try {
  const result = await handleRequest(request);
  end({ method: request.method, status: 'success' });
  return result;
} catch (error) {
  end({ method: request.method, status: 'error' });
  throw error;
}

Load Balancing Strategies #

For HTTP transport, standard load balancing applies with MCP-specific considerations:

Sticky sessions vs. stateless:

Approach When to Use Implementation
Sticky sessions Stateful servers that track session data Cookie or IP-based routing
Stateless Servers that store no session state Round-robin, least-connections
Session affinity Mixed state (some tools stateful, others not) Route by tool category

Nginx upstream configuration:

upstream mcp_servers {
    least_conn;
    server mcp-server-1:3000 max_fails=3 fail_timeout=30s;
    server mcp-server-2:3000 max_fails=3 fail_timeout=30s;
    server mcp-server-3:3000 max_fails=3 fail_timeout=30s;
    
    keepalive 32;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

server {
    location /mcp {
        proxy_pass http://mcp_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        
        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

Graceful Shutdown Patterns #

Production servers must handle shutdown signals gracefully, completing in-flight requests before exiting:

class GracefulServer {
  private shuttingDown = false;
  private activeRequests = 0;
  
  async start(): Promise<void> {
    // Set up signal handlers
    process.on('SIGTERM', () => this.shutdown('SIGTERM'));
    process.on('SIGINT', () => this.shutdown('SIGINT'));
    
    // Start server
    await this.server.listen(3000);
  }
  
  private async shutdown(signal: string): Promise<void> {
    console.log(`Received ${signal}, starting graceful shutdown...`);
    this.shuttingDown = true;
    
    // Stop accepting new connections
    this.server.close();
    
    // Wait for active requests to complete (with timeout)
    const timeout = setTimeout(() => {
      console.error('Shutdown timeout exceeded, forcing exit');
      process.exit(1);
    }, 30000);
    
    while (this.activeRequests > 0) {
      console.log(`Waiting for ${this.activeRequests} active requests...`);
      await sleep(1000);
    }
    
    clearTimeout(timeout);
    console.log('Graceful shutdown complete');
    process.exit(0);
  }
  
  private async handleRequest(req: Request, res: Response): Promise<void> {
    if (this.shuttingDown) {
      res.status(503).json({ error: 'Server shutting down' });
      return;
    }
    
    this.activeRequests++;
    try {
      await this.processRequest(req, res);
    } finally {
      this.activeRequests--;
    }
  }
}

Kubernetes Deployment Pattern #

Full production-ready Kubernetes manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-customer-api
  labels:
    app: mcp-customer-api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: mcp-customer-api
  template:
    metadata:
      labels:
        app: mcp-customer-api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3000"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: server
        image: registry/mcp-customer-api:v1.2.0
        ports:
        - containerPort: 3000
          name: http
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: database-url
        - name: MCP_API_KEY
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: api-key
        livenessProbe:
          httpGet:
            path: /health/live
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
      terminationGracePeriodSeconds: 60
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-customer-api
spec:
  selector:
    app: mcp-customer-api
  ports:
  - port: 80
    targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-customer-api
spec:
  podSelector:
    matchLabels:
      app: mcp-customer-api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: gateway
    ports:
    - protocol: TCP
      port: 3000
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: database
    ports:
    - protocol: TCP
      port: 5432

Scaling Considerations #

Horizontal vs. vertical scaling for MCP:

Factor Horizontal Scaling Vertical Scaling
Tool execution Distribute load across instances More CPU/memory per heavy operation
Stateful connections Requires session affinity Single instance handles all state
Tool diversity Shard by tool category All tools on all instances
Cost efficiency Better for bursty traffic Better for steady-state
Complexity Load balancing, session mgmt Simpler deployment

My recommendation: Start vertical, add horizontal when:

  • Single instance CPU > 70% sustained
  • Memory pressure from concurrent operations
  • Need multi-region deployment
  • Different teams need isolated capacity

From Production Experience #

The failures I've seen at scale aren't in the MCP protocol—they're in the surrounding infrastructure: connection leaks, missing health checks, unbounded request queues, secrets not rotating. The protocol is stable; treat the operational layers with the same rigor you apply to any production service.

The patterns above (pools, health checks, graceful shutdown, observability) are table stakes for production MCP. Skip them and you'll learn why the hard way.

The Future of MCP: Standardization and Ecosystem Growth #

MCP's trajectory points toward universal adoption as the interoperability layer for AI agents, with the Linux Foundation's Agentic AI Foundation governing its evolution toward a true industry standard. Understanding this roadmap helps you place strategic bets on your architecture.

Linux Foundation Governance #

Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation in December 2025. This move mirrors the governance model that made Kubernetes, GraphQL, and Node.js successful—vendor-neutral stewardship with multi-party contribution.

Governance implications:

Aspect Before (Anthropic-controlled) After (Linux Foundation)
Specification Anthropic published Open RFC process with working groups
SDKs Official TypeScript/Python Multi-language, community-contributed
Roadmap Anthropic internal Public, community-driven
Contribution CLA required Standard open-source contribution model
Certification None Potential compliance/certification programs

What this means for adopters:

  • Protocol stability guarantees (breaking changes require committee approval)
  • Multi-vendor support commitment (not just Anthropic/Claude)
  • Long-term maintenance assurance (foundation backing)
  • Legal clarity (standard open-source licensing)

Expanding Client Ecosystem #

MCP client support is broadening beyond Anthropic's applications to become a standard capability in AI tools:

Client MCP Support Status Transport Notes
Claude Desktop Native stdio Reference implementation
Claude Code Native stdio Full server support
Cursor Native stdio Settings UI for server management
Continue Native stdio Open-source IDE extension
Zed In development stdio Rust-based editor
Cody (Sourcegraph) Planned HTTP Enterprise code AI
Custom apps Via SDKs All TypeScript/Python SDKs available

The tipping point: When 3+ major AI clients support MCP natively, server developers achieve true "write once, run anywhere" capability. We're approaching that threshold in mid-2026.

Protocol Evolution Roadmap #

The MCP specification continues evolving through working group proposals:

Released (2024-2025):

  • Core protocol (tools, resources, prompts)
  • stdio and HTTP transports
  • Sampling and roots capabilities
  • TypeScript and Python SDKs

In development (2026):

  • Streaming responses: Native support for tool output streaming (partial results, progress updates)
  • Batch operations: Multi-tool invocation in single request
  • Enhanced auth: OAuth 2.0 and OIDC integration patterns
  • Resource templating: Dynamic resource URI patterns with parameters

Proposed (2026-2027):

  • Server discovery: DNS-SD/mDNS for local server auto-discovery
  • Federation: Server-to-server delegation and capability proxying
  • Workflow primitives: Built-in support for multi-step tool orchestration
  • Evaluation framework: Standardized agent capability testing

Experimental:

  • Multi-modal content: Native image/audio/video in tool responses
  • Persistent sessions: Server-side session resumption across reconnections
  • Capability negotiation 2.0: Fine-grained permission grants per-tool

Strategic Implications for Builders #

Bet on MCP when:

  1. Building tool/platform integrations: If you're exposing an API that AI agents should use, MCP is becoming the standard interface. Building REST + GraphQL + MCP is redundant—MCP covers the AI use case.

  2. Creating AI agent platforms: Your platform needs to speak MCP to participate in the ecosystem. Clients without MCP support will face integration friction.

  3. Standardizing internal AI tools: Enterprises with 50+ internal tools should unify behind MCP rather than maintaining custom integration code per tool.

  4. Building SaaS products: MCP-enabled products can be discovered and used by any MCP-compatible client—distribution advantage similar to being in an app store.

Hold off when:

  1. Simple prototypes with 2-3 tools: Traditional function calling is faster for MVPs.

  2. Deeply embedded single-LLM systems: If you're all-in on OpenAI with no multi-client plans, their native tool calling may suffice.

  3. Real-time latency-critical systems: MCP's JSON-RPC overhead (vs. binary protocols) matters for sub-10ms requirements.

Ecosystem Growth Metrics #

Tracking MCP adoption through observable signals:

Metric Mid-2024 Early 2026 Trend
Official servers 5 15+ Growing
Community servers ~20 200+ 10x growth
SDK languages 2 4 (TS, Python, Go, Rust) Expanding
Client implementations 2 6+ Accelerating
GitHub repos tagged MCP ~100 1,500+ Viral growth
n8n MCP workflows ~50 500+ Enterprise adoption

The curve is exponential. Early 2026 marks the transition from "early adopter" to "early majority" in the technology adoption lifecycle.

The Multi-Agent Future #

MCP's architecture enables a future where specialized AI agents collaborate through standardized tool interfaces:

┌─────────────────────────────────────────────────────────┐
│              ORCHESTRATOR AGENT                          │
│  "Coordinate the customer onboarding workflow"           │
└────────────┬──────────────┬──────────────┬─────────────┘
             │              │              │
     ┌───────▼─────┐ ┌──────▼──────┐ ┌──────▼──────┐
     │ CRM Agent   │ │ Email Agent │ │ Billing     │
     │ (MCP client)│ │ (MCP client)│ │ Agent       │
     └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
            │               │               │
     ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
     │ CRM Server  │ │ SendGrid    │ │ Stripe      │
     │             │ │ Server      │ │ Server      │
     └─────────────┘ └─────────────┘ └─────────────┘

Each agent specializes:

  • CRM Agent understands customer data models
  • Email Agent handles deliverability and templates
  • Billing Agent manages payment flows and invoicing

The orchestrator doesn't need to understand Stripe's API—it delegates to the Billing Agent via MCP. This composability is MCP's long-term promise: not just tool access, but agent collaboration.

The Standardization Bet #

I'm betting on MCP because protocols win. The history of technology is littered with superior proprietary solutions that lost to good-enough open standards:

  • REST/HTTP beat CORBA, SOAP, and countless enterprise protocols
  • SQL beat dozens of database query languages
  • TCP/IP beat OSI and proprietary networking stacks

MCP is positioned to be the REST of AI agent integration—the good-enough, widely-adopted standard that becomes the default choice by virtue of ubiquity.

The Linux Foundation governance cements this trajectory. Anthropic surrendered control to accelerate adoption—a classic "commoditize your complement" strategy that benefits the entire ecosystem.

Preparing for the Future #

Action items for MCP adopters:

  1. Audit your integrations: Map current custom tool code to potential MCP servers. What's the migration path?

  2. Experiment with official servers: Get hands-on with filesystem, SQLite, and Brave servers to understand the developer experience.

  3. Build one custom server: Nothing teaches MCP like implementing the protocol. Start with a simple internal tool.

  4. Monitor the specification: Follow the Agentic AI Foundation working groups for roadmap visibility.

  5. Contribute to the ecosystem: Open-source your servers, report SDK bugs, write tutorials—protocols thrive on community investment.

The builders who understand MCP deeply in 2026 will have architectural advantages as the ecosystem matures. This isn't just about tooling—it's about positioning for the multi-agent, protocol-driven future of AI systems.

FAQ: Model Context Protocol #

What is Model Context Protocol (MCP)? #

MCP is an open standard protocol that enables AI systems to access external tools, data sources, and workflows through a unified client-server architecture. Anthropic introduced MCP in November 2024 and donated it to the Linux Foundation's Agentic AI Foundation in December 2025. It uses JSON-RPC 2.0 for message exchange and supports stdio, HTTP, and WebSocket transports.

How does MCP differ from traditional REST APIs? #

MCP uses stateful bidirectional sessions with runtime capability negotiation, while REST uses stateless request-response with static endpoint definitions. Key differences: MCP servers declare capabilities during initialization (tools, resources, prompts), support bidirectional messaging (servers can request LLM sampling), and maintain persistent connections. REST requires clients to know endpoints in advance and cannot easily support server-initiated requests or dynamic capability discovery.

What are the three types of MCP server capabilities? #

The three MCP primitives are Tools (executable functions), Resources (addressable data), and Prompts (templated workflows). Tools let agents take actions like querying databases or sending emails. Resources provide read-only data access with URI addressing like file:///path or db://table/row. Prompts package reusable instruction templates for common tasks like "analyze this code for security issues."

Is MCP only for Claude, or does it work with other LLMs? #

MCP is an open standard that works with any AI client implementing the protocol. Claude Desktop, Claude Code, and Cursor support MCP natively as of early 2026. The Linux Foundation governance ensures multi-client support is a priority—OpenAI, Google, and other providers can adopt MCP without licensing concerns. Custom clients can use the TypeScript or Python SDKs to add MCP support.

What transport protocols does MCP support? #

MCP supports three transports: stdio (standard input/output) for local processes, HTTP for remote services, and WebSocket for real-time bidirectional streaming. stdio is the default for local development with Claude Desktop and Cursor. HTTP enables SaaS-style deployments with load balancing and authentication. WebSocket provides lower-latency streaming for high-frequency operations.

How do I build my own MCP server? #

Build an MCP server by implementing the JSON-RPC protocol using the official TypeScript or Python SDKs. The minimal implementation requires: capability declaration during initialization, request handlers for tools/list and tools/call, and transport setup (stdio or HTTP). The SDKs handle protocol boilerplate—you implement business logic in handlers. Test with the MCP Inspector (npx @modelcontextprotocol/inspector) before deploying.

What is the JSON-RPC 2.0 message format in MCP? #

MCP messages follow JSON-RPC 2.0 with four types: Requests (method + params + id), Success Responses (result + id), Error Responses (error code + message + id), and Notifications (method + params, no id). Example request: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"query":"test"}}}. The protocol uses standard error codes (-32700 to -32099) for protocol-level issues and returns business logic errors in the result with isError: true.

How does MCP handle security and permissions? #

MCP uses capability-based security with host-mediated user consent—servers declare what they offer, clients declare what they can provide back, and users approve the intersection. Key security features: 1:1 connection isolation (each server gets its own client), OS-level process boundaries for stdio transport, TLS + authentication for HTTP transport, and granular permission controls (per-tool, per-resource allowlisting). Production deployments should implement API key auth, rate limiting, and input validation.

Can MCP servers call back into the client for LLM sampling? #

Yes, the sampling capability lets MCP servers request LLM completions from the host through sampling/createMessage. This bidirectional pattern enables servers to delegate cognitive work—like summarizing search results or analyzing error messages—without embedding their own LLM API keys. Servers declare sampling need during initialization; clients that support it can provide completions from their configured models (Claude, GPT, Gemini, etc.).

What is the MCP ecosystem and server registry? #

The MCP ecosystem includes 200+ servers ranging from official Anthropic implementations to community-built integrations. Official servers include filesystem, SQLite, Brave Search, GitHub, PostgreSQL, and Slack connectors. The community registry covers databases (MongoDB, Redis), business apps (Salesforce, Stripe), developer tools (Docker, Kubernetes), and AI infrastructure (vector databases, LLM proxies). Access via npm (npx @modelcontextprotocol/server-*) or the mcp.run registry.

How does MCP compare to OpenAI's function calling? #

MCP replaces static function definitions with dynamic server-hosted capabilities. OpenAI function calling embeds tool schemas in API requests with stateless execution—clients receive tool calls and execute locally. MCP uses stateful sessions where servers host tools remotely, enabling runtime capability changes, resource subscriptions, and bidirectional sampling. MCP supports unlimited tools (servers paginate), while OpenAI's function calling is limited by token budget constraints (~128 functions).

Is MCP production-ready for enterprise use? #

Yes, with proper operational patterns: connection pooling, health checks, graceful shutdown, and security hardening. The protocol itself is stable (version 2024-11-05, governed by Linux Foundation). Production deployments should use HTTP transport with TLS 1.3, implement authentication (API keys or OAuth), add rate limiting and request timeouts, and monitor with Prometheus/Grafana metrics. The official SDKs include enterprise features like structured logging and graceful error handling.

Next Steps #

MCP represents the standardization of AI agent integration—the moment when "connecting AI to tools" shifts from custom engineering to protocol-based configuration. The builders who master this protocol now will architect the multi-agent systems of the next decade.

If you're evaluating AI automation for your team or building agent-powered products, the MCP decision is foundational. Protocol choices made today constrain or enable capability for years. I've walked teams through this evaluation for CRM integrations, n8n orchestration layers, and custom agent platforms—each time, MCP's bidirectional design and growing ecosystem proved decisive.

Book an AI automation strategy call to discuss your specific architecture. We'll map your tool requirements to MCP server patterns, evaluate transport and security models for your deployment context, and build a migration path from existing integrations.

Resources #

0 views • 0 likes