The 5 Most Common MCP Server Failures (and How to Debug Them)

Learn how to identify and fix the most frequent MCP server issues — from connection timeouts to silent failures — with practical debugging steps and observability tools.

MCP servers sit at the boundary between AI clients and the tools they depend on, which makes them a frequent source of hard-to-diagnose failures. When something breaks, you often get a cryptic error from the client side with no visibility into what actually happened on the server. This article covers the five failure modes we see most often, with concrete debugging steps and the observability patterns that make each one easier to catch.

Connection timeouts and transport failures

MCP supports three transports: stdio (for local processes), Streamable HTTP (the current standard for remote servers), and SSE (Server-Sent Events, the legacy remote transport). All three can fail in ways that surface as vague connection errors on the client side.

What it looks like

With stdio transport, you typically see the MCP client report that the server process exited unexpectedly or stopped responding:

Error: MCP server process exited with code 1
  at StdioTransport.onClose (transport.js:142)

With HTTP-based transports (Streamable HTTP or SSE), failures usually manifest as connection timeouts or dropped streams:

Error: SSE connection to mcp-server timed out after 30000ms
  at SSETransport.connect (sse-transport.js:87)

Common causes

  • Process crashes: The MCP server process throws an unhandled exception and exits. This is the most common stdio failure — the client sees the process disappear with no error message forwarded.
  • Buffer overflows: Large tool responses can overflow stdio pipe buffers, causing the process to hang or crash. This is especially common with tools that return base64-encoded files or large datasets.
  • Network interruptions: For HTTP-based transports (Streamable HTTP and SSE), transient network issues, proxy timeouts, or load balancer idle-connection limits can silently drop the connection.
  • Startup failures: The server fails during initialization (bad config, missing environment variables, port conflicts) before it even accepts connections.

How to debug

Start by checking whether the server process is still running:

# For stdio-based servers, check if the process is alive
ps aux | grep my-mcp-server

# Check stderr output — MCP servers often log errors there
# If you launched via a client, check the client's logs for captured stderr

For HTTP-based transports, verify basic connectivity:

# Test a Streamable HTTP endpoint (the spec requires the POST to accept
# both JSON and SSE responses)
curl -X POST http://localhost:3001/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl","version":"0.0.0"}}}'

# Test a legacy SSE endpoint
curl -N -H "Accept: text/event-stream" http://localhost:3001/sse

# Check if the port is even open
lsof -i :3001

If the server is crashing during tool execution, add a top-level error handler to catch unhandled rejections:

process.on('uncaughtException', (err) => {
  console.error('[MCP Server] Uncaught exception:', err);
});

process.on('unhandledRejection', (reason) => {
  console.error('[MCP Server] Unhandled rejection:', reason);
});
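One stdio-specific gotcha worth guarding against: with stdio transport, stdout carries the JSON-RPC stream itself, so a stray `console.log` inside a handler can corrupt the protocol and masquerade as a transport failure. Route all diagnostics through a helper that writes to stderr:

```typescript
// With stdio transport, stdout is the JSON-RPC channel, so a stray
// console.log inside a handler can corrupt the protocol stream.
// Send all diagnostics to stderr, which clients typically capture as server logs.
function log(...args: unknown[]): void {
  // console.error writes to process.stderr in Node
  console.error('[MCP Server]', ...args);
}
```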

How MCPWatch helps

MCPWatch records each MCP operation as an event with start/end timestamps, duration, and a status. When a transport failure occurs mid-request, you can see which tool call was in progress and how long it had been running. MCPWatch also records server lifecycle events — initialize and close — so you can correlate transport failures with server restarts. The SDK auto-detects your transport type (stdio, Streamable HTTP, or SSE) and includes it in every event for filtering.

Tool not found errors

Tool-not-found errors happen when a client tries to invoke a tool name that the server does not recognize. This is one of the most common errors during development and after deployments.

What it looks like

The MCP protocol returns a standard error response when a tool name does not match any registered handler:

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32601,
    "message": "Tool not found: search_documents",
    "data": {
      "availableTools": ["searchDocuments", "getDocument", "listCollections"]
    }
  }
}

Notice the mismatch: the client called search_documents (snake_case) but the server registered searchDocuments (camelCase).

Common causes

  • Naming convention mismatches: Snake_case vs. camelCase vs. kebab-case between client configuration and server registration.
  • Version skew: The server was updated and a tool was renamed or removed, but the client configuration still references the old name.
  • Conditional registration: Tools that are only registered under certain conditions (feature flags, environment checks) may be missing when you expect them to be available.
  • Typos in client config: Manual tool name entries in claude_desktop_config.json or similar client configuration files.

How to debug

First, verify what tools the server actually exposes by calling the tools/list method:

// If you have access to the MCP client, list available tools
const response = await client.request(
  { method: 'tools/list' },
  ListToolsResultSchema
);
console.log('Available tools:', response.tools.map(t => t.name));

Then compare that list against your client configuration. Pay close attention to casing and separators.
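The comparison is easy to automate. A small helper (hypothetical, not part of the MCP SDK) that normalizes casing and separators will flag near-miss names like search_documents vs. searchDocuments:

```typescript
// Hypothetical helper (not part of the MCP SDK): normalize tool names so
// snake_case, camelCase, and kebab-case variants compare equal.
function normalizeToolName(name: string): string {
  return name
    .replace(/([a-z0-9])([A-Z])/g, '$1_$2') // searchDocuments -> search_Documents
    .replace(/-/g, '_') // kebab-case -> snake_case
    .toLowerCase();
}

// Returns the registered name that matches the requested one up to convention
function findNearMiss(requested: string, available: string[]): string | undefined {
  return available.find(
    (name) => normalizeToolName(name) === normalizeToolName(requested),
  );
}
```

Running the requested name from the error against the `availableTools` list immediately tells you whether you have a casing mismatch or a genuinely missing tool.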

If you suspect version skew, check the server’s tool registration code directly:

// server.ts — verify the exact name string
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'searchDocuments', // <-- This is the canonical name
      description: 'Search the document index',
      inputSchema: { /* ... */ },
    },
  ],
}));
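One way to prevent rename drift is to keep tool names in a single exported constant that both the registration code and any internal callers import — a project convention, not an SDK feature:

```typescript
// tool-names.ts — hypothetical single source of truth for tool names, so a
// rename touches one file and TypeScript flags every stale reference.
export const TOOL_NAMES = {
  searchDocuments: 'searchDocuments',
  getDocument: 'getDocument',
  listCollections: 'listCollections',
} as const;

export type ToolName = (typeof TOOL_NAMES)[keyof typeof TOOL_NAMES];
```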

How MCPWatch helps

In the MCPWatch dashboard, errors are automatically grouped by type, message, and originating server. Tool-not-found errors cluster together, making it immediately visible which tool names are failing and how often. The error detail view shows the requested tool name alongside the server’s registered tool list at the time of the request, so you can spot naming mismatches without manually querying the server. If you have multiple MCP servers, MCPWatch’s cross-server error grouping reveals whether the same tool name is failing across different servers — a strong signal of a client-side config issue rather than a server bug.

Schema validation failures

Every MCP tool declares a JSON Schema for its input parameters and, optionally, its output shape. When the actual data does not conform to the declared schema, you get a validation error.

What it looks like

Schema validation errors return structured information about what failed:

{
  "jsonrpc": "2.0",
  "id": 3,
  "error": {
    "code": -32602,
    "message": "Invalid params: /query must be string, got number",
    "data": {
      "tool": "searchDocuments",
      "validationErrors": [
        {
          "path": "/query",
          "expected": "string",
          "received": "number",
          "value": 42
        }
      ]
    }
  }
}

Common causes

  • Type coercion issues: AI clients sometimes pass numbers where strings are expected, or vice versa. The integer 42 vs. the string "42" is a classic example.
  • Missing required fields: The client omits a required parameter. This happens frequently when AI models generate tool calls and skip optional-looking parameters that are actually required.
  • Extra fields with additionalProperties: false: The client sends fields the schema does not declare, and strict validation rejects the entire request.
  • Output validation failures: Less common but harder to debug — the tool executes successfully but the return value does not match the declared output schema.

How to debug

Start by comparing the actual payload against the tool’s declared schema:

// Get the tool's schema
const tools = await client.request(
  { method: 'tools/list' },
  ListToolsResultSchema
);

const searchTool = tools.tools.find(t => t.name === 'searchDocuments');
console.log('Input schema:', JSON.stringify(searchTool.inputSchema, null, 2));

// Compare against the actual arguments being sent
console.log('Arguments sent:', JSON.stringify(request.params.arguments, null, 2));

For output validation failures, add explicit validation on the server side before returning:

import Ajv from 'ajv';

const ajv = new Ajv();
const validate = ajv.compile(outputSchema);

const result = await executeSearch(query);
if (!validate(result)) {
  console.error('Output validation errors:', validate.errors);
  // Log the actual result shape for debugging
  console.error('Actual output:', JSON.stringify(result));
}

How MCPWatch helps

MCPWatch captures the full request and response payload for every tool invocation. When a schema validation error occurs, you can open the trace detail and see exactly what was sent (the raw arguments object) alongside the tool’s declared schema. This eliminates the guessing game of “what did the client actually pass?” — you have the literal JSON. For output validation issues, MCPWatch captures the server’s response body, so you can compare the actual output against the declared schema directly in the dashboard without needing to reproduce the call.

Resource permission errors

MCP servers expose resources — files, database records, API endpoints — that clients can read. When access controls are misconfigured, clients hit permission errors that can be confusing to diagnose.

What it looks like

Permission errors typically surface as error responses on resource read attempts:

{
  "jsonrpc": "2.0",
  "id": 5,
  "error": {
    "code": -32603,
    "message": "Access denied: insufficient permissions for resource",
    "data": {
      "uri": "resource:///documents/internal/roadmap.md",
      "requiredScope": "documents:read:internal",
      "currentScopes": ["documents:read:public"]
    }
  }
}

Common causes

  • Missing or expired API keys: The MCP server proxies requests to an upstream API, and the credentials have expired or lack the required scopes.
  • Resource URI path traversal: A client constructs a resource URI that references a path outside the allowed directory or namespace.
  • Environment mismatch: Credentials that work in development (with broad permissions) fail in staging or production where scopes are locked down.
  • Scope escalation via tool chaining: An AI client chains multiple tool calls where an early call returns resource URIs that the client does not have permission to access in subsequent calls.

How to debug

Start by verifying what the server is actually checking. Add logging around your authorization logic:

server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
  const uri = request.params.uri;
  const scopes = getClientScopes(request); // however your auth works

  console.log(`[Auth] Resource read attempt:
    URI: ${uri}
    Client scopes: ${JSON.stringify(scopes)}
    Required: ${getRequiredScope(uri)}`);

  if (!hasPermission(scopes, uri)) {
    throw new McpError(
      ErrorCode.InvalidRequest,
      `Access denied for resource: ${uri}`
    );
  }

  return readResource(uri);
});

Check your resource URI patterns. A common issue is that resource templates do not properly validate URI components:

// Vulnerable: allows path traversal
const filePath = `./data/${uri.replace('resource:///', '')}`;

// Safer: resolve the path and confirm it stays inside the data directory
const resourceName = uri.replace('resource:///', '');
const dataDir = path.resolve('./data');
const safePath = path.resolve(dataDir, resourceName);
if (safePath !== dataDir && !safePath.startsWith(dataDir + path.sep)) {
  throw new McpError(ErrorCode.InvalidRequest, 'Invalid resource path');
}

How MCPWatch helps

MCPWatch traces show the full request chain, including nested operations within a single MCP call. When a resource permission error occurs, you can see the authorization check as a distinct span in the trace waterfall — including what scopes were evaluated, what the decision was, and how long the auth check took. This is especially valuable for scope escalation bugs: the trace shows the sequence of tool calls that led to the unauthorized resource access, so you can see exactly which earlier call returned the problematic URI. MCPWatch alerts can also be configured to fire on spikes in permission errors, which is a strong signal of either a misconfiguration rollout or an unauthorized access attempt.

Silent failures with no error response

This is the hardest category of failure to diagnose. The MCP server returns a successful response — status code 200, no error field — but the result is empty, incomplete, or wrong. No error is thrown, no alert fires, and the AI client continues with bad data.

What it looks like

Silent failures do not produce error messages. Instead, they look like normal responses with suspicious content:

{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "content": [
      {
        "type": "text",
        "text": ""
      }
    ]
  }
}

The response is structurally valid — it has the right shape, the right fields, the right types. But the text field is empty. The AI client receives this, treats it as a successful tool call, and proceeds with no data. No error is logged anywhere.

Common causes

  • Swallowed exceptions: A try/catch block catches an error and returns a default empty value instead of propagating the failure.
// This pattern silently hides failures
async function searchDocuments(query: string) {
  try {
    return await db.search(query);
  } catch (error) {
    return []; // Caller has no idea this failed
  }
}
  • Null upstream responses: An external API returns null or undefined, and the server passes it through without validation.
  • Partial data from pagination: A tool fetches only the first page of results but does not indicate that more pages exist, returning a truncated dataset.
  • Race conditions: An async operation completes before its data dependency is ready, returning stale or empty state.
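The pagination case is straightforward to fix once you decide to surface it: either fetch every page or tell the caller explicitly that the result is truncated. A sketch — `fetchPage` stands in for whatever your data source looks like:

```typescript
// Hypothetical sketch: surface pagination state instead of silently
// returning a truncated first page. `fetchPage` stands in for your data source.
interface Page<T> {
  items: T[];
  nextCursor?: string;
}

async function fetchAllOrFlag<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>,
  maxPages = 3,
): Promise<{ items: T[]; truncated: boolean }> {
  const items: T[] = [];
  let cursor: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(cursor);
    items.push(...page.items);
    cursor = page.nextCursor;
    if (!cursor) return { items, truncated: false };
  }
  // More pages exist — say so explicitly instead of returning partial data
  return { items, truncated: true };
}
```

Including the `truncated` flag in the tool response lets the AI client decide whether to request more data, rather than silently reasoning over an incomplete dataset.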

How to debug

Add assertions on your tool responses before returning them:

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const result = await handleToolCall(request);

  // Assert the response is not empty
  if (!result || (Array.isArray(result) && result.length === 0)) {
    console.warn(
      `[Warning] Tool "${request.params.name}" returned empty result`,
      { args: request.params.arguments }
    );
  }

  // Assert the response shape matches expectations
  if (result?.content?.[0]?.text === '') {
    console.warn(
      `[Warning] Tool "${request.params.name}" returned empty text content`
    );
  }

  return result;
});

Replace silent catch blocks with explicit error responses:

async function searchDocuments(query: string) {
  try {
    return await db.search(query);
  } catch (error) {
    // Don't swallow — return an error the client can see
    throw new McpError(
      ErrorCode.InternalError,
      `Search failed: ${error.message}`
    );
  }
}

How MCPWatch helps

MCPWatch analytics track response characteristics over time — including response payload size, content length, and result counts. When a tool that normally returns 500-byte responses suddenly starts returning 0-byte responses, that pattern is visible in MCPWatch’s throughput and response size charts even though no errors are being reported. You can set up alerts on anomalous response patterns: a sudden drop in average response size or a spike in empty responses triggers a notification before users even report an issue. The trace detail view also captures the full response body, so you can inspect what the server actually returned and compare it against historical responses for the same tool. This turns “it seems like the tool stopped working” into “the tool started returning empty results at 3:47 PM, correlated with a deployment at 3:45 PM.”

Wrapping up

MCP server failures range from obvious (connection timeouts with stack traces) to nearly invisible (silently empty responses). The common thread is that debugging any of them requires visibility into what happened inside the server at the time of the failure — the raw request, the server’s internal operations, and the exact response that was sent back.

Adding observability early, before you hit these issues in production, transforms debugging from “reproduce the problem locally and add console.log statements” to “open the trace and look at what happened.” Whether you use MCPWatch or build your own instrumentation, the investment pays off the first time you catch a silent failure that would have otherwise gone unnoticed.

This article was written with the assistance of AI and reviewed by the MCPWatch team.
