Step-by-Step: How to Expose a Skill as an MCP Server
Step-by-step: annotate a Python function, test with MCP Inspector, containerize it, and register in Claude Desktop, Cursor, and VS Code.
Abstract Algorithms
TLDR: Turn any Python function into a multi-client MCP server in 11 steps, from annotation to Docker.
The Copy-Paste Problem: Why Skills Die at IDE Boundaries
A developer pastes their summarize_pr_diff function into a Slack message because their teammate uses Cursor and can't call a Copilot skill. The function works perfectly. The sharing mechanism is broken. By the end of this post, that same function runs as an MCP server – callable from Cursor, Claude Desktop, and VS Code Copilot simultaneously, with no copy-paste required.
If you have ever written an LLM-powered function that worked exactly as intended in one tool and then had to manually explain it, copy-paste it, or rewrite it for a colleague on a different IDE, you already understand the problem this post solves. The tool is not broken. The distribution model is.
The Model Context Protocol (MCP) is an open standard that defines a single wire format for exposing Python functions as callable tools to any MCP-aware AI client. Once your function is wrapped as an MCP server, it becomes simultaneously available to Cursor, Claude Desktop, GitHub Copilot in VS Code agent mode, and any other compliant client – without any per-client rewriting.
This post is the practical companion to Headless Agents: How to Deploy Your Skills as an MCP Server. That post explains the why and what of MCP: the three-layer architecture, the stdio vs. HTTP transport decision guide, and the conceptual model for headless skill deployment. This post covers the how in full numbered detail – every command, every config file, every failure mode – starting from a single Python function and ending with a server visible across three clients.
Before You Start: What You Need and How MCP Registration Works
Before running a single command, it helps to understand what the end state looks like so each step has a clear purpose.
What you need installed:
- Python 3.11+ with pip
- Docker Desktop (for Step 7)
- Claude Desktop, Cursor, or VS Code with GitHub Copilot (at least one to test registration)
- curl for HTTP transport testing
What "registered" means in practice: Each MCP client maintains a config file – a JSON file in a platform-specific directory – that maps a server name to either a command to spawn (for stdio transport) or a URL to connect to (for HTTP+SSE transport). When you open the client, it reads this config, attempts to start or connect to every listed server, and calls tools/list to discover available tools. If your server is running and returns a valid schema, the tools appear in the client's tool picker within seconds. If anything in the chain fails silently, the tools simply don't appear – no error dialog, no log by default. That silence is why Step 4 (MCP Inspector) and Step 11 (debugging) are so important.
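Concretely, the discovery exchange is a pair of JSON-RPC messages. A sketch of what crosses the wire (fields abbreviated; the exact envelope depends on the protocol revision), first the client's request:

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
```

and then the server's reply, which is exactly the schema the client caches:

```json
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [{"name": "summarize_pr_diff", "description": "Summarize a GitHub PR diff...", "inputSchema": {"type": "object", "required": ["diff"]}}]}}
```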
The two transports at a glance:
| Transport | How the client connects | Ideal for |
| stdio | Spawns your script as a child process | Local dev, single developer, same machine |
| HTTP + SSE | Connects to a running HTTP server | Shared team use, Docker, cloud deployment |
For a full transport decision guide, see Headless Agents. This post shows you how to implement both and choose at registration time.
Steps 1–4: From Python Function to Locally Tested MCP Server
These four steps take you from an empty directory to a fully verified local MCP server. Every subsequent step builds on this foundation.
Step 1 – Set Up the Project Structure
Create the directory layout and install dependencies:
mkdir mcp-pr-summarizer && cd mcp-pr-summarizer
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install "mcp>=1.0" fastmcp openai
Your directory should look like this:
mcp-pr-summarizer/
├── server.py
├── pyproject.toml
└── Dockerfile
The pyproject.toml declares the package and its runtime dependencies:
# pyproject.toml
[project]
name = "pr-summarizer-mcp"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = ["mcp>=1.0", "fastmcp", "openai"]
FastMCP vs. bare mcp.server: FastMCP is a thin decorator layer over the official MCP Python SDK. With FastMCP, you write @app.tool() and the schema is inferred from your function signature. With the bare SDK, you write @server.list_tools() and @server.call_tool() separately and provide the JSON Schema manually. This post uses the bare SDK for Steps 2–3 so you see exactly what the wire format looks like, then shows the FastMCP shorthand for reference. See the previous post for a full FastMCP conceptual overview.
Step 2 – Write the Tool Function with the Server Decorators
The Server class from the MCP SDK is the core registry. You declare tools using two decorators: @server.list_tools() for the capability announcement and @server.call_tool() for the dispatcher.
# server.py
import asyncio
import os
from mcp.server import Server
from mcp.server.stdio import stdio_server
import mcp.types as types
from openai import AsyncOpenAI

server = Server("pr-summarizer")
client = AsyncOpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

@server.list_tools()
async def list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="summarize_pr_diff",
            description=(
                "Summarize a GitHub PR diff into a human-readable description. "
                "Returns a structured summary with an overview, list of key changes, "
                "and suggested testing notes."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "diff": {
                        "type": "string",
                        "description": "The raw git diff content from the PR"
                    },
                    "target_audience": {
                        "type": "string",
                        "description": "Who will read this summary",
                        "default": "engineering team"
                    }
                },
                "required": ["diff"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "summarize_pr_diff":
        return await _summarize_pr_diff(
            diff=arguments["diff"],
            target_audience=arguments.get("target_audience", "engineering team")
        )
    raise ValueError(f"Unknown tool: {name}")

async def _summarize_pr_diff(diff: str, target_audience: str) -> list[types.TextContent]:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"You write PR descriptions for a {target_audience}."},
            {"role": "user", "content": f"Summarize this diff:\n\n{diff}"}
        ]
    )
    return [types.TextContent(type="text", text=response.choices[0].message.content)]

async def main():
    async with stdio_server() as streams:
        await server.run(*streams, server.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())
FastMCP shorthand (for comparison): Using FastMCP, the same tool looks like @app.tool() async def summarize_pr_diff(diff: str, target_audience: str = "engineering team") -> str: ... – the schema is inferred automatically from type hints, and there is no separate list_tools / call_tool split. Both approaches produce identical wire output; the bare SDK makes the schema structure explicit, which is useful when you need fine-grained control over descriptions and defaults.
Step 3 – Add Input Validation and Error Handling
Raw exceptions must never propagate out of an MCP handler. Clients interpret unhandled exceptions as protocol errors and may silently drop the tool from their registry. Always raise McpError with a structured ErrorData payload.
# server.py (updated call_tool and helper)
from mcp.shared.exceptions import McpError
from mcp.types import ErrorData, INTERNAL_ERROR, INVALID_PARAMS

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "summarize_pr_diff":
        return await _summarize_pr_diff(
            diff=arguments.get("diff", ""),
            target_audience=arguments.get("target_audience", "engineering team")
        )
    raise McpError(ErrorData(code=INVALID_PARAMS, message=f"Unknown tool: {name}"))

async def _summarize_pr_diff(diff: str, target_audience: str) -> list[types.TextContent]:
    if not diff.strip():
        raise McpError(ErrorData(code=INVALID_PARAMS, message="diff cannot be empty"))
    if len(diff) > 100_000:
        raise McpError(ErrorData(code=INVALID_PARAMS, message="diff exceeds 100 KB limit"))
    try:
        response = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": f"You write PR descriptions for a {target_audience}."},
                {"role": "user", "content": f"Summarize this diff:\n\n{diff}"}
            ]
        )
        return [types.TextContent(type="text", text=response.choices[0].message.content)]
    except Exception as exc:
        raise McpError(ErrorData(code=INTERNAL_ERROR, message=f"LLM call failed: {exc}")) from exc
The pattern here is deliberate: validate inputs first, wrap the LLM call in a try/except, and always raise McpError – never a bare ValueError or RuntimeError. The error code constants (INVALID_PARAMS, INTERNAL_ERROR) are standard JSON-RPC 2.0 error codes that clients know how to surface cleanly.
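Before wiring these checks into the SDK, you can exercise the validation rules in isolation. The sketch below mirrors Step 3's guard clauses with a plain ToolError class standing in for McpError, so it runs without the SDK installed (ToolError and validate_diff are illustrative names, not SDK APIs):

```python
INVALID_PARAMS = -32602  # the standard JSON-RPC error code used above

class ToolError(Exception):
    """Stand-in for McpError so the checks run without the SDK installed."""
    def __init__(self, code: int, message: str):
        super().__init__(message)
        self.code = code
        self.message = message

def validate_diff(diff: str, max_bytes: int = 100_000) -> None:
    # Mirror of the two guard clauses in _summarize_pr_diff
    if not diff.strip():
        raise ToolError(INVALID_PARAMS, "diff cannot be empty")
    if len(diff) > max_bytes:
        raise ToolError(INVALID_PARAMS, "diff exceeds 100 KB limit")

# Empty input must produce a structured error, not a bare ValueError
try:
    validate_diff("   ")
except ToolError as e:
    assert e.code == INVALID_PARAMS

validate_diff("diff --git a/x b/x")  # valid input passes silently
```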
Step 4 – Test Locally with MCP Inspector
The MCP Inspector is a browser-based debugging UI that connects directly to your server over stdio. It shows your tool schema, lets you send test calls, and displays raw request/response JSON.
# Install the MCP CLI tools if not already present
pip install "mcp[cli]"
# Launch Inspector against your server
mcp dev server.py
The command spawns your server as a child process, then opens http://localhost:5173 in your browser. You will see:
- Tools tab – your summarize_pr_diff tool listed with its full input schema
- Call panel – a form pre-populated from the schema; fill in diff and click Run
- Messages tab – raw JSON-RPC traffic between Inspector and your server
If the tool does not appear in the Tools tab, the schema is malformed. The most common cause is a missing "type": "object" at the top level of inputSchema – covered in the Deep Dive section next.
Deep Dive: Tool Schema Design and Why It Breaks Half the Integrations
The MCP Inspector passing your test call does not guarantee every client will work. Claude Desktop, Cursor, and VS Code each have slightly different schema validation behavior. Understanding what the client reads – and what it ignores – prevents silent failures in production.
The Internals: How MCP Clients Read Your Tool Schema
When a client calls tools/list, your server returns a JSON array of tool descriptors. Each descriptor has three fields: name, description, and inputSchema. The inputSchema field is a standard JSON Schema object. Clients use it to:
- Generate the invocation form in the UI (Claude Desktop renders a form from the schema)
- Validate arguments before sending them to your server
- Choose the tool – the LLM reads description when deciding which tool to invoke for a user's intent
The four fields that matter most, in order of impact:
| Field | Where it matters | What breaks without it |
inputSchema.type: "object" | All clients | Tool is rejected or silently skipped by strict parsers |
description (top-level) | LLM tool selection | LLM cannot match user intent to tool; tool is never invoked |
properties[x].description | Claude Desktop form, LLM prompt injection | User sees blank form fields; LLM uses wrong arguments |
required array | All clients | Optional fields treated as required; calls fail with missing param errors |
Why missing descriptions cause silent failures: When the LLM decides which tool to invoke, it reads the tool's description field as part of its context. A blank or vague description (like "summarize") competes poorly against tools with rich descriptions. The tool exists in the registry but is functionally invisible to the model. The fix is always a concrete sentence that describes input, output, and use case – exactly what was shown in Step 2.
The required array is not optional: If you omit required, some clients assume all fields are required. Others assume none are. The resulting behavior is unpredictable across clients. Always declare exactly which fields must be present, even if it is a single field.
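Both rules are mechanical enough to lint automatically. A pre-flight check along these lines (lint_tool_schema is a hypothetical helper written for this post, not part of the SDK) catches the failure modes in the table before any client sees the schema:

```python
def lint_tool_schema(tool: dict) -> list[str]:
    """Return a list of problems with a tool descriptor; empty list means OK."""
    problems = []
    schema = tool.get("inputSchema", {})
    if schema.get("type") != "object":
        problems.append('inputSchema.type must be "object"')
    if not tool.get("description", "").strip():
        problems.append("top-level description is empty")
    props = schema.get("properties", {})
    for name, spec in props.items():
        if not spec.get("description", "").strip():
            problems.append(f"property {name!r} has no description")
    if "required" not in schema:
        problems.append("required array is missing")
    else:
        # Every required field must actually exist in properties
        unknown = set(schema["required"]) - set(props)
        if unknown:
            problems.append(f"required lists unknown fields: {sorted(unknown)}")
    return problems

tool = {
    "name": "summarize_pr_diff",
    "description": "Summarize a GitHub PR diff into a human-readable description.",
    "inputSchema": {
        "type": "object",
        "properties": {"diff": {"type": "string", "description": "Raw git diff"}},
        "required": ["diff"],
    },
}
assert lint_tool_schema(tool) == []
```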
Performance Analysis: Cold Starts, Schema Overhead, and SSE Connection Pooling
Understanding the performance characteristics of each transport helps you choose the right one before you hit problems under real load.
Cold start times by transport:
| Transport | Cold start latency | Per-call overhead | Max concurrent clients |
| stdio | 150–400 ms (Python process spawn + import) | ~0.1 ms (IPC) | 1 per spawning client |
| HTTP + SSE | 5–30 ms (HTTP connect to running process) | ~1–5 ms (TCP + headers) | Hundreds (asyncio event loop) |
| Docker + SSE | 500–2000 ms (first container start) | Same as HTTP + SSE | Same as HTTP + SSE |
Schema parsing overhead is negligible in practice: the tools/list response is typically under 2 KB even for servers with ten tools. Schema parsing completes in under 1 ms on every client tested. Optimize your tool handler, not your schema.
SSE connection pooling: HTTP+SSE maintains a persistent connection per client session. If you run behind a reverse proxy (nginx, Caddy), configure proxy_read_timeout to at least 300 seconds to prevent the proxy from closing idle SSE connections during long LLM calls. A closed SSE connection looks like a normal disconnect to the client – it will silently retry, but mid-call reconnects lose in-flight responses.
The dominant cost in any MCP server is always the tool handler itself. A gpt-4o-mini call takes 1–4 seconds. Optimizing transport overhead is like optimizing the envelope on a letter while the postal system takes three days. Focus on caching repeated LLM calls (functools.lru_cache for deterministic inputs, Redis for shared state) and using async HTTP clients everywhere.
The Registration Journey: From Local Dev to Three Live Clients
The diagram below shows the complete path from a working Python function to a tool registered across all three clients. Each arrow represents a concrete step in this post.
flowchart TD
    A["Write tool function\n(Steps 1–2)"] --> B["Add error handling\n(Step 3)"]
    B --> C["Test with MCP Inspector\n(Step 4)"]
    C --> D{Which transport?}
    D -->|"Local / single dev"| E["Register via stdio\n(Steps 8, 9, 10)"]
    D -->|"Shared / remote"| F["Switch to HTTP+SSE\n(Step 5)"]
    F --> G["Add bearer token auth\n(Step 6)"]
    G --> H["Package as Docker container\n(Step 7)"]
    H --> I["docker run -p 8080:8080"]
    I --> J["Register via SSE URL\n(Steps 8, 9, 10)"]
    E --> K["Claude Desktop sees tool"]
    J --> K
    K --> L["Cursor sees tool"]
    L --> M["VS Code Copilot sees tool"]
    M --> N["Debug failures\n(Step 11)"]
The decision diamond at the centre is the key branch: stdio registration skips the Docker steps entirely and goes straight to client config files. SSE registration requires the running server (Steps 5–7) before the config files can point anywhere useful.
Real-World Applications: How Teams Are Sharing Skills Across IDEs Today
The platform-agnostic code review assistant. A mid-size engineering team has developers split across Cursor, Claude Desktop, and VS Code. They built a single MCP server that wraps three tools: summarize_pr_diff (this post's example), lint_findings_summary (summarizes ESLint/Flake8 output), and test_coverage_report (describes coverage gaps in plain English). The server runs as a Railway-hosted Docker container. Each developer's config file points to the same SSE URL. Whether a developer uses Cursor's inline chat or Claude Desktop's sidebar, they call the same tools against the same backend – no per-tool configuration drift, no "works on my machine" breakdowns.
The private codebase search skill. A fintech team cannot send their internal codebase to external LLM APIs for semantic search. They run a local MCP server with a search_codebase tool that queries an internal Elasticsearch index. The server uses stdio transport so it never listens on a network port – OS process isolation is the security boundary. Each developer has the stdio config entry in their Cursor workspace config, and the tool is available in Copilot's agent mode within their VS Code. The skill runs entirely on the developer's machine and never touches an external network.
The CI/CD summary bot. A DevOps team registered an MCP server in their GitHub Actions environment that calls summarize_deployment_diff to generate a plain-English deployment summary at the end of each pipeline run. The server is invoked by a Copilot-powered step in the workflow YAML. The output is posted as a PR comment. There is no human in the loop – the same MCP server that developers use interactively from their IDEs is also called headlessly in CI. One registration, two usage modes.
Trade-offs and Failure Modes: What Breaks When You Register Across Clients
Every cross-client MCP deployment surfaces failure modes that do not appear in local testing. The table below covers the six most common, with the exact symptom you will see in each client, the underlying cause, and the fix.
| Failure | Symptom | Root Cause | Fix |
| Tool not appearing | Tool absent from client UI after restart | Malformed inputSchema (missing "type": "object") or server crash on startup | Run mcp dev server.py in Inspector; check Tools tab |
| Schema mismatch | "Missing required parameter" error on every call | required array lists a field that your handler treats as optional | Align required array with handler defaults; test with Inspector |
| Connection refused | "Failed to connect to MCP server" in Cursor/Claude | SSE server not running when client starts, or wrong port in config | Confirm docker run is active; verify port matches config URL |
| Auth 401 | Tool call returns "Unauthorized" or silent empty response | Bearer token in config does not match MCP_AUTH_TOKEN env var | Re-check token in config headers vs. server env; tokens are case-sensitive |
| Silent schema truncation | Tool appears but description is empty in UI | description field was null or omitted in list_tools return | Add a non-empty string to description in the Tool constructor |
| Stale tool list after update | Old tool signature still showing after server update | Client cached the capability manifest from the previous session | Restart the client (not just reload); some clients cache tools/list aggressively |
The most dangerous failure is the last one. Claude Desktop and Cursor both cache tool manifests across sessions. If you update your tool's inputSchema (add a parameter, change a description), restart the entire client application, not just the MCP connection. A running server with a new schema next to a cached old manifest causes unpredictable argument-passing behaviour that is very hard to trace.
Decision Guide: stdio for Local, SSE for Shared, Container for Team
| Situation | Recommendation |
| Use stdio when | The tool is for your own local use, runs on the same machine as the client, and you need zero infrastructure setup. Configuration is a single JSON entry; no ports, no auth. |
| Use HTTP+SSE when | Multiple developers need the same tool, or the server must run on a remote host. SSE supports hundreds of concurrent clients and persistent streaming responses. |
| Containerize when | The server needs to be available outside business hours, deployed to a shared environment, or reproduced identically across dev/staging/prod. Docker eliminates Python version and dependency drift. |
| Avoid SSE without auth when | The server is exposed on any network interface beyond localhost. An unauthenticated MCP server on a shared LAN is a code-execution endpoint. |
| Avoid serverless (Lambda/Cloud Run) when | The tool has significant warm-up cost (model loading, connection pool establishment), or your SSE sessions last more than 15 minutes (Lambda's max execution timeout). |
| Use both transports in parallel when | You want local stdio for personal fast iteration and a shared SSE container for the team. The same server.py supports both – switch at startup via an environment variable. |
The simplest production pattern for a small team: one Railway or Fly.io container running SSE on port 8080, bearer token authentication via environment variable, and a single shared config snippet that each developer pastes into their client config file. Total infrastructure cost: one small container at ~$5/month.
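The dual-transport switch from the decision table needs only a small branch in the entrypoint. A sketch, where MCP_TRANSPORT is an environment variable name chosen for this post (not an SDK convention) and run_stdio / run_sse are stand-ins for the two main() variants from Steps 2 and 5:

```python
import os

def run_stdio() -> str:
    # Stand-in for asyncio.run(main()) from Step 2
    return "stdio"

def run_sse() -> str:
    # Stand-in for uvicorn.run(starlette_app, host="0.0.0.0", port=8080) from Step 5
    return "sse"

def choose_transport(env) -> str:
    """Pick a transport at startup; defaults to stdio for local use."""
    transport = env.get("MCP_TRANSPORT", "stdio").lower()
    if transport not in ("stdio", "sse"):
        raise ValueError(f"unknown MCP_TRANSPORT: {transport!r}")
    return transport

entry = run_sse if choose_transport(os.environ) == "sse" else run_stdio
# entry() would start the chosen server loop
```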
Practical Walkthrough: Registering the PR Summarizer in Claude Desktop, Cursor, and VS Code
This section covers Steps 5–11: switching transports, adding auth, packaging in Docker, and writing the exact config file entries for each client.
Step 5 – Switch to HTTP+SSE Transport
Replace the stdio_server entrypoint with SSE transport. The tool handlers are unchanged – only the main() function changes:
# server.py – updated main() for SSE
import uvicorn
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Mount, Route

sse = SseServerTransport("/messages/")

async def handle_sse(request):
    async with sse.connect_sse(request.scope, request.receive, request._send) as streams:
        await server.run(*streams, server.create_initialization_options())

starlette_app = Starlette(routes=[
    Route("/sse", endpoint=handle_sse),
    # Without this mount, clients can open /sse but have nowhere to POST messages
    Mount("/messages/", app=sse.handle_post_message),
])

if __name__ == "__main__":
    uvicorn.run(starlette_app, host="0.0.0.0", port=8080)
Test it immediately with curl before adding auth:
python server.py &
curl -N http://localhost:8080/sse
# Should emit: data: {"type":"endpoint","uri":"/messages/?session_id=..."}
Step 6 – Add Bearer Token Authentication
Wrap the SSE route with a simple Starlette middleware that checks the Authorization header:
# server.py – auth middleware
import os
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
from starlette.routing import Mount, Route

MCP_AUTH_TOKEN = os.environ.get("MCP_AUTH_TOKEN", "")

class BearerTokenMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        if not MCP_AUTH_TOKEN:
            # No token configured: allow all requests (local dev only)
            return await call_next(request)
        auth = request.headers.get("Authorization", "")
        if auth != f"Bearer {MCP_AUTH_TOKEN}":
            return JSONResponse({"error": "Unauthorized"}, status_code=401)
        return await call_next(request)

starlette_app = Starlette(
    routes=[
        Route("/sse", endpoint=handle_sse),
        Mount("/messages/", app=sse.handle_post_message),
    ],
    middleware=[Middleware(BearerTokenMiddleware)]
)
Set MCP_AUTH_TOKEN in your environment before starting the server:
export MCP_AUTH_TOKEN="my-secret-token"
python server.py
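Any sufficiently random string works as the token; the stdlib secrets module generates one safely:

```python
import secrets

# Generate a URL-safe token suitable for MCP_AUTH_TOKEN
token = secrets.token_urlsafe(32)
print(f'export MCP_AUTH_TOKEN="{token}"')
```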
Step 7 – Package as a Docker Container
Use a multi-stage build to keep the image small. The first stage installs dependencies; the second stage copies only the runtime artifacts:
# Dockerfile
FROM python:3.12-slim AS builder
WORKDIR /app
COPY pyproject.toml .
RUN pip install --no-cache-dir "mcp>=1.0" fastmcp openai uvicorn starlette
FROM python:3.12-slim AS runtime
WORKDIR /app
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY server.py .
EXPOSE 8080
CMD ["python", "server.py"]
Build and run:
docker build -t pr-summarizer-mcp .
docker run -d -p 8080:8080 \
-e OPENAI_API_KEY=sk-... \
-e MCP_AUTH_TOKEN=secret \
pr-summarizer-mcp
Verify the container is responding before configuring clients:
curl -N -H "Authorization: Bearer secret" http://localhost:8080/sse
Step 8 – Register in Claude Desktop
Claude Desktop reads its server registry from a JSON file in the OS application support directory.
File location:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
For stdio (local script):
{
  "mcpServers": {
    "pr-summarizer": {
      "command": "python",
      "args": ["/path/to/mcp-pr-summarizer/server.py"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
For SSE (Docker container or remote):
{
  "mcpServers": {
    "pr-summarizer-remote": {
      "url": "http://localhost:8080/sse",
      "headers": {
        "Authorization": "Bearer secret"
      }
    }
  }
}
After saving the file, fully quit and reopen Claude Desktop (Cmd+Q on Mac, not just close the window). The tool should appear in the tool picker within the first new conversation.
Step 9 – Register in Cursor
Cursor reads MCP config from .cursor/mcp.json in your home directory (global) or in your project root (workspace-scoped):
{
  "mcpServers": {
    "pr-summarizer": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/mcp-pr-summarizer",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
For the SSE variant, use the same url / headers format as Claude Desktop. Cursor respects both formats. Reload Cursor's window after saving (Cmd+Shift+P → "Developer: Reload Window").
Step 10 – Register in VS Code / GitHub Copilot Agent Mode
VS Code reads MCP config from .vscode/mcp.json in your workspace root. Note the slightly different schema: VS Code uses "type": "stdio" as an explicit discriminator field, and supports ${workspaceFolder} variable substitution in paths:
{
  "servers": {
    "pr-summarizer": {
      "type": "stdio",
      "command": "python",
      "args": ["server.py"],
      "cwd": "${workspaceFolder}/mcp-pr-summarizer",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
For SSE:
{
  "servers": {
    "pr-summarizer-remote": {
      "type": "sse",
      "url": "http://localhost:8080/sse",
      "headers": {
        "Authorization": "Bearer secret"
      }
    }
  }
}
The tool becomes available in Copilot's agent mode. In the VS Code chat panel, open agent mode and check the tools picker – Copilot lists summarize_pr_diff among its available tools.
Step 11 – Debug Common Failures
When a tool does not appear after registration, work through this checklist in order:
1. Check server startup: Run python server.py manually in a terminal. Any import error or missing env variable will be visible immediately.
2. Run MCP Inspector: mcp dev server.py. Confirm the tool appears in the Tools tab before touching client configs.
3. Check the config file path: Claude Desktop will silently ignore a misplaced config file. Use the exact path for your OS.
4. Check JSON syntax: A single misplaced comma in the config JSON will cause the entire registry to fail silently. Use a JSON linter.
5. Restart the client fully: Not reload – fully quit and reopen. Claude Desktop especially caches manifests.
6. Check SSE reachability: If using SSE, run curl -N <url> from the same machine as the client before blaming the config.
MCP Inspector: The Debugging Tool You'll Use Every Day
The MCP Inspector (mcp dev) is the single most useful tool in the MCP development workflow. It is open-source, ships with the mcp[cli] package, and runs entirely locally – no cloud account required.
What the Inspector UI shows:
- Tools tab: Every tool your server advertises via list_tools, with the full JSON Schema rendered as a human-readable form. If a tool is missing here, the client will never see it.
- Call panel: A form pre-filled from the schema. Submitting it sends a real tools/call JSON-RPC request to your server and displays the raw response. This is the fastest way to confirm your error handling works correctly – send an empty diff and verify that an McpError comes back, not a stack trace.
- Messages tab: Full JSON-RPC traffic log for the session. When a client call fails mysteriously, paste the raw request/response from this tab into your debugging notes.
- Resources tab and Prompts tab: If your server exposes resources or prompt templates, they appear here for the same interactive testing.
Workflow tip: Keep Inspector open in one browser tab while you edit server.py. The mcp dev process auto-reloads on file save (hot-reload support was added in MCP SDK 1.2). You can iterate on schema descriptions and error messages without restarting the command.
The Inspector does not test client-specific behaviour – it uses a canonical MCP client implementation. If a tool works in Inspector but fails in Claude Desktop, the issue is almost always the client config file (wrong path, wrong JSON key, wrong transport type) rather than the server itself.
Lessons Learned
After walking through eleven steps across three clients, here are the non-obvious lessons that save the most debugging time:
1. Write the docstring before the implementation. The tool description is the single most important field in your schema – not for humans, but for the LLM that routes calls to your tool. A vague description like "summarizes diffs" competes poorly against "Summarize a GitHub PR diff into a human-readable description with overview, key changes, and testing notes". Write the description first, run it through Inspector's simulated tool selection, then implement the handler.
2. Never let raw exceptions leave a handler. An unhandled exception in call_tool() causes some clients to mark the tool as failed and stop calling it for the session. Always wrap in McpError. This is not defensive programming – it is the MCP contract.
3. Test both transports before shipping. A tool that works perfectly over stdio may fail over SSE if it reads from stdin or relies on environment variables that the Docker container does not have. Run mcp dev server.py for stdio, then docker run for SSE, before registering in client configs.
4. Restart clients after schema changes, not just servers. Claude Desktop and Cursor cache the tools/list response. If you add a parameter to a tool and only restart the server, the client will call the old schema for the rest of the session. Always do a full client restart after any schema change.
5. The required array is your API contract. Treat it with the same discipline as a public REST API. Once a client caches your schema, removing a field from required is backwards-compatible. Adding a field to required is a breaking change that will break any client that cached the old schema.
6. Use environment variables for all secrets – never hardcode. The stdio config files (claude_desktop_config.json, .cursor/mcp.json) are checked into version control by some teams. An OPENAI_API_KEY hardcoded in the env block of a JSON config that lands in a public repo is a costly mistake. Use a .env file loaded by the server process, and document the required variables in your README.
TLDR: Summary and Key Takeaways
TLDR: Turn any Python function into a multi-client MCP server in 11 steps, from annotation to Docker.
- The pattern is always the same: annotate → validate → test with Inspector → transport → auth → Docker → register. Every MCP server follows these eleven steps regardless of what the tool does.
- Tool schema is your public API. The description and inputSchema fields are what MCP clients and LLMs read to discover and invoke your tool. Incomplete schemas cause silent failures, not loud errors.
- stdio and SSE are two faces of the same server. The tool handlers are identical. Only main() changes. Choose the transport at deployment time, not at development time.
- MCP Inspector (mcp dev) is the first line of defense. If a tool works in Inspector, client-specific failures are almost always config file issues – wrong path, wrong JSON key, wrong URL.
- Always McpError, never bare exceptions. Unhandled exceptions in tool handlers cause clients to silently blacklist tools for the session.
- Restart clients fully after schema changes. Claude Desktop and Cursor both cache tools/list responses across sessions.
- One Docker container, three clients. The same SSE container registered via url + headers in claude_desktop_config.json, .cursor/mcp.json, and .vscode/mcp.json gives every developer on your team access to the same skill simultaneously.
Practice Quiz
Test your understanding of the MCP server deployment workflow.
You run mcp dev server.py and your tool appears in the Inspector's Tools tab. You then add it to Claude Desktop's config file, fully restart the app, and the tool does not appear. What is the most likely cause?
a) The MCP Inspector cached the tool schema
b) The inputSchema has a top-level "type": "object" field
c) The config file path is wrong for your OS
d) The tool description is too long
Correct Answer: c – Claude Desktop silently ignores a config file at the wrong path. The Inspector passing confirms the server is valid; the client not seeing it almost always points to a configuration file issue.
Your `summarize_pr_diff` tool works perfectly over stdio but returns a 401 error when called over SSE from Cursor. What should you check first?

a) Whether `OPENAI_API_KEY` is set in the Docker container
b) Whether the `Authorization` header in `.cursor/mcp.json` matches `MCP_AUTH_TOKEN` in the server
c) Whether the SSE port is correct
d) Whether the tool `description` matches the call intent

Correct Answer: b. A 401 specifically means the auth header was sent but did not match the server's expected token. Port issues produce "Connection refused", not 401.
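The mismatch in answer (b) lives in two places. A hypothetical `.cursor/mcp.json` entry for an SSE server might look like the following; the server name, URL, and token value are illustrative, and the `Bearer` value must equal the `MCP_AUTH_TOKEN` the server process was started with:

```json
{
  "mcpServers": {
    "pr-tools": {
      "url": "http://localhost:8000/sse",
      "headers": {
        "Authorization": "Bearer change-me-shared-token"
      }
    }
  }
}
```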
You update your tool to add a new required parameter and restart only the server (not the client). A teammate reports the tool is behaving strangely with unexpected-argument errors. Why?

a) The new parameter conflicts with a reserved MCP field name
b) The `McpError` code for missing params is wrong
c) The client cached the old tool schema and is still calling with the old argument set
d) FastMCP does not support required parameter additions

Correct Answer: c. MCP clients cache the `tools/list` response. A schema change on the server is not visible to the client until the client is fully restarted and performs a fresh `tools/list` call.

You want your MCP server to be available to ten developers simultaneously, with persistent state between calls and a shared LLM call cache. Which deployment approach should you use?
a) stdio, one instance per developer
b) HTTP+SSE in a Docker container with Redis-backed caching
c) stdio, with a shared Unix socket
d) Serverless (AWS Lambda) with SSE transport

Correct Answer: b. stdio spawns one process per client with no shared state. Lambda does not support persistent SSE connections for multi-minute LLM calls. HTTP+SSE in a container with Redis is the correct pattern for shared, stateful, multi-client access.
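Answer (b) above corresponds to one long-running container shared by every client. A minimal, hypothetical `docker-compose.yml` sketch of that pattern (the image, port, environment variable names, and Redis wiring are assumptions, not from the post):

```yaml
services:
  mcp-server:
    build: .                  # image containing the SSE MCP server
    ports:
      - "8000:8000"           # one SSE endpoint, many clients
    environment:
      MCP_AUTH_TOKEN: change-me-shared-token
      REDIS_URL: redis://redis:6379/0   # shared LLM-call cache
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
```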
(Open-ended; no single correct answer) You are building a `review_pull_request` MCP tool that calls three LLM APIs sequentially: one for diff summarization, one for security analysis, and one for test coverage review. The combined latency is 8-12 seconds. How would you design the tool's error handling and response strategy to give the client the best experience during that wait? Consider `McpError` codes, streaming vs. batch responses, partial results, and what happens if the second LLM call fails after the first succeeds. This is a design challenge: describe your approach.
Related Posts
- Headless Agents: How to Deploy Your Skills as an MCP Server. The conceptual companion to this post: why MCP exists, the three-layer architecture, and the stdio vs. SSE transport decision guide.
- Skills vs. LangChain, LangGraph, MCP, and Tools. How MCP tools compare to LangChain tool definitions, LangGraph node actions, and OpenAI function calling schemas.
- LLM Skill Registry, Routing, and Evaluation for Production Agents. How to manage, version, and route across a library of MCP skills in a production multi-agent system.

Written by
Abstract Algorithms
@abstractalgorithms