Model Context Protocol (MCP)
Goal: understand MCP as the standard way to expose and consume tools — so a set of tools (and data sources) can live behind a server and be reused across many apps and models, instead of being hand-wired into one program.
Where this fits: Sections 13–14 taught raw tool calling; Sections 15–16 isolated tool execution to make it safe. MCP is the layer on top: a common protocol for connecting models to tools and data. Think of it as a standard port (often described as “USB-C for AI”) rather than a new capability.
Reminder — the model bridge needs tool calling. The server and client we build run with no endpoint at all. Only the last step — letting the model drive the tools — needs your endpoint with tool calling enabled (vLLM auto tool choice), exactly as in Sections 13–14.
What MCP actually is
In Section 13 you wrote a tool schema, sent it to the model, and ran the matching Python function yourself. Everything lived in one program. MCP asks a simple question: what if the tools lived somewhere else — behind a server — so any app or model could discover and use them without re-wiring them by hand?
That’s the whole idea. MCP has three parts:
- a server that owns some tools (and data sources) and advertises them,
- a client that connects to the server and relays calls, and
- a transport between them — here, stdio: the client launches the server as a subprocess and they exchange messages over stdin/stdout.
The messages are JSON-RPC 2.0, one JSON object per line. There are only three you need
to start. First the client says hello with initialize:
{"jsonrpc": "2.0", "id": 1, "method": "initialize",
 "params": {"protocolVersion": "2024-11-05"}}
Then it asks what tools exist with tools/list — and the answer is exactly the kind of
tool schema you already wrote in Section 13:
{"jsonrpc": "2.0", "id": 2, "result": {"tools": [
  {"name": "calculate",
   "description": "Evaluate an arithmetic expression and return the number.",
   "inputSchema": {"type": "object",
                   "properties": {"expression": {"type": "string"}},
                   "required": ["expression"]}}]}}
And it runs one with tools/call — which is just “run the matching function,” the work you
did by hand in Section 13:
{"jsonrpc": "2.0", "id": 3, "method": "tools/call",
 "params": {"name": "calculate", "arguments": {"expression": "(12 + 5) * 3"}}}
So MCP isn’t a new capability. tools/list ≈ the schemas you send the model; tools/call
≈ running the function. MCP just puts a standard envelope around them so the tools can live
behind a server and be reused.
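To make the "one JSON object per line" framing concrete, here is a minimal sketch of writing one message and telling the message kinds apart. The helper names encode and classify are hypothetical (they are not part of MCP or any SDK); the rules they apply come from JSON-RPC 2.0, where a message with a method and an id is a request, one with a method but no id is a notification, and one with a result or error is a response.

```python
import json

def encode(message: dict) -> str:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload never spans lines.
    return json.dumps(message) + "\n"

def classify(message: dict) -> str:
    """Label a decoded JSON-RPC 2.0 message (hypothetical helper)."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"

line = encode({"jsonrpc": "2.0", "id": 3, "method": "tools/call",
               "params": {"name": "calculate",
                          "arguments": {"expression": "(12 + 5) * 3"}}})
print(line.count("\n"))             # 1
print(classify(json.loads(line)))   # request
```

Because each message is exactly one line, the receiving side can frame messages with nothing more than a readline call.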
Build a tiny MCP server
Let’s make this concrete by hand-rolling a server — no library, just stdlib. Create
work/mcp_server.py. It exposes two tools: the Section 13/14 calculate, and a small
doc_search over an in-memory dict.
import ast, json, operator, sys

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
        ast.Div: operator.truediv, ast.Pow: operator.pow, ast.USub: operator.neg}

def calculate(expression: str) -> str:
    def ev(n):
        if isinstance(n, ast.Constant) and isinstance(n.value, (int, float)): return n.value
        if isinstance(n, ast.BinOp) and type(n.op) in _OPS: return _OPS[type(n.op)](ev(n.left), ev(n.right))
        if isinstance(n, ast.UnaryOp) and type(n.op) in _OPS: return _OPS[type(n.op)](ev(n.operand))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expression, mode="eval").body))
A trivial second tool, plus the registry the server advertises — each entry pairs the function with the schema a client (or model) will see:
_DOCS = {"mcp": "MCP connects models to tools and data.",
         "sandbox": "A sandbox isolates untrusted code from the host."}

def doc_search(query: str) -> str:
    for key, text in _DOCS.items():
        if key in query.lower():
            return text
    return "no matching document"

TOOLS = {
    "calculate": (calculate, {
        "name": "calculate",
        "description": "Evaluate an arithmetic expression and return the number.",
        "inputSchema": {"type": "object",
                        "properties": {"expression": {"type": "string"}},
                        "required": ["expression"]}}),
    "doc_search": (doc_search, {
        "name": "doc_search",
        "description": "Look up a short definition for a topic.",
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]}}),
}
Now the heart: turn one JSON-RPC request into a response. This is just a dispatch on
method — the three we saw above:
def handle(request):
    method, req_id = request.get("method"), request.get("id")
    if req_id is None:                 # a notification (no id) -- nothing to reply
        return None

    def ok(result): return {"jsonrpc": "2.0", "id": req_id, "result": result}
    def err(code, msg): return {"jsonrpc": "2.0", "id": req_id,
                                "error": {"code": code, "message": msg}}

    if method == "initialize":
        return ok({"protocolVersion": "2024-11-05",
                   "serverInfo": {"name": "course-mcp", "version": "0.1"},
                   "capabilities": {"tools": {}}})
    if method == "tools/list":
        return ok({"tools": [schema for _, schema in TOOLS.values()]})
    if method == "tools/call":
        params = request.get("params")
        if not isinstance(params, dict):
            return err(-32602, "params must be an object")
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return err(-32602, f"unknown tool: {params.get('name')}")
        try:                           # a tool error is a result, not a crashed server
            text = entry[0](**(params.get("arguments") or {}))
        except Exception as e:
            return ok({"content": [{"type": "text", "text": f"error: {e}"}], "isError": True})
        return ok({"content": [{"type": "text", "text": str(text)}]})
    return err(-32601, f"method not found: {method}")
Finally the transport loop — read a line, dispatch, write a line:
def main():
    for line in sys.stdin:             # one JSON object per line; EOF ends the loop
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(request, dict):
            continue
        response = handle(request)
        if response is not None:
            sys.stdout.write(json.dumps(response) + "\n")
            sys.stdout.flush()

if __name__ == "__main__":
    main()
It needs no model and no endpoint. Run it directly with empty stdin and it exits cleanly at EOF:
printf '' | python work/mcp_server.py # clean exit; it's meant to be spawned
What we left out. This is tools-only MCP, kept minimal to show the shape. Real MCP also exposes resources and prompts, negotiates the protocol version, and returns proper JSON-RPC parse errors; our server silently skips malformed lines and treats a missing id as a notification. That is enough to be honest about what MCP is, but it is not a spec-complete implementation. (Reference: examples/17/mcp_server.py.)
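For comparison, here is a rough sketch of what "proper JSON-RPC parse errors" could look like in place of the silent skip. JSON-RPC 2.0 reserves code -32700 for a parse error and -32600 for an invalid request, and uses a null id when the request's id cannot be determined. The function name decode_or_error is hypothetical, and this sketch deliberately ignores JSON-RPC batches (arrays of requests), treating any non-object as invalid.

```python
import json

def decode_or_error(line: str):
    """Return (request, None) on success or (None, error_response) on failure."""
    def error(code, message):
        # id is None (JSON null): a malformed message has no usable id.
        return {"jsonrpc": "2.0", "id": None,
                "error": {"code": code, "message": message}}
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return None, error(-32700, "Parse error")
    if not isinstance(request, dict):      # batches are out of scope for this sketch
        return None, error(-32600, "Invalid Request")
    return request, None

print(decode_or_error('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}')[1])  # None
print(decode_or_error("{not json")[1]["error"]["code"])                           # -32700
print(decode_or_error("[1, 2]")[1]["error"]["code"])                              # -32600
```

In the transport loop, the error response would be written back to stdout the same way a normal response is, so the client learns why it got no result.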
Drive it with a client
A client launches the server as a subprocess and talks JSON-RPC to it. Create
work/mcp_client.py — the “consume an MCP server” half, still with no model in sight.
import json, subprocess, sys
from pathlib import Path

SERVER = str(Path(__file__).resolve().parent / "mcp_server.py")

class MCPClient:
    def __init__(self, server_path):
        self.proc = subprocess.Popen([sys.executable, server_path],
                                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        self._id = 0

    def call(self, method, params=None):
        self._id += 1
        request = {"jsonrpc": "2.0", "id": self._id, "method": method}
        if params is not None:
            request["params"] = params
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        self.proc.stdin.close()
        self.proc.wait(timeout=5)
Now use it — initialize, list, call:
def main():
    client = MCPClient(SERVER)
    try:
        init = client.call("initialize", {"protocolVersion": "2024-11-05"})
        print("connected to:", init["result"]["serverInfo"])
        for t in client.call("tools/list")["result"]["tools"]:
            print(f"  - {t['name']}: {t['description']}")
        reply = client.call("tools/call",
                            {"name": "calculate", "arguments": {"expression": "(12 + 5) * 3"}})
        print("calculate ->", reply["result"]["content"][0]["text"])
    finally:
        client.close()

if __name__ == "__main__":
    main()
python work/mcp_client.py
You’ll see the server’s two tools listed and calculate -> 51. No endpoint touched —
discovering and calling MCP tools is pure plumbing. (Reference: examples/17/mcp_client.py.)
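The client's call method trusts the server to answer every request with exactly one valid line. If the server process has exited, readline returns an empty string and json.loads fails with a confusing message. One way to harden the read side is sketched below; read_response is a hypothetical helper, and io.StringIO stands in for the subprocess pipe so the sketch runs on its own.

```python
import io
import json

def read_response(stream, expected_id):
    """Read one JSON-RPC response line and check that it answers our request."""
    line = stream.readline()
    if line == "":                      # readline() returns "" only at EOF
        raise ConnectionError("server closed its stdout before replying")
    response = json.loads(line)
    if response.get("id") != expected_id:
        raise RuntimeError(f"expected id {expected_id}, got {response.get('id')}")
    if "error" in response:             # protocol-level failure, e.g. unknown method
        raise RuntimeError(response["error"]["message"])
    return response["result"]

pipe = io.StringIO('{"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n')
print(read_response(pipe, 1))           # {'tools': []}
```

Returning the result directly (and raising on errors) is a design choice for this sketch; the course client returns the whole reply and lets the caller pick it apart.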
Bridge MCP tools into the tool-use loop
The payoff: let the model use the server’s tools. The Section 14 loop already does the
hard part — we only change two things. First, build the model’s tool schemas from the
server’s tools/list instead of writing them by hand. Second, when the model asks for a
tool, dispatch the call over MCP (tools/call) instead of to a local function.
Create work/mcp_bridge.py. Reuse the client we just built, and translate one MCP
advertisement into the OpenAI schema from Section 13:
import json, os, sys
from pathlib import Path

sys.path.append(str(Path(__file__).resolve().parents[1]))  # the examples/ dir
sys.path.append(str(Path(__file__).resolve().parent))      # this folder, for mcp_client
from mcp_client import MCPClient

SERVER = str(Path(__file__).resolve().parent / "mcp_server.py")

def to_openai_schema(mcp_tool):
    return {"type": "function", "function": {
        "name": mcp_tool["name"], "description": mcp_tool["description"],
        "parameters": mcp_tool["inputSchema"]}}
The loop is Section 14, with the tool list discovered over MCP and each call dispatched to the server. Note the guard: we never dispatch a name the server didn’t advertise — the server is a trust boundary, and the model’s chosen name is untrusted:
def run_agent(client, model, mcp, question, max_steps=5):
    advertised = mcp.call("tools/list")["result"]["tools"]
    tool_schemas = [to_openai_schema(t) for t in advertised]
    names = {t["name"] for t in advertised}
    messages = [{"role": "user", "content": question}]
    for step in range(max_steps):
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tool_schemas, tool_choice="auto")
        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append({"role": "assistant", "content": msg.content,
                         "tool_calls": [tc.model_dump() for tc in msg.tool_calls]})
        for tc in msg.tool_calls:
            name = tc.function.name
            if name not in names:      # reject tools the server didn't offer
                result = f"error: tool '{name}' not advertised by the server"
            else:
                args = json.loads(tc.function.arguments)
                reply = mcp.call("tools/call", {"name": name, "arguments": args})
                result = reply["result"]["content"][0]["text"]
            print(f"  [step {step}] {name} -> {result}")
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": str(result)})
    return "(stopped: reached max_steps)"
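The inner dispatch step can also be pulled out into a function of its own, which makes it easy to exercise without a model or a server. The sketch below is one possible shape, not the reference implementation: dispatch and FakeMCP are hypothetical names, and two extra checks are assumptions beyond the loop above. They cover arguments that are not valid JSON (json.loads would otherwise raise) and a JSON-RPC error reply, which has no "result" key to index.

```python
import json

def dispatch(mcp, names, name, raw_arguments):
    """Run one model-requested tool call over MCP; always return text for the model."""
    if name not in names:                       # same guard as run_agent
        return f"error: tool '{name}' not advertised by the server"
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as e:           # the model can emit malformed JSON
        return f"error: arguments were not valid JSON ({e.msg})"
    reply = mcp.call("tools/call", {"name": name, "arguments": args})
    if "error" in reply:                        # JSON-RPC error: there is no "result"
        return f"error: {reply['error']['message']}"
    return reply["result"]["content"][0]["text"]

class FakeMCP:
    """Hypothetical stand-in for MCPClient: records calls instead of spawning a server."""
    def __init__(self):
        self.calls = []
    def call(self, method, params=None):
        self.calls.append((method, params))
        return {"jsonrpc": "2.0", "id": 1,
                "result": {"content": [{"type": "text", "text": "51"}]}}

fake = FakeMCP()
print(dispatch(fake, {"calculate"}, "calculate", '{"expression": "(12 + 5) * 3"}'))  # 51
print(dispatch(fake, {"calculate"}, "rm_rf", "{}"))  # error: tool 'rm_rf' not advertised by the server
print(len(fake.calls))                                                               # 1
```

Every failure becomes a string handed back to the model rather than an exception, so the loop keeps running and the model gets a chance to correct itself.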
The model is the only part that needs the endpoint, so guard it and degrade gracefully — exactly the pattern from Section 16:
def main():
    if not (os.environ.get("OPENAI_BASE_URL") and os.environ.get("OPENAI_API_KEY")):
        print("OPENAI_* not set -- skipping the model bridge (server + client still run).")
        return
    from common import get_client, MODEL
    client = get_client()
    mcp = MCPClient(SERVER)
    try:
        print(run_agent(client, MODEL, mcp, "What is (12 + 5) * 3? Use the calculator tool."))
    finally:
        mcp.close()

if __name__ == "__main__":
    main()
python work/mcp_bridge.py
With creds, the model calls calculate over MCP and answers 51. Notice what didn’t
change: the loop. Swap the server and the same loop drives a different set of tools — that’s
the reuse MCP buys you. (Reference: examples/17/mcp_bridge.py.)
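Swapping in a different server is also where the schema translation's assumptions start to matter. Our server always sends a description, but the MCP tool definition treats that field as optional, so a third-party server may omit it and to_openai_schema would raise a KeyError. A more tolerant variant might look like the sketch below; the function name and the fallback values are assumptions for illustration, not part of the course code.

```python
def to_openai_schema_tolerant(mcp_tool: dict) -> dict:
    """Like to_openai_schema, but fills defaults for fields a server may omit."""
    return {"type": "function", "function": {
        "name": mcp_tool["name"],                        # required; KeyError if missing
        "description": mcp_tool.get("description", ""),  # assumed fallback: empty string
        "parameters": mcp_tool.get("inputSchema",        # assumed fallback: no arguments
                                   {"type": "object", "properties": {}})}}

advertised = {"name": "calculate",
              "inputSchema": {"type": "object",
                              "properties": {"expression": {"type": "string"}},
                              "required": ["expression"]}}
schema = to_openai_schema_tolerant(advertised)
print(schema["function"]["name"])                 # calculate
print(schema["function"]["description"] == "")    # True
```

An empty description gives the model little to go on when choosing a tool, so in practice you might prefer to skip such tools or log a warning instead.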
When MCP earns its keep
MCP is worth it when tools outlive one program: a calculator, a document search, a database gateway you want reused across many apps; a third-party ecosystem of servers you can plug in without writing glue; or a clean separation of concerns where one team owns the tools and another owns the app. For a single script with two local functions (Section 14), MCP is overkill — calling the functions directly is simpler and you should.
In production you wouldn’t hand-roll the protocol: reach for the official mcp Python
SDK, which handles the transport, schema validation, auth, and the resources/prompts we
skipped. We built it by hand here for the same reason Section 1 built raw HTTP before the
SDK — so you know what the library is doing for you.
Security: an MCP server is third-party code at the end of a connection. Treat its tool list, its arguments, and its outputs as untrusted (Section 20), scope its credentials narrowly, and execute anything it runs behind the isolation from Sections 15–16.
Challenges
- A third tool. Add a get_length(text) tool to the server (function + registry entry). Run mcp_client.py without touching it. Success: the new tool shows up in tools/list and a tools/call returns its result.
- Reject an unadvertised tool. The bridge already guards against names the server never offered. Prove it: synthesize a fake tool call for a made-up name (e.g. feed run_agent a question that tempts a nonexistent tool, or call the dispatch path with a bogus name). Success: the bogus call returns an error: … not advertised result to the model instead of being dispatched over MCP.
- Sandbox a dangerous tool. Add a run_python(code) tool to the server that runs the code behind the Section 15 sandbox (run_untrusted). Call it with an infinite loop. Success: the timeout kills it and the server returns an error result, so the host is never at risk.
Recap
- MCP standardizes connecting models to tools and data: tools/list ≈ the schemas you already send, tools/call ≈ running the function. It builds on tool calling (Sections 13–14) rather than replacing it.
- The transport is plain JSON-RPC over stdio: a server, a client, and line-delimited messages. The server and client run with no endpoint; only the model bridge needs one.
- A server is a trust boundary: scope credentials, reject tools it didn’t advertise, validate I/O (Section 20), and sandbox anything it runs (Sections 15–16).
- Hand-rolling shows what MCP is; in production use the official mcp SDK.
Next
Section 18 — Embeddings: we switch from acting to meaning — turning text into vectors with the embeddings endpoint and measuring similarity by hand, the foundation for retrieval.