Chapter 4 of 9 · Model Context Protocol
The handshake
every session begins with a negotiation. get it wrong and nothing works.
A host process starts. A server process starts. Neither knows what the other can do, what version of the protocol the other speaks, or whether either is even ready to take a request. Before any tool gets called, before any resource gets read, three messages cross the wire. Get them wrong and nothing else works.
That three-message exchange is the handshake. Most readers arrive thinking capabilitiesare a permission slip the server hands the host. They aren't. They're a mutual declaration — the AND of what both sides advertised. The opener below makes the misconception specific.
The three messages, named#
The handshake is exactly three messages, in this order:
- client → server · the initialize request, carrying the version the client speaks, who the client is, and what the client can do.
- server → client· the response, paired by id, carrying the server's version, who the server is, and the menu the server offers.
- client → server · a final notification —
notifications/initialized— with no id, no body to speak of, and no reply expected.
Two requests would be one too many. The third message is a notification because the client has nothing more to ask; it's just acknowledging the server's reply so the session can begin. The widget below plays the three in order — the load-bearing visual for the rest of the chapter.
What the initialize request carries#
Three fields land in params on the opening message: protocolVersion, clientInfo, and capabilities. Each is load-bearing in a different way.
protocolVersion
A date string — the version of the spec the client speaks. Today's spec is 2025-06-18; an earlier widely-deployed spec is 2024-11-05. The client puts down what it speaks; the server puts down what it accepts. If the server can't speak the client's version, it returns the version it does support — and the client decides whether to retry, downgrade, or terminate. We come back to this at the end.
clientInfo
Who the client is — a name and a version. { "name": "claude-desktop", "version": "0.9.4" }. Servers use this for diagnostics and for occasional version-gated behaviour (an old client may not handle a newer notification type). It's metadata; it doesn't change what messages are valid.
capabilities
What the client can do — its side of the contract. Client-side capabilities are things the server can ask of the client mid-session: roots (filesystem scope), sampling (server can ask the client's LLM to complete a prompt), elicitation (server can ask the user a question through the client). Each is a promise the client makes about the future shape of the session.
What the response adds#
The server's reply mirrors the request: its protocolVersion, a serverInfo with name and version, and capabilities — but this time, the server's side of the menu: tools, resources, prompts, logging. Each top-level key may carry sub-flags — tools.listChanged is the server saying I will send a notification when my tool list changes; the client's acceptance of the handshake is its agreement to receive that notification.
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"protocolVersion": "2025-06-18",
"serverInfo": {
"name": "filesystem-server",
"version": "1.2.0"
},
"capabilities": {
"tools": { "listChanged": true },
"resources": { "subscribe": true },
"prompts": {}
}
}
}That's the menu. The next widget makes the rule that follows from it tactile.
Capability negotiation — what survives the AND#
Each side advertises its own vocabulary; only what a side declared on its axis survives on that axis. A server that offers tools without declaring it never had it; a client that wants sampling without offering it can't be sampled from. Toggle either side — watch the negotiated session change.
The thing to feel here is that calling tools/list on a server that didn't advertise tools won't hang or ping or get rate-limited — it'll come back -32601 method not found, every time. The handshake is where that future error gets decided.
Why the third message is a notification#
After the server replies, the client sends one more thing: notifications/initialized. It looks like ceremony. It isn't.
The client and server agreed on a version and a capability set in the first two messages; what they haven't agreed on is when the session is open. The server doesn't know if the client is still parsing the response, deciding to terminate, or about to send the next request. The notification is the client's way of saying I have no further question; just acknowledging your reply.A request would expect a response, which would loop. A notification doesn't.
Until the server receives that third message, it SHOULD NOT send any other notifications and MUST NOT send requests other than ping. The notification is the gate. Past it, the negotiated set is live.
Version negotiation, in the margins#
The protocolVersion field is the smallest moving part of the handshake — and the one readers tend to over-think. The rule is short: the client puts down what it speaks, the server replies with the version it supports, and if they don't match the client decides whether to proceed or terminate. Compare:
The diff is small on the surface — an _meta field on capabilities, an optional title on serverInfo, a top-level instructions string the host can surface to the model. The point of the widget isn't the diff; it's the shape — version is a date the spec mints, both sides put down what they know, and the protocol terminates the session if they have nothing in common.
Comprehension check#
A client opens with protocolVersion: "2024-11-05". The server only supports 2025-06-18. Walk through what happens next, in two sentences: what does the server return, and what does the client do?
reveal answer
The server returns its initialize response with protocolVersion: "2025-06-18" — i.e., the version it does support. The client then either retries the handshake with 2025-06-18 (if it can speak it), or terminates the connection. There is no automatic downgrade and no silent fallback; the client owns the decision.
The session is open. Now what?#
Three messages crossed the wire. Both sides agreed on a version and a capability set, and the client's notification said go. The session is stateful from here on — the capabilities are locked, the version is fixed, and every later request leans on what the handshake decided.
Which leaves the question we've been deferring. The server listed tools, resources, prompts. What are these primitives the two sides keep promising each other — and how does the host pick which to invoke? That's chapter 5.