Slim ServerMiddleware to (ctx, call_next) and add OpenTelemetryMiddleware#2941
Slim ServerMiddleware to (ctx, call_next) and add OpenTelemetryMiddleware#2941Kludex wants to merge 1 commit into
(ctx, call_next) and add OpenTelemetryMiddleware#2941Conversation
…dleware` Move `method` and `params` onto `ServerRequestContext` so context-tier middleware reads `ctx.method`/`ctx.params` instead of separate positional args. `CallNext` now takes the context, so middleware can rewrite the inbound message with `call_next(replace(ctx, params=...))`. Add a context-tier `OpenTelemetryMiddleware` alongside the existing dispatch-tier `otel_middleware`, which is left intact.
There was a problem hiding this comment.
I didn't realize I created those tests. I don't think they are necessary. We already have tests for otel.
| at call time, so a middleware can rewrite it for the rest of the chain. | ||
| """ | ||
| call = inner | ||
| call: CallNext = inner |
| def _apply_middleware( | ||
| mw: ServerMiddleware[Any], call_next: CallNext, ctx: ServerRequestContext[Any, Any] | ||
| ) -> Awaitable[HandlerResult]: | ||
| """Adapt one middleware to the `CallNext` shape: bind `call_next`, take | ||
| `ctx` at call time so a rewritten context flows down the chain.""" | ||
| return mw(ctx, call_next) |
There was a problem hiding this comment.
Is this necessary?
Also, we should stop using "mw" in this code source!!! I don't need to guess variable names!
| raise MCPError.from_error_data(result) | ||
| return result | ||
|
|
||
| call = self._compose_server_middleware(ctx, method, params, _inner) | ||
| result = _dump_result(await call()) | ||
| call = self._compose_server_middleware(_inner) | ||
| result = _dump_result(await call(ctx)) | ||
| # TODO: reject resultType values outside {"complete", "input_required"} unless the | ||
| # corresponding extension is in this request's _meta clientCapabilities.extensions; the | ||
| # explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively. |
There was a problem hiding this comment.
🔴 The post-chain initialize state commit (if method == "initialize": ... self._negotiate_initialize(params)) reads the method/params locals captured before the middleware chain ran, while _inner and _handle_initialize use the possibly-rewritten ctx.method/ctx.params. A middleware that rewrites the initialize message via call_next(replace(ctx, params=...)) — the rewrite capability this PR introduces — gets an InitializeResult negotiated from the rewritten params while connection.protocol_version/client_params are committed from the originals, leaving connection state and the wire response disagreeing for the rest of the connection. The commit (and the is_spec_method/serialization keying) should use the final ctx, not the pre-chain locals.
Extended reasoning...
The bug
Inside _inner, the PR deliberately re-reads the message off the context — method, params = ctx.method, ctx.params — so a middleware that rewrites the inbound message via await call_next(replace(ctx, params=...)) is honored: self._handle_initialize(params) builds the InitializeResult (including the negotiated protocolVersion) from the rewritten params. But after the chain returns, the state commit at the bottom of _on_request runs:
if method == "initialize":
self.connection.client_params, self.connection.protocol_version = self._negotiate_initialize(params)Here method and params are the outer locals captured before call = self._compose_server_middleware(_inner) ran. The rewritten context is a new object created by dataclasses.replace() inside the middleware and never escapes the chain — the composed CallNext returns only the HandlerResult — so the commit cannot see the rewritten values.
Why nothing else prevents it
Rewriting initialize is squarely within the contract this PR advertises: the migration doc and the ServerMiddleware docstring both say middleware wraps initialize and may rewrite method/params with replace(ctx, ...). There is no guard or documentation saying initialize must not be rewritten, and the new tests only exercise rewriting tools/call arguments (test_passes_rewritten_context_through), so this seam is untested. The in-chain comment ("Read method/params off ctx so a middleware that rewrote them ... reaches lookup and the handler") shows rewrites are intended to be effective; the post-chain commit was simply left on the pre-rewrite locals.
Impact
The wire response and the committed connection state disagree: every subsequent _resolve_protocol_version call and the outbound serialize_server_result sieve key off the stale connection.protocol_version, and connection.client_params (e.g. clientInfo, capabilities) reflects values the handler never saw — silently, for the lifetime of the connection. The same staleness applies to the if method == "initialize" guard itself (a middleware rewriting ctx.method to or away from initialize makes the commit run, or not, inconsistently with what the handler actually did) and to the is_spec_method/method used for result serialization.
Step-by-step proof
- Register a middleware that clamps the protocol version:
return await call_next(replace(ctx, params={**(ctx.params or {}), "protocolVersion": "2025-03-26"})). - Client sends
initializewithprotocolVersion: "2025-11-25". _innerreads the rewrittenctx.params→_handle_initializenegotiates and returnsInitializeResult(protocol_version="2025-03-26")to the client.- Back in
_on_request, the commit callsself._negotiate_initialize(params)with the original params →connection.protocol_version = "2025-11-25". - The client now believes the session is at 2025-03-26 while the server validates requests, sieves results, and gates spec methods at 2025-11-25 for every later message — connection state and the handshake the client received permanently disagree.
Fix
Key the post-chain commit off the final context rather than the pre-chain locals — e.g. perform the commit inside _inner (where the rewritten ctx is in scope; the chain-success-only semantics can be preserved by committing just before returning the handler result, or by surfacing the final ctx out of the chain), or have _inner stash the final ctx.method/ctx.params for the post-chain block to use.
| with otel_span( | ||
| name=f"MCP handle {ctx.method}{f' {target}' if target else ''}", | ||
| kind=SpanKind.SERVER, | ||
| attributes=attributes, | ||
| context=extract_trace_context(ctx.meta or {}), | ||
| record_exception=False, | ||
| set_status_on_exception=False, | ||
| ) as span: |
There was a problem hiding this comment.
🔴 When the inbound message has no _meta/traceparent, extract_trace_context(ctx.meta or {}) returns a fresh empty Context, and passing that explicit context to otel_span overrides the ambient current context — so the span becomes an orphaned trace root instead of parenting to the already-current span (e.g. the dispatch-tier otel_middleware span that Server.run() installs by default). This diverges from the dispatch-tier middleware this class mirrors, which passes parent=None when params has no _meta. Pass context=None (or only pass the extracted context when ctx.meta actually carries trace headers) so the no-traceparent case attaches to the ambient context.
Extended reasoning...
The bug. OpenTelemetryMiddleware.__call__ always calls otel_span(..., context=extract_trace_context(ctx.meta or {})). extract_trace_context calls opentelemetry.propagate.extract(carrier); the W3C TraceContextTextMapPropagator creates a fresh empty Context() when no context argument is supplied and returns it unchanged when the carrier has no traceparent. So whenever the client does not propagate trace context in _meta (i.e. every non-OTel-instrumented client), the middleware starts its span with an explicit empty Context rather than context=None. In the OTel SDK, Tracer.start_span resolves the parent via trace.get_current_span(context): with an explicit context the ambient/current context is ignored, the lookup yields INVALID_SPAN, and the new span becomes a brand-new trace root. Only context=None falls back to the ambient current span.\n\nThe code path that triggers it. Server.run() (src/mcp/server/lowlevel/server.py:431) unconditionally installs the dispatch-tier otel_middleware in dispatch_middleware. That middleware wraps _on_request with start_as_current_span in the same task, so its span is the current ambient span around the entire context-tier middleware chain. A user who appends the new OpenTelemetryMiddleware to server.middleware and is hit by a client that sends no _meta therefore gets a context-tier span that is an orphaned root in a separate trace, instead of a child of the dispatch-tier (or any other ambient transport/ASGI) span.\n\nStep-by-step proof.\n1. A non-OTel client sends tools/call with no _meta.\n2. The composed dispatch chain runs otel_middleware: params has no _meta, so it passes parent=None, and start_as_current_span makes span A ("MCP handle tools/call ...") the current span.\n3. Inside span A, _on_request builds ctx with meta=None and runs the context-tier chain. OpenTelemetryMiddleware computes extract_trace_context(ctx.meta or {}) = extract({}) → a fresh empty Context().\n4. otel_span forwards that explicit empty context to start_as_current_span. The SDK resolves the parent from that context → INVALID_SPAN → span B is created as a new trace root with a new trace_id.\n5. Result: the same request produces two disjoint traces (A and B) instead of B nesting under A. With context=None in step 4, B would have parented to A.\n\nWhy existing code/tests don't catch it. test_extracts_trace_context_from_meta always injects a traceparent (the in-SDK test client path), and the notification test never asserts on span.parent, so the no-traceparent parenting behavior is unpinned.\n\nOn the counter-argument that this is the conventional server-instrumentation pattern. ASGI/WSGI instrumentations do attach the extracted (possibly empty) context, but they sit at the outermost edge of the process where there is no meaningful ambient span to detach from — that trade-off doesn't apply here, where the SDK itself installs an enclosing dispatch-tier span by default. It's also true that the dispatch-tier middleware extracts an explicit context whenever _meta is present even without a traceparent; but the divergence at issue is the no-_meta case (the common case for external clients), where the dispatch-tier middleware deliberately passes parent=None and this class — whose docstring says it mirrors that span shape — does not. The duplicate "MCP handle" spans when both tiers are enabled are documented and fine; what's broken is that they land in two unrelated traces rather than nesting, which is a telemetry-correctness defect in the very feature this PR adds, not merely a span-shape preference.\n\nFix. Pass context=None when ctx.meta carries no trace headers — e.g. only call extract_trace_context when ctx.meta contains traceparent (mirroring the dispatch-tier match on _meta), or change extract_trace_context to return None for an empty/header-less carrier. Impact is telemetry-only (no request-handling breakage), but it should be fixed in this PR.
What
Reshapes the context-tier
ServerMiddlewareinterface and adds a context-tierOpenTelemetryMiddleware.ServerMiddleware.__call__goes from(ctx, method, params, call_next)to(ctx, call_next).methodandparamsnow live onServerRequestContextasctx.methodand rawctx.params.CallNexttakes the context:Callable[[ServerRequestContext], Awaitable[HandlerResult]]. Middleware passes it through withcall_next(ctx), and can rewrite the inbound message viacall_next(replace(ctx, params=...)).OpenTelemetryMiddleware(context-tier) inmcp/server/_otel.py, mirroring the span shape of the existing dispatch-tierotel_middleware, which is left intact. It spans both requests and notifications, settingjsonrpc.request.idonly when present.Why
The 4-arg signature duplicated
method/paramsthat already belong on the context, andcall_next()could only observe, not rewrite. The new shape matches the pure-ASGI middleware model (single context in, pass it through, mutate before next) and lets middleware redirect params before the handler runs.Notes
ctx.paramsis deliberate: middleware runs before method lookup and validation (it wraps the failure path too), so no single validated type exists at that tier. Authors validate on demand withModel.model_validate(ctx.params or {}).docs/migration.md(no compat shim - the 4-arg form never shipped in a release).request_id is Nonepath was confirmed correct.AI Disclaimer
This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.