Skip to content

Slim ServerMiddleware to (ctx, call_next) and add OpenTelemetryMiddleware#2941

Open
Kludex wants to merge 1 commit into
mainfrom
server-middleware-ctx-call-next
Open

Slim ServerMiddleware to (ctx, call_next) and add OpenTelemetryMiddleware#2941
Kludex wants to merge 1 commit into
mainfrom
server-middleware-ctx-call-next

Conversation

@Kludex

@Kludex Kludex commented Jun 21, 2026

Copy link
Copy Markdown
Member

What

Reshapes the context-tier ServerMiddleware interface and adds a context-tier OpenTelemetryMiddleware.

  • ServerMiddleware.__call__ goes from (ctx, method, params, call_next) to (ctx, call_next). method and params now live on ServerRequestContext as ctx.method and raw ctx.params.
  • CallNext takes the context: Callable[[ServerRequestContext], Awaitable[HandlerResult]]. Middleware passes it through with call_next(ctx), and can rewrite the inbound message via call_next(replace(ctx, params=...)).
  • New OpenTelemetryMiddleware (context-tier) in mcp/server/_otel.py, mirroring the span shape of the existing dispatch-tier otel_middleware, which is left intact. It spans both requests and notifications, setting jsonrpc.request.id only when present.

Why

The 4-arg signature duplicated method/params that already belong on the context, and call_next() could only observe, not rewrite. The new shape matches the pure-ASGI middleware model (single context in, pass it through, mutate before next) and lets middleware redirect params before the handler runs.

Notes

  • Raw ctx.params is deliberate: middleware runs before method lookup and validation (it wraps the failure path too), so no single validated type exists at that tier. Authors validate on demand with Model.model_validate(ctx.params or {}).
  • Documented in docs/migration.md (no compat shim - the 4-arg form never shipped in a release).
  • Reviewed with Codex; the notification request_id is None path was confirmed correct.

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

…dleware`

Move `method` and `params` onto `ServerRequestContext` so context-tier
middleware reads `ctx.method`/`ctx.params` instead of separate positional
args. `CallNext` now takes the context, so middleware can rewrite the
inbound message with `call_next(replace(ctx, params=...))`.

Add a context-tier `OpenTelemetryMiddleware` alongside the existing
dispatch-tier `otel_middleware`, which is left intact.
Comment thread tests/server/test_otel.py

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize I created those tests. I don't think they are necessary. We already have tests for otel.

Comment thread src/mcp/server/runner.py
at call time, so a middleware can rewrite it for the rest of the chain.
"""
call = inner
call: CallNext = inner

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary type hint.

Comment thread src/mcp/server/runner.py
Comment on lines +172 to +177
def _apply_middleware(
mw: ServerMiddleware[Any], call_next: CallNext, ctx: ServerRequestContext[Any, Any]
) -> Awaitable[HandlerResult]:
"""Adapt one middleware to the `CallNext` shape: bind `call_next`, take
`ctx` at call time so a rewritten context flows down the chain."""
return mw(ctx, call_next)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

Also, we should stop using "mw" in this code source!!! I don't need to guess variable names!

Comment thread src/mcp/server/runner.py
Comment on lines 293 to 300
raise MCPError.from_error_data(result)
return result

call = self._compose_server_middleware(ctx, method, params, _inner)
result = _dump_result(await call())
call = self._compose_server_middleware(_inner)
result = _dump_result(await call(ctx))
# TODO: reject resultType values outside {"complete", "input_required"} unless the
# corresponding extension is in this request's _meta clientCapabilities.extensions; the
# explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The post-chain initialize state commit (if method == "initialize": ... self._negotiate_initialize(params)) reads the method/params locals captured before the middleware chain ran, while _inner and _handle_initialize use the possibly-rewritten ctx.method/ctx.params. A middleware that rewrites the initialize message via call_next(replace(ctx, params=...)) — the rewrite capability this PR introduces — gets an InitializeResult negotiated from the rewritten params while connection.protocol_version/client_params are committed from the originals, leaving connection state and the wire response disagreeing for the rest of the connection. The commit (and the is_spec_method/serialization keying) should use the final ctx, not the pre-chain locals.

Extended reasoning...

The bug

Inside _inner, the PR deliberately re-reads the message off the context — method, params = ctx.method, ctx.params — so a middleware that rewrites the inbound message via await call_next(replace(ctx, params=...)) is honored: self._handle_initialize(params) builds the InitializeResult (including the negotiated protocolVersion) from the rewritten params. But after the chain returns, the state commit at the bottom of _on_request runs:

if method == "initialize":
    self.connection.client_params, self.connection.protocol_version = self._negotiate_initialize(params)

Here method and params are the outer locals captured before call = self._compose_server_middleware(_inner) ran. The rewritten context is a new object created by dataclasses.replace() inside the middleware and never escapes the chain — the composed CallNext returns only the HandlerResult — so the commit cannot see the rewritten values.

Why nothing else prevents it

Rewriting initialize is squarely within the contract this PR advertises: the migration doc and the ServerMiddleware docstring both say middleware wraps initialize and may rewrite method/params with replace(ctx, ...). There is no guard or documentation saying initialize must not be rewritten, and the new tests only exercise rewriting tools/call arguments (test_passes_rewritten_context_through), so this seam is untested. The in-chain comment ("Read method/params off ctx so a middleware that rewrote them ... reaches lookup and the handler") shows rewrites are intended to be effective; the post-chain commit was simply left on the pre-rewrite locals.

Impact

The wire response and the committed connection state disagree: every subsequent _resolve_protocol_version call and the outbound serialize_server_result sieve key off the stale connection.protocol_version, and connection.client_params (e.g. clientInfo, capabilities) reflects values the handler never saw — silently, for the lifetime of the connection. The same staleness applies to the if method == "initialize" guard itself (a middleware rewriting ctx.method to or away from initialize makes the commit run, or not, inconsistently with what the handler actually did) and to the is_spec_method/method used for result serialization.

Step-by-step proof

  1. Register a middleware that clamps the protocol version: return await call_next(replace(ctx, params={**(ctx.params or {}), "protocolVersion": "2025-03-26"})).
  2. Client sends initialize with protocolVersion: "2025-11-25".
  3. _inner reads the rewritten ctx.params_handle_initialize negotiates and returns InitializeResult(protocol_version="2025-03-26") to the client.
  4. Back in _on_request, the commit calls self._negotiate_initialize(params) with the original params → connection.protocol_version = "2025-11-25".
  5. The client now believes the session is at 2025-03-26 while the server validates requests, sieves results, and gates spec methods at 2025-11-25 for every later message — connection state and the handshake the client received permanently disagree.

Fix

Key the post-chain commit off the final context rather than the pre-chain locals — e.g. perform the commit inside _inner (where the rewritten ctx is in scope; the chain-success-only semantics can be preserved by committing just before returning the handler result, or by surfacing the final ctx out of the chain), or have _inner stash the final ctx.method/ctx.params for the post-chain block to use.

Comment thread src/mcp/server/_otel.py
Comment on lines +31 to +38
with otel_span(
name=f"MCP handle {ctx.method}{f' {target}' if target else ''}",
kind=SpanKind.SERVER,
attributes=attributes,
context=extract_trace_context(ctx.meta or {}),
record_exception=False,
set_status_on_exception=False,
) as span:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 When the inbound message has no _meta/traceparent, extract_trace_context(ctx.meta or {}) returns a fresh empty Context, and passing that explicit context to otel_span overrides the ambient current context — so the span becomes an orphaned trace root instead of parenting to the already-current span (e.g. the dispatch-tier otel_middleware span that Server.run() installs by default). This diverges from the dispatch-tier middleware this class mirrors, which passes parent=None when params has no _meta. Pass context=None (or only pass the extracted context when ctx.meta actually carries trace headers) so the no-traceparent case attaches to the ambient context.

Extended reasoning...

The bug. OpenTelemetryMiddleware.__call__ always calls otel_span(..., context=extract_trace_context(ctx.meta or {})). extract_trace_context calls opentelemetry.propagate.extract(carrier); the W3C TraceContextTextMapPropagator creates a fresh empty Context() when no context argument is supplied and returns it unchanged when the carrier has no traceparent. So whenever the client does not propagate trace context in _meta (i.e. every non-OTel-instrumented client), the middleware starts its span with an explicit empty Context rather than context=None. In the OTel SDK, Tracer.start_span resolves the parent via trace.get_current_span(context): with an explicit context the ambient/current context is ignored, the lookup yields INVALID_SPAN, and the new span becomes a brand-new trace root. Only context=None falls back to the ambient current span.\n\nThe code path that triggers it. Server.run() (src/mcp/server/lowlevel/server.py:431) unconditionally installs the dispatch-tier otel_middleware in dispatch_middleware. That middleware wraps _on_request with start_as_current_span in the same task, so its span is the current ambient span around the entire context-tier middleware chain. A user who appends the new OpenTelemetryMiddleware to server.middleware and is hit by a client that sends no _meta therefore gets a context-tier span that is an orphaned root in a separate trace, instead of a child of the dispatch-tier (or any other ambient transport/ASGI) span.\n\nStep-by-step proof.\n1. A non-OTel client sends tools/call with no _meta.\n2. The composed dispatch chain runs otel_middleware: params has no _meta, so it passes parent=None, and start_as_current_span makes span A ("MCP handle tools/call ...") the current span.\n3. Inside span A, _on_request builds ctx with meta=None and runs the context-tier chain. OpenTelemetryMiddleware computes extract_trace_context(ctx.meta or {}) = extract({}) → a fresh empty Context().\n4. otel_span forwards that explicit empty context to start_as_current_span. The SDK resolves the parent from that context → INVALID_SPAN → span B is created as a new trace root with a new trace_id.\n5. Result: the same request produces two disjoint traces (A and B) instead of B nesting under A. With context=None in step 4, B would have parented to A.\n\nWhy existing code/tests don't catch it. test_extracts_trace_context_from_meta always injects a traceparent (the in-SDK test client path), and the notification test never asserts on span.parent, so the no-traceparent parenting behavior is unpinned.\n\nOn the counter-argument that this is the conventional server-instrumentation pattern. ASGI/WSGI instrumentations do attach the extracted (possibly empty) context, but they sit at the outermost edge of the process where there is no meaningful ambient span to detach from — that trade-off doesn't apply here, where the SDK itself installs an enclosing dispatch-tier span by default. It's also true that the dispatch-tier middleware extracts an explicit context whenever _meta is present even without a traceparent; but the divergence at issue is the no-_meta case (the common case for external clients), where the dispatch-tier middleware deliberately passes parent=None and this class — whose docstring says it mirrors that span shape — does not. The duplicate "MCP handle" spans when both tiers are enabled are documented and fine; what's broken is that they land in two unrelated traces rather than nesting, which is a telemetry-correctness defect in the very feature this PR adds, not merely a span-shape preference.\n\nFix. Pass context=None when ctx.meta carries no trace headers — e.g. only call extract_trace_context when ctx.meta contains traceparent (mirroring the dispatch-tier match on _meta), or change extract_trace_context to return None for an empty/header-less carrier. Impact is telemetry-only (no request-handling breakage), but it should be fixed in this PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant