OAuth client: harden SEP-2352/SEP-2350 edge cases; fix conformance comment#2936
OAuth client: harden SEP-2352/SEP-2350 edge cases; fix conformance comment#2936maxisbey wants to merge 2 commits into
Conversation
When --spec-version is omitted, the conformance harness picks the version per-scenario (LATEST_SPEC_VERSION for active scenarios, DRAFT_PROTOCOL_VERSION for draft-only ones), not a flat LATEST_SPEC_VERSION. The variable is still log-only today, but the stateless 2026 client path will branch on it.
Three follow-up fixes to the SEP-2352 / SEP-2350 work: - Backfill an omitted token-response `scope` with the requested scope (RFC 6749 §5.1/§6) before persisting, so the SEP-2350 step-up union still recovers prior grants after a process restart when the authorization server omits `scope` because granted == requested. Applied to both the authorization-code and refresh response handlers. - Clear cached `oauth_metadata` when the SEP-2352 issuer-mismatch block discards bound credentials, so a subsequent ASM-discovery failure for the new authorization server cannot leave the old server's registration/token endpoints in place for Step 4. - Re-evaluate the SEP-2352 issuer-binding check against `oauth_metadata.issuer` after ASM discovery on the legacy no-PRM path (PRM discovery failed, AS found via root well-known fallback), mirroring the existing stamping-side fallback so stale credentials are still discarded when the binding can only be learned post-ASM.
| logger.debug("Authorization server changed; discarding bound credentials and re-registering") | ||
| self.context.client_info = None | ||
| self.context.clear_tokens() | ||
| # Any cached AS metadata is for the old server; drop it so a failed | ||
| # rediscovery cannot leak the old registration/token endpoints into Step 4. | ||
| self.context.oauth_metadata = None | ||
|
|
||
| asm_discovery_urls = build_oauth_authorization_server_metadata_discovery_urls( | ||
| self.context.auth_server_url, self.context.server_url |
There was a problem hiding this comment.
🟡 When the issuer changes but ASM discovery for the new AS fails in the same flow, Step 4 still falls back to urljoin(resource-origin, '/register') and stamps client_information.issuer = auth_server_url (the new AS) on credentials actually registered with the old embedded AS — persisting exactly the mis-bound record the oauth_metadata = None clear is meant to prevent, and credentials_match_issuer will then accept it on every later 401 with no auto-recovery. Consider failing the flow (or skipping the issuer stamp / the legacy /register fallback) when the issuer changed but the new AS's metadata could not be discovered.
Extended reasoning...
The residual gap. The new self.context.oauth_metadata = None (oauth2.py:594-596) correctly stops a stale discovered registration_endpoint from being reused after an issuer change. But it only changes where the fallback registration goes, not the fact that a registration performed against an endpoint never derived from the new issuer is still stamped with the new issuer. If ASM discovery for the new AS fails in the same flow, Step 4 registers at the resource origin's legacy /register and binds that record to the new issuer, persisting it via storage.set_client_info before token exchange.
Concrete walkthrough (PRM-success branch, single 401 flow):
- Stored
client_infois bound to issuer A — the resource server's old embedded AS athttps://api.example.com/. The server has migrated: PRM now advertises AS B (https://new-as.example.com/), soauth_server_url = B(oauth2.py:576). - Lines 584-596:
credentials_match_issuerfails →client_info/tokens discarded,oauth_metadatacleared. So far so good. - Step 2 ASM discovery for B fails —
handle_auth_metadata_responsereturns(True, None)for 404s and(False, None)for non-4xx (utils.py:223-233), so the loop exits withoauth_metadatastillNone; no exception. The new post-ASM re-check (lines 619-633) is gated onauth_server_url is None, so it does not run here. - Step 4:
bound_issuer = auth_server_url = B(line 649). Withoauth_metadata is None,should_use_client_metadata_urlis False andcreate_client_registration_request(utils.py:284-287) falls back tourljoin("https://api.example.com", "/register")— in the single-origin-migration case, the old embedded AS, which still serves a live DCR endpoint. - Registration succeeds against the old AS, but
client_information.issuer = Bis stamped (line 674) and persisted viastorage.set_client_infobefore the authorization/token exchange, so the mis-bound record lands in storage even if the rest of the flow fails. (The same flow then also hits/authorizeand/tokenon the resource origin via the legacy fallbacks in_perform_authorization_code_grant/_get_token_endpoint, despite the resource explicitly advertising B.) - On every later 401, once B's metadata is discoverable again,
credentials_match_issuer(client_info, B, ...)is a simple string compare against the stamped issuer (utils.py:341-345) and returns True — so the SDK keeps presenting AS A'sclient_idto AS B indefinitely with no automatic re-registration. That sticky wedge is exactly what SEP-2352 binding was meant to auto-recover from, and the PR description's second bullet says this change prevents "persisting a mis-bound record".
Why existing safeguards don't catch it. The metadata clear stops the old AS's discovered registration_endpoint from leaking into Step 4, but the legacy origin-/register fallback reaches the same old AS in the common case where the old AS lived at the resource origin; and bound_issuer is taken from auth_server_url regardless of whether the registration endpoint was actually derived from that issuer.
On the pre-existing/scope objection. It's true that both ingredients (the origin-/register fallback and the bound_issuer = auth_server_url stamping from #2933) predate this PR, that the trigger is narrow (an AS migration plus complete well-known discovery failure for the new AS plus a still-live legacy /register at the resource origin, all in one flow), and that this PR strictly narrows the window rather than widening it. This is filed as a non-blocking nit for that reason. It still seems worth a comment because the PR touches this exact block and its stated goal for this change is to stop persisting a mis-bound record when rediscovery for the new AS fails — this is a second route to that same record that remains open.
Suggested fix. When the issuer changed (the line-591 block fired) but oauth_metadata is still None after Step 2, either fail the flow with an OAuthFlowError, or skip the legacy /register fallback / skip stamping issuer so a registration not derived from the new AS's metadata is never recorded as bound to it. Any of these keeps the auto-recovery property intact: a later 401 with working discovery would re-register cleanly instead of being wedged on the wrong client_id.
Three small follow-ups to #2933 / #2931 plus a comment correction from #2911.
Motivation and Context
Review of the merged SEP-2352 and SEP-2350 implementations turned up a few narrow paths where the new guarantees don't hold:
auth_server_url. When PRM discovery fails and the AS is found via the root/.well-known/oauth-authorization-serverfallback, the recorded issuer is never compared, so credentials bound to a previous AS are reused. The stamping side already falls back tooauth_metadata.issuerin this case; this adds the matching comparison.oauth_metadatafrom the old AS is left in place. If discovery for the new AS then fails, dynamic client registration runs against the old server'sregistration_endpointwhile stamping the new issuer, persisting a mis-bound record.scopefrom the token response when granted scope equals requested scope. After a process restart the reloaded token then hasscope=Noneandclient_metadata.scopehas reverted to its constructor value, so the step-up union collapses to just the challenged scope. Backfilling the requested scope before persisting (and carrying the prior scope forward on refresh, per §6) keeps the stored token self-describing.The conformance-client comment correction is doc-only: the harness picks
LATEST_SPEC_VERSIONvsDRAFT_PROTOCOL_VERSIONper-scenario when--spec-versionis omitted, not a flatLATEST_SPEC_VERSION. The value is still log-only today.How Has This Been Tested?
Three new tests in
tests/client/test_auth.py:test_handle_token_response_backfills_omitted_scope_from_requesttest_handle_refresh_response_carries_prior_scope_when_response_omits_ittest_issuer_binding_re_evaluated_after_asm_when_prm_discovery_failed— drives the auth-flow generator through PRM 404×2 → root ASM 200 with a mismatched issuer and asserts the next yield is a DCR request, not the stale client's authorize redirectThe existing
tests/interaction/auth/test_lifecycle.pySEP-2352 migration test exercises theoauth_metadata = Noneclear on the PRM-success path.Breaking Changes
None.
Types of changes
Checklist
Additional context
The 403 step-up path also skips PRM/ASM discovery entirely, so after a restart it can fall back to
/authorizeand/tokenon the resource server's origin rather than the AS. That fix needs the discovery sequence factored into a sub-generator both the 401 and 403 branches can drive — out of scope for this PR.AI Disclaimer