fix(website): avoid exponential pandoc parse on sidebar titles #14638
Merged
The navigation envelope concentrated every injected inline title into a single hidden markdown paragraph. Pandoc's markdown reader backtracks exponentially over unresolved emphasis candidates inside consecutive bracketed inlines (jgm/pandoc#11687), so a site whose page titles merely contain double-underscore names (e.g. `Class.__method__()`) hung on every page render: ~20 such titles pushed parse time past 180s, with nothing user-visible to debug. Joining the spans with a blank line puts each title in its own paragraph, so pandoc parses each independently and parse time stays linear. This keeps markdown processing intact (shortcodes, icons, bold/code in titles still render), unlike escaping or raw-wrapping the injected content.
When a website sidebar lists many pages whose titles contain double-underscore (dunder) names, e.g. `Transcript.__getitem__()`, rendering hangs. A site with ~20 such titles pushes a single page render past 180s, with nothing user-visible to debug.
Root Cause
Sidebar titles are rendered through the navigation envelope: each title becomes a hidden inline span, the spans are concatenated into one markdown document, Pandoc renders it, and the results are read back (`src/core/markdown-pipeline.ts`). Every title landed in a single paragraph of consecutive bracketed inlines. Pandoc's markdown reader backtracks exponentially over the unresolved `_`-emphasis candidates across consecutive bracketed inlines (jgm/pandoc#11687), so parse time blows up with the number of dunder-bearing titles.
Fix
Join the spans with a blank line so each title is its own paragraph. Pandoc parses each independently and parse time stays linear. This keeps markdown processing in titles intact (shortcodes, icons, bold, code), unlike escaping or raw-wrapping the injected content — which the upstream thread rejected for breaking exactly those.
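For illustration only, a minimal sketch of the difference between the old and new joining behaviour. The names `InlineTitle`, `hiddenSpan`, `joinSingleParagraph`, and `joinSeparateParagraphs`, and the `render-id` attribute, are hypothetical and are not taken from `src/core/markdown-pipeline.ts`; only the blank-line separator reflects the change described above.

```typescript
// Hypothetical shape of one injected sidebar title (an assumption for this sketch).
interface InlineTitle {
  id: string;
  markdown: string;
}

// Wrap a title in a hidden bracketed span using Pandoc's bracketed-span syntax.
// The class and attribute names here are illustrative.
function hiddenSpan(title: InlineTitle): string {
  return `[${title.markdown}]{.hidden render-id="${title.id}"}`;
}

// Old behaviour (sketch): a single newline keeps every span in ONE paragraph,
// so the reader sees all unresolved `_` candidates together.
function joinSingleParagraph(titles: InlineTitle[]): string {
  return titles.map(hiddenSpan).join("\n");
}

// New behaviour (sketch): a blank line makes each span its OWN paragraph,
// so each title is parsed independently of the others.
function joinSeparateParagraphs(titles: InlineTitle[]): string {
  return titles.map(hiddenSpan).join("\n\n");
}

// Count markdown paragraphs by splitting on blank lines.
function paragraphCount(markdown: string): number {
  return markdown.split(/\n{2,}/).filter((p) => p.trim() !== "").length;
}
```

With two titles such as `Transcript.__getitem__()` and `**Bold** title`, `paragraphCount` is 1 for the single-paragraph join and 2 for the blank-line join. The title markdown itself is left untouched in both cases, which is why bold, code, and shortcodes keep rendering.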
Tests
A unit test asserts the envelope separates inline spans with a blank line. A site smoke test generates the project on the fly (many dunder titles plus bold/code titles), renders it, and asserts titles render correctly with no spurious emphasis; it carries a render-time budget (a new optional `timeout` on the test command) so a regression fails fast instead of hanging the suite.
Fixes #14576