Fix IndexError when nested JSON key ends with a trailing backslash #1873
Open
JSap0914 wants to merge 2 commits into
tokenize() in httpie/cli/nested_json/parse.py guarded the backslash branch with `can_advance()`, which tests `cursor < len(source)`. Because we are already inside the `while can_advance()` loop, that guard is always True, so when the input ends with a backslash the code immediately tries to read `source[cursor + 1]` and raises

IndexError: string index out of range

Fix: replace `can_advance()` with `cursor + 1 < len(source)` so the check actually tests for the existence of the *next* character. When no next character exists, the backslash falls through to the else-branch and is appended to the buffer as a literal character, matching the behaviour of KeyValueArgType.tokenize() in argtypes.py.

Adds a regression test that exercises both tokenize() and parse() with a key ending in a backslash without requiring a live server.
Bug
`tokenize()` in `httpie/cli/nested_json/parse.py` guards the backslash branch with `can_advance()`, which evaluates to `cursor < len(source)`. Because we are already inside the `while can_advance():` loop, that guard is always `True` at this point, so when the input string ends with a backslash the code immediately attempts `source[cursor + 1]` and raises an unhandled exception: `IndexError: string index out of range`.

Minimal reproducer (no server needed): `tokenize('key\\')`.
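The failure mode can be sketched outside HTTPie with a simplified stand-in for the loop. The function name `tokenize_buggy` and the rule that any character after a backslash is taken literally are assumptions made for illustration; the real tokenizer produces `Token` objects and handles more cases.

```python
def tokenize_buggy(source: str) -> str:
    """Simplified stand-in for the loop described above (not HTTPie's code)."""
    cursor = 0
    buffer = []

    def can_advance() -> bool:
        # Same test the loop condition uses.
        return cursor < len(source)

    while can_advance():
        char = source[cursor]
        # We are inside `while can_advance()`, so this second call is
        # always True and does not protect the lookup below.
        if char == '\\' and can_advance():
            # Fails when the backslash is the last character.
            buffer.append(source[cursor + 1])
            cursor += 2
        else:
            buffer.append(char)
            cursor += 1
    return ''.join(buffer)
```

In this sketch, an escaped character in the middle of the input is handled, while a trailing backslash such as `'key\\'` reaches `source[cursor + 1]` with no character left to read.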
Fix
Replace `can_advance()` with `cursor + 1 < len(source)` so the check actually tests for the existence of the next character before dereferencing it. When no next character exists, the backslash falls through to the `else` branch and is appended to the buffer as a literal character. This is consistent with the behavior of `KeyValueArgType.tokenize()` in `argtypes.py`, which uses `next(characters, '')` to safely handle the same edge case.
Verification
RED (before fix): `tokenize('key\\')` raises `IndexError: string index out of range`.
GREEN (after fix): `tokenize('key\\')` returns `[Token(kind=TEXT, value='key\\', ...)]` without error.
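The corrected guard can be illustrated with a simplified stand-in as well. `tokenize_fixed` is a hypothetical name, and it returns a plain string rather than HTTPie's `Token` list; only the `cursor + 1 < len(source)` check is taken from the description above.

```python
def tokenize_fixed(source: str) -> str:
    """Simplified sketch of the corrected guard (not HTTPie's code)."""
    cursor = 0
    buffer = []
    while cursor < len(source):
        char = source[cursor]
        # Only treat the backslash as an escape if a next character exists.
        if char == '\\' and cursor + 1 < len(source):
            buffer.append(source[cursor + 1])
            cursor += 2
        else:
            # A trailing backslash lands here and is kept as a literal.
            buffer.append(char)
            cursor += 1
    return ''.join(buffer)


# Trailing backslash no longer raises and is preserved literally.
assert tokenize_fixed('key\\') == 'key\\'
# Escapes in the middle of the input still work in this sketch.
assert tokenize_fixed('a\\[b') == 'a[b'
```

The two assertions mirror the RED/GREEN check: the input that previously raised now returns the key with its backslash intact.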