Regex yes/no-pattern has undocumented implication #151819

@ffrank

Documentation

Apparently, when using a subexpression with the (?(id/name)yes-pattern|no-pattern) syntax that

  • references a previous capture group and
  • omits the optional no-pattern

and the yes-pattern does not match, the referenced capture group ends up not matching either, even though the input suggests that it should. Example:

re.search(r"(<)?\w+(?(1)>)", "<body>")
<re.Match object; span=(0, 6), match='<body>'>

The above matches the opening and closing brackets as expected. However:

re.search(r"(<)?\w+(?(1)>)", "<3")
<re.Match object; span=(1, 2), match='3'>

Only the 3 is matched: the < is not part of the matched substring, and .groups()[0] of the result is None.
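
For anyone who wants to reproduce the two results as a script rather than in the REPL, here is a minimal sketch. The names PATTERN, balanced and unbalanced are mine and only used for illustration; the pattern and inputs are the ones from the examples above.

```python
import re

# Same pattern as in the examples above: optional "<", one or more word
# characters, then ">" only if group 1 took part in the match.
PATTERN = re.compile(r"(<)?\w+(?(1)>)")

balanced = PATTERN.search("<body>")
unbalanced = PATTERN.search("<3")

# Span and captured groups for the input with both brackets.
print(balanced.span(), balanced.groups())      # (0, 6) ('<',)

# Span and captured groups for the input with only the opening bracket.
print(unbalanced.span(), unbalanced.groups())  # (1, 2) (None,)
```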

This is generally helpful, as it is here, because I only want the opening bracket matched if there is also a closing one.

But the description of the (?(id/name)yes-pattern|no-pattern) syntax does not mention this behavior at all. It only describes the effect on the conditional subexpression that tests the first group, not that the behavior of the first group is also affected.
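
One way to narrow down where the behavior comes from is to compare re.search with re.match on the same input. This is only a sketch for investigation, not a claim about what the documentation intends; the variable names are hypothetical. Since re.match only tries position 0 while re.search also tries later start positions, the difference between the two may suggest that the result is related to where the match attempt begins.

```python
import re

# Same pattern as in the report.
PATTERN = re.compile(r"(<)?\w+(?(1)>)")

# re.match only attempts a match at the very start of the string.
anchored = PATTERN.match("<3")

# re.search may also start the attempt at later positions.
searched = PATTERN.search("<3")

print(anchored)           # None
print(searched.start())   # 1
print(searched.group(1))  # None
```

If that reading is right, it would be useful for the docs to say so next to the yes/no-pattern description.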
