[VPEX][4/8] Add local-env formatting-preserving pyproject.toml merge#5827
Open
rugpanov wants to merge 4 commits into
Open
[VPEX][4/8] Add local-env formatting-preserving pyproject.toml merge#5827rugpanov wants to merge 4 commits into
rugpanov wants to merge 4 commits into
Conversation
Contributor
Waiting for approvalCould not determine reviewers from git history. Eligible reviewers: Suggestions based on git history. See OWNERS for ownership rules. |
Collaborator
Integration test reportCommit: 79113e4
21 interesting tests: 14 SKIP, 7 RECOVERED
Top 5 slowest tests (at least 2 minutes):
|
d867d1f to
64d8996
Compare
ccc19f1 to
92f1dda
Compare
64d8996 to
3fd0bfe
Compare
92f1dda to
b96bf8b
Compare
3fd0bfe to
ce01647
Compare
b96bf8b to
6477a4c
Compare
6477a4c to
f4cc1c7
Compare
797c4ac to
03b4a2b
Compare
56b496d to
7b23583
Compare
03b4a2b to
7eaf270
Compare
7b23583 to
a1644d9
Compare
Fourth in the stacked local-env series. merge.go rewrites only the env-owned sections of a pyproject.toml and preserves every other byte (comments, ordering, whitespace, CRLF). It updates requires-python and the databricks-connect pin in place, and maintains a marker-bracketed managed [tool.uv] constraint block. The operation is idempotent: feeding its own output back in is byte-identical. RenderFreshPyproject produces a complete managed file for a greenfield project. Two correctness properties this file has to get right, both covered by tests that parse the result as TOML rather than asserting on strings: - When the user's pyproject.toml already has a [tool.uv] table with a non-constraint key, the managed constraint-dependencies nests header-less inside that table instead of emitting a second [tool.uv] header (two headers for one table is invalid TOML that uv rejects). - Single- vs multi-line constraint-dependencies detection tracks real bracket depth outside strings and comments, so an opening line that contains a "]" inside an element (e.g. "requests[security]~=2.0") or a trailing comment is not misread as single-line and mis-stripped. Depends on the constraints PR for the Constraints type. Still dormant. Co-authored-by: Isaac
Review of the merge layer found the databricks-connect rewrite was not scoped to [dependency-groups].dev: - It walked every line of [dependency-groups] and rewrote the first databricks-connect element found, so a pin in a sibling group (docs/test) was clobbered instead of the dev entry. It now locates the dev assignment and edits only within that array's line span. - The single-line branch replaced the databricks-connect token anywhere on the dev line, including inside a trailing comment (user content). Replacement is now confined to the array portion (through its closing "]"); the trailing comment is preserved byte-for-byte. - mergeRequiresPython replaced the whole line, dropping an inline comment such as `requires-python = ">=3.10" # maintained by platform team`. It now reattaches the trailing comment, honoring the byte-preservation contract for everything outside the managed value. Adds tests for each: sibling-group untouched, comment not clobbered, inline comment preserved. Co-authored-by: Isaac
Round-2 review found the merge did not tolerate a trailing comment on a table header line (e.g. "[project] # note"). Two consequences: mergeRequiresPython could not find a commented [project] header (managed value silently not updated), and worse, the [dependency-groups] end bound could run past a commented sibling header, so a dev key in a following table was mistaken for [dependency-groups].dev and rewritten. tableHeaderRe now allows a trailing comment and a new headerName helper matches a header by its bracketed name ignoring the comment; both the table lookup and the [tool.uv] attachment check use it. Also documents that line endings are a whole-file property (a CRLF-anywhere file is emitted entirely as CRLF), which is faithful for real single-ending pyproject.toml files. Co-authored-by: Isaac
…ldren Round-3 review found tableHeaderRe did not match TOML array-of-tables headers like "[[tool.uv.index]]". A [tool.uv] table's end bound therefore ran through its [[tool.uv.index]] children, and the header-less managed constraint block could be inserted inside the last index item instead of under [tool.uv], producing wrong or invalid uv config. tableHeaderRe now matches both "[...]" and "[[...]]"; headerName returns the full "[[...]]" token so an array-of-tables header is never treated as the same table as its "[...]" parent. Adds a merge test with a [[tool.uv.index]] child. Co-authored-by: Isaac
7eaf270 to
79890ca
Compare
a1644d9 to
79113e4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
PR 4 of 8 in the stacked
databricks local-env python syncseries (see #5823 for the full plan). Stacked on #5826 — review #5823, #5824, #5826 first; this PR's diff is onlymerge.go+ its test.This is the trickiest self-contained logic in the feature, which is why it gets its own PR.
What this PR contains
merge.go— a formatting-preservingpyproject.tomlmerge. It rewrites only the env-owned sections and preserves every other byte (comments, ordering, whitespace, CRLF). It updatesrequires-pythonand thedatabricks-connectpin in place and maintains a marker-bracketed managed[tool.uv]constraint block. The merge is idempotent — feeding its own output back in is byte-identical.RenderFreshPyprojectproduces a complete managed file for a greenfield project.Two properties worth close review
Both are covered by tests that parse the result as TOML rather than asserting on substrings:
pyproject.tomlalready has a[tool.uv]table with a non-constraint key, the managedconstraint-dependenciesnests header-less inside that table rather than emitting a second[tool.uv]header — two headers for one table is invalid TOML thatuvrejects.bracketDepthDelta), so an opening line containing a]inside an element (e.g."requests[security]~=2.0") or a trailing comment is not misread as single-line and mis-stripped.Dependencies & surface
Depends on the constraints PR (#5826) for the
Constraintstype. Still dormant — nothing imports the package yet.This pull request and its description were written by Isaac.