Download spark_python_task workspace files in bundle generate job #5799

Merged: janniklasrose merged 3 commits into main from janniklasrose/generate-job-file-not-loading on Jul 3, 2026

@janniklasrose

Contributor

Changes

bundle generate job only downloaded notebook tasks. Files referenced by spark_python_task were left as absolute /Workspace/... paths in the generated config, so the source file was never downloaded and the config wasn't portable.

It now downloads workspace files referenced by spark_python_task and rewrites them to a relative path, reusing the same markFileForDownload helper already used for pipeline libraries. Git-sourced files (source: GIT) and cloud URIs (dbfs:/, s3:/, adls:/, gcs:/) are left untouched.
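
The rule described above can be sketched roughly as follows. This is an illustrative approximation only, not the code from this PR: `shouldDownload`, `rewritePythonFile`, and the `relDir` parameter are hypothetical names, and the sketch assumes that any path that is neither Git-sourced nor a cloud URI is a workspace file.

```go
package main

import (
	"path"
	"strings"
)

// cloudSchemes lists the URI prefixes that the description says are left untouched.
var cloudSchemes = []string{"dbfs:/", "s3:/", "adls:/", "gcs:/"}

// shouldDownload reports whether a spark_python_task file looks like a
// workspace file that needs downloading. Hypothetical helper: it assumes
// everything that is not Git-sourced and not a cloud URI is a workspace path.
func shouldDownload(pythonFile, source string) bool {
	if pythonFile == "" || strings.EqualFold(source, "GIT") {
		return false
	}
	for _, scheme := range cloudSchemes {
		if strings.HasPrefix(pythonFile, scheme) {
			return false
		}
	}
	return true
}

// rewritePythonFile returns the path to write into the generated config and
// whether the file should be downloaded. relDir is an assumed relative
// directory (for example "../src") where downloaded files would be placed.
func rewritePythonFile(pythonFile, source, relDir string) (string, bool) {
	if !shouldDownload(pythonFile, source) {
		// Git-sourced files and cloud URIs keep their original value.
		return pythonFile, false
	}
	// Keep only the file name and point the config at the local copy.
	return path.Join(relDir, path.Base(pythonFile)), true
}
```

Under these assumptions, `rewritePythonFile("/Workspace/Users/someone@example.com/etl.py", "WORKSPACE", "../src")` yields `../src/etl.py` and marks the file for download, while `dbfs:/scripts/etl.py` or any file with `source: GIT` is returned unchanged.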

Why

Reported by a user: generating a job with a notebook task and a spark_python_task downloaded only the notebook. The spark_python_task branch was simply never handled in MarkTaskForDownload.

Tests

  • Unit tests in bundle/generate/downloader_test.go covering the download+rewrite path and the skipped cases (cloud URI, source: GIT).
  • New acceptance test acceptance/bundle/generate/spark_python_task_job exercising the full CLI: a workspace-file task is downloaded and rewritten, a dbfs:/ cloud-URI task is preserved. Identical output on both terraform and direct engines.

This PR was written by Isaac, an AI coding agent.

bundle generate job only downloaded notebook tasks; spark_python_task
files were left as absolute /Workspace paths in the generated config.
Download workspace files referenced by spark_python_task and rewrite
them to a relative path, matching notebook handling. Git-sourced files
and cloud URIs (dbfs:/, s3:/, adls:/, gcs:/) are left untouched.

Co-authored-by: Isaac
@eng-dev-ecosystem-bot

eng-dev-ecosystem-bot commented Jul 2, 2026

Collaborator

Integration test report

Commit: 15267a1

Run: 28644418718

| Env | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
| --- | --- | --- | --- | --- | --- |
| 💚 aws linux | 10 | 13 | 230 | 1042 | 4:03 |
| 💚 aws windows | 10 | 13 | 232 | 1040 | 4:05 |
| 💚 aws-ucws linux | 10 | 13 | 314 | 960 | 4:48 |
| 💚 aws-ucws windows | 10 | 13 | 316 | 958 | 4:05 |
| 💚 azure linux | 4 | 15 | 230 | 1041 | 4:11 |
| 💚 azure windows | 4 | 15 | 232 | 1039 | 3:59 |
| 💚 azure-ucws linux | 4 | 15 | 316 | 957 | 5:09 |
| 💚 azure-ucws windows | 4 | 15 | 318 | 955 | 4:51 |
| 💚 gcp linux | 4 | 15 | 229 | 1043 | 3:51 |
| 💚 gcp windows | 4 | 15 | 231 | 1041 | 3:47 |

23 interesting tests: 13 SKIP, 10 RECOVERED

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 💚 TestAccept | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |
| 🙈 TestAccept/bundle/invariant/no_drift | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/permissions | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions | 💚 R | 💚 R | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions | 💚 R | 💚 R | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 🙈 TestAccept/bundle/resources/postgres_branches/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/recreate | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/replace_existing | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/update_protected | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/without_branch_id | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_projects/update_display_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/synced_database_tables/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/ssh/connection | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestFetchRepositoryInfoAPI_FromRepo | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |
| 💚 TestFetchRepositoryInfoAPI_FromRepo/root | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |
| 💚 TestFetchRepositoryInfoAPI_FromRepo/subdir | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |

Top 5 slowest tests (at least 2 minutes):

| duration | env | testname |
| --- | --- | --- |
| 3:05 | azure windows | TestAccept |
| 3:04 | aws windows | TestAccept |
| 3:02 | gcp windows | TestAccept |
| 3:00 | aws-ucws windows | TestAccept |
| 2:59 | azure-ucws windows | TestAccept |

@janniklasrose janniklasrose added this pull request to the merge queue Jul 3, 2026
Merged via the queue into main with commit b2e7e6f Jul 3, 2026
25 checks passed
@janniklasrose janniklasrose deleted the janniklasrose/generate-job-file-not-loading branch July 3, 2026 07:56