AIR CLI Integration: fast air list via active run index #5814

Open
riddhibhagwat-db wants to merge 1 commit into air-list-pagination from air-list-fast

@riddhibhagwat-db


Changes

  1. Default to active-only; add --all-status (replaces --active). Plain air list previously scanned every run of every state through Jobs runs/list. It now lists only active runs by default; --all-status opts into all states.

  2. AiTrainingService index fast path for --all-status scoped to yourself. Instead of scanning the Jobs firehose, the command fetches cheap (job_run_id, submit_time) pairs from GET /api/2.0/ai-training/workflows, orders them by submit time, keeps the newest --limit, and hydrates only those via concurrent Jobs runs/get calls. Hydration is ACL-enforced per run: a 403 or 404 drops the id, and any other error propagates. If the index is unavailable, the command silently falls back to the Jobs scan, so an index outage never hard-fails it. --all-users and other-user filters always use the scan, because the index is per-user only.

  3. Terminal-run cache. Terminal runs are immutable, so once a run is hydrated its row is cached on disk (libs/cache, 60-day TTL); repeat --all-status calls skip runs/get + get-output + MLflow for those ids.
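
The caching rule in item 3 can be sketched as follows. This is a minimal illustration, not the PR's code: it uses an in-memory map in place of the on-disk cache, and `runRow`, `rowCache`, `hydrateAll`, and the `fetch` callback are hypothetical names. The only behavior it tries to show is that terminal rows are stored and reused while active rows are always refetched.

```go
package main

import (
	"fmt"
	"time"
)

// terminalRunTTL mirrors the 60-day TTL mentioned in the commit message.
const terminalRunTTL = 60 * 24 * time.Hour

// runRow is a hypothetical, already-hydrated table row.
type runRow struct {
	RunID    int64
	Terminal bool // true once the run can no longer change
}

type cacheEntry struct {
	row      runRow
	storedAt time.Time
}

// rowCache is an in-memory stand-in for the on-disk cache (assumption).
type rowCache struct {
	ttl     time.Duration
	now     func() time.Time
	entries map[int64]cacheEntry
}

func newRowCache(ttl time.Duration) *rowCache {
	return &rowCache{ttl: ttl, now: time.Now, entries: make(map[int64]cacheEntry)}
}

// get returns a cached row if one exists and has not expired.
func (c *rowCache) get(id int64) (runRow, bool) {
	e, ok := c.entries[id]
	if !ok || c.now().Sub(e.storedAt) > c.ttl {
		return runRow{}, false
	}
	return e.row, true
}

// put stores only terminal rows; active rows can still change.
func (c *rowCache) put(row runRow) {
	if row.Terminal {
		c.entries[row.RunID] = cacheEntry{row: row, storedAt: c.now()}
	}
}

// hydrateAll returns rows for ids, calling fetch only on cache misses.
// The second return value is the number of fetch calls made.
func hydrateAll(c *rowCache, ids []int64, fetch func(int64) runRow) ([]runRow, int) {
	rows := make([]runRow, 0, len(ids))
	fetches := 0
	for _, id := range ids {
		if row, ok := c.get(id); ok {
			rows = append(rows, row)
			continue
		}
		row := fetch(id)
		fetches++
		c.put(row)
		rows = append(rows, row)
	}
	return rows, fetches
}

func main() {
	cache := newRowCache(terminalRunTTL)
	// Toy fetcher: even ids are terminal, odd ids are still active.
	fetch := func(id int64) runRow {
		return runRow{RunID: id, Terminal: id%2 == 0}
	}
	_, first := hydrateAll(cache, []int64{1, 2, 3, 4}, fetch)
	_, second := hydrateAll(cache, []int64{1, 2, 3, 4}, fetch)
	fmt.Println(first, second) // prints "4 2": only the active runs are refetched
}
```

In this sketch the second call still fetches the two active runs, which matches the description: only ids whose rows are terminal skip the network.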

The runFetcher now wraps a listStrategy (jobsScanStrategy | indexStrategy) behind the same next(want)/exhausted contract, so the interactive table, JSON, and one-shot output paths are unchanged.
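
One way to picture that contract is the sketch below. The method signatures, the `run` type, `sliceStrategy`, and `collect` are assumptions made for illustration; the PR text only names `listStrategy`, `runFetcher`, and the `next(want)`/`exhausted` pair.

```go
package main

import "fmt"

// run is a hypothetical stand-in for a listed run.
type run struct{ ID int64 }

// listStrategy is an assumed shape for the contract: next returns up to
// want more runs, exhausted reports whether nothing is left to return.
type listStrategy interface {
	next(want int) ([]run, error)
	exhausted() bool
}

// sliceStrategy serves runs from a fixed slice. It stands in for either the
// Jobs scan or the index-backed strategy in this sketch.
type sliceStrategy struct {
	runs []run
	pos  int
}

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func (s *sliceStrategy) next(want int) ([]run, error) {
	end := minInt(s.pos+want, len(s.runs))
	page := s.runs[s.pos:end]
	s.pos = end
	return page, nil
}

func (s *sliceStrategy) exhausted() bool { return s.pos >= len(s.runs) }

// runFetcher depends only on the interface, so output code that calls it
// does not need to know which strategy is plugged in.
type runFetcher struct{ strategy listStrategy }

// collect pulls pages until limit runs are gathered or the strategy is
// exhausted. An empty page also stops the loop to avoid spinning.
func (f *runFetcher) collect(limit, pageSize int) ([]run, error) {
	var out []run
	for len(out) < limit && !f.strategy.exhausted() {
		page, err := f.strategy.next(minInt(pageSize, limit-len(out)))
		if err != nil {
			return nil, err
		}
		if len(page) == 0 {
			break
		}
		out = append(out, page...)
	}
	return out, nil
}

func main() {
	f := &runFetcher{strategy: &sliceStrategy{runs: []run{{1}, {2}, {3}, {4}, {5}}}}
	got, err := f.collect(3, 2)
	fmt.Println(len(got), err) // prints "3 <nil>"
}
```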

This PR also fixes a pre-existing recvcheck lint failure and a latent stale-loading guard bug in list_tui.go by converting listModel's fetch helpers to value receivers. Separately, the index path over-fetches (it skips the newest-N truncation) when a --filter on task fields is active, so that filtered-out runs cannot shrink the result below --limit.
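
The newest-N selection and the over-fetch exception can be sketched together. This is an illustration under stated assumptions: `indexEntry`, `selectCandidates`, and the `taskFilterActive` parameter are hypothetical names, and treating a non-positive limit as "no limit" is a choice made only for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry mirrors the (job_run_id, submit_time) pairs from the index;
// the Go field names are hypothetical.
type indexEntry struct {
	JobRunID     int64
	SubmitTimeMs int64
}

// selectCandidates orders entries newest first. It truncates to limit
// unless a task-field filter is active, in which case every entry is kept
// so that filtering after hydration can still fill the requested limit.
// A non-positive limit is treated as "no limit" in this sketch.
func selectCandidates(entries []indexEntry, limit int, taskFilterActive bool) []indexEntry {
	sorted := make([]indexEntry, len(entries))
	copy(sorted, entries) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].SubmitTimeMs > sorted[j].SubmitTimeMs
	})
	if taskFilterActive || limit <= 0 || limit >= len(sorted) {
		return sorted
	}
	return sorted[:limit]
}

func main() {
	entries := []indexEntry{{10, 100}, {11, 300}, {12, 200}}
	for _, e := range selectCandidates(entries, 2, false) {
		fmt.Println(e.JobRunID) // prints 11, then 12
	}
	fmt.Println(len(selectCandidates(entries, 2, true))) // prints 3
}
```

The trade-off is that a filtered listing hydrates more runs than --limit asks for, in exchange for not returning a short result when some of the newest runs fail the filter.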

Why

air list in the Go CLI was noticeably slower than the Python AIR CLI, in both plain and --limit modes. The Python CLI's speed comes from three architectural choices (an active-only default, the per-user index fast path, and the terminal-run cache), not from a faster scan; this PR ports all three to reach parity.

Testing

  • Unit tests: index ordering/limit, 403/404-drop vs 500-propagate, parseSubmitTimeMs, cache-hit-skips-network, gate routing, silent fallback, filter over-fetch.
  • Acceptance: --all-status end-to-end (index → runs/get → get-output) renders fully populated columns.
  • gofmt, lint-q (0 issues), full air acceptance suite green.

…nal-run cache

Ports the Python AIR CLI's fast `air list`. Three changes, matching Python:

- Default to active-only; add `--all-status` (replaces `--active`). Plain
  `air list` no longer scans every run of every state, which was the main
  reason it lagged the Python CLI.

- For `--all-status` scoped to yourself, fetch cheap (job_run_id, submit_time)
  pairs from the AiTrainingService index (/api/2.0/ai-training/workflows),
  order by submit time, keep the newest --limit, and hydrate only those via
  Jobs runs/get (concurrent, per-run ACL-enforced: 403/404 drops the id, other
  errors propagate). If the index is unavailable, silently fall back to the
  Jobs scan. --all-users and other-user filters always use the scan.

- Cache hydrated terminal runs on disk (libs/cache, 60-day TTL) so repeat
  --all-status calls skip runs/get + get-output + MLflow for those ids.
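
The hydration rule in the second bullet (403/404 drops the id, other errors propagate) could look roughly like this. It is a sketch, not the PR's implementation: `apiError`, `isDroppable`, `hydrate`, and the `get` callback are hypothetical, and it starts one goroutine per id where real code would likely bound concurrency.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"sync"
)

// apiError is a hypothetical error type carrying an HTTP status code.
type apiError struct{ StatusCode int }

func (e *apiError) Error() string { return fmt.Sprintf("http %d", e.StatusCode) }

// isDroppable reports whether err means the caller cannot see the run.
func isDroppable(err error) bool {
	var ae *apiError
	if !errors.As(err, &ae) {
		return false
	}
	return ae.StatusCode == http.StatusForbidden || ae.StatusCode == http.StatusNotFound
}

// hydrate calls get for every id concurrently and returns rows in input
// order. Ids that fail with 403/404 are dropped; the first other error
// (in input order) is returned to the caller.
func hydrate(ids []int64, get func(int64) (string, error)) ([]string, error) {
	type result struct {
		row string
		err error
	}
	results := make([]result, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id int64) {
			defer wg.Done()
			row, err := get(id)
			results[i] = result{row: row, err: err} // each goroutine owns one slot
		}(i, id)
	}
	wg.Wait()

	rows := make([]string, 0, len(ids))
	for _, r := range results {
		switch {
		case r.err == nil:
			rows = append(rows, r.row)
		case isDroppable(r.err):
			continue
		default:
			return nil, r.err
		}
	}
	return rows, nil
}

func main() {
	get := func(id int64) (string, error) {
		if id == 2 {
			return "", &apiError{StatusCode: http.StatusForbidden}
		}
		return fmt.Sprintf("run-%d", id), nil
	}
	rows, err := hydrate([]int64{1, 2, 3}, get)
	fmt.Println(rows, err) // prints "[run-1 run-3] <nil>"
}
```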

The runFetcher now wraps a listStrategy (jobsScanStrategy | indexStrategy)
behind the same next(want)/exhausted contract, so the TUI, JSON, and one-shot
paths are unchanged. Also converts listModel's fetch helpers to value
receivers (fixes a recvcheck lint and a stale-`loading` guard bug).

Co-authored-by: Isaac
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

Approval status: pending

/acceptance/experimental/air/ - needs approval

4 files changed
Eligible: @apeforest, @bfontain, @lu-wang-dl, @panchalhp-db, @vinchenzo-db, @maggiewang-db, @ben-hansen-db, @pardis-beikzadeh-db

/experimental/air/ - needs approval

11 files changed
Eligible: @apeforest, @bfontain, @lu-wang-dl, @panchalhp-db, @vinchenzo-db, @maggiewang-db, @ben-hansen-db, @pardis-beikzadeh-db

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

@eng-dev-ecosystem-bot

Collaborator

Integration test report

Commit: 7a64fc1

Run: 28624266049

| Env | 🔄 flaky | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
|---|---|---|---|---|---|---|
| 💚 aws linux | | 10 | 13 | 261 | 1016 | 6:51 |
| 💚 aws windows | | 10 | 13 | 263 | 1014 | 5:50 |
| 💚 aws-ucws linux | | 10 | 13 | 357 | 930 | 5:54 |
| 💚 aws-ucws windows | | 10 | 13 | 359 | 928 | 6:26 |
| 💚 azure linux | | 4 | 15 | 264 | 1014 | 7:27 |
| 💚 azure windows | | 4 | 15 | 266 | 1012 | 5:30 |
| 💚 azure-ucws linux | | 4 | 15 | 362 | 926 | 8:03 |
| 🔄 azure-ucws windows | 3 | 3 | 15 | 362 | 924 | 7:23 |
| 💚 gcp linux | | 4 | 15 | 260 | 1017 | 7:07 |
| 💚 gcp windows | | 4 | 15 | 262 | 1015 | 7:03 |

25 interesting tests: 13 SKIP, 9 RECOVERED, 3 flaky

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
|---|---|---|---|---|---|---|---|---|---|---|
| 🔄 TestAccept | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 🔄f | 💚R | 💚R |
| 🔄 TestAccept/bundle/generate/auto-bind | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | 🔄f | ✅p | ✅p |
| 🔄 TestAccept/bundle/generate/auto-bind/DATABRICKS_BUNDLE_ENGINE=terraform | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | 🔄f | ✅p | ✅p |
| 🙈 TestAccept/bundle/invariant/no_drift | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/permissions | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions | 💚R | 💚R | 💚R | 💚R | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚R | 💚R | 💚R | 💚R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚R | 💚R | 💚R | 💚R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions | 💚R | 💚R | 💚R | 💚R | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚R | 💚R | 💚R | 💚R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚R | 💚R | 💚R | 💚R | | | | | | |
| 🙈 TestAccept/bundle/resources/postgres_branches/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/recreate | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/replace_existing | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/update_protected | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/without_branch_id | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_projects/update_display_name | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/synced_database_tables/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/ssh/connection | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 💚 TestFetchRepositoryInfoAPI_FromRepo | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R |
| 💚 TestFetchRepositoryInfoAPI_FromRepo/root | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R |
| 💚 TestFetchRepositoryInfoAPI_FromRepo/subdir | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R |

Top 25 slowest tests (at least 2 minutes):

| duration | env | testname |
|---|---|---|
| 4:48 | gcp windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 4:18 | gcp linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 4:18 | gcp windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 4:06 | gcp linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 4:06 | azure-ucws linux | TestSecretsPutSecretStringValue |
| 3:58 | azure linux | TestSecretsPutSecretStringValue |
| 3:53 | azure-ucws windows | TestSecretsPutSecretStringValue |
| 3:38 | gcp linux | TestSecretsPutSecretStringValue |
| 3:21 | aws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 3:13 | aws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 3:09 | azure-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:05 | aws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:03 | aws-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:03 | azure linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:02 | aws-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:01 | azure-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 3:01 | aws linux | TestSecretsPutSecretStringValue |
| 2:56 | aws-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:54 | azure windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:47 | aws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:44 | azure windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform |
| 2:43 | azure-ucws windows | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:40 | azure linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:30 | azure-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
| 2:22 | aws-ucws linux | TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct |
