Remove dead public headful payloads#275
Merged
Merged
Conversation
sjmiller609
approved these changes
Jun 26, 2026
IlyaasK
added a commit
that referenced
this pull request
Jun 26, 2026
## Summary Port the public-safe parts of the merged private server e2e speedup/stability work into one public PR. What changed: - split the public server workflow into `test-server-unit` and `test-server-e2e` - keep Go build caching on the unit job only and install Playwright dependencies explicitly before e2e - force Docker-backed e2e container teardown instead of waiting for graceful test-container stop - add e2e container lifecycle timing logs for start/stop/readiness profiling - replace fixed sleeps in enterprise extension and MV3 tests with bounded condition waits - make Playwright daemon recovery wait on DevTools + execution recovery instead of a fixed sleep - isolate `cdpmonitor` subtests so they do not share monitor state - parallelize public-safe configure, restart, and recording e2e cases - replace external `example.com` / `google.com` transfer setup with deterministic browser profile fixture state - run recording audio analysis inside the already-running test container instead of extra short-lived Docker runs ## Public/private scope This intentionally ports only changes whose target files/tests already exist in the public repo. Skipped private-only work: - private persistence e2e consolidation / cleanup speedups touched `server/e2e/e2e_persist_login_test.go`, which does not exist in public - no private-only tests or private image names are added here ## Why The private test-speedup stack is merged and proved useful for separating real image failures from unrelated e2e flakes. Public currently has the same broad server e2e shape and several of the same tests, so this PR brings the public suite up to parity without changing production image/API behavior. This follows the same testfast/startfast direction as the private work: - remove fixed sleeps where the test can wait for a real condition - keep assertions strict and failure-readable - reduce short-lived container churn in tests - separate unit and e2e workflow work without creating special one-test shards - keep profiling data visible in normal e2e logs ## Timing Baseline is latest successful public `main` server workflow before this PR: run `28205858762` on `5af656d`. After is this PR's successful server workflow: run `28245446048` on `bda8701`. | metric | before | after | delta | | --- | ---: | ---: | ---: | | server workflow wall | 17m33s | 7m29s | -10m04s | | server test critical job wall | 11m16s single `test` job | 4m38s `test-server-e2e` job | -6m38s | | e2e package runtime | 522.015s | 203.792s | -318.223s | | unit job availability | blocked behind image builds + e2e in single job | 1m52s standalone `test-server-unit` | earlier signal | | enterprise extension e2e | ~90.8s per image | 27.1s headless / 29.5s headful | ~61s faster per image | | configure powerset e2e | 98.9s | longest case 14.4s, suite 14.4s wall | parallelized cases | | restart timing e2e | 118.4s | 37.2s headful / 36.2s headless wall | parallelized image cases | | transfer fixture tests | 16.9s / 17.4s / 19.6s | 10.3s / 10.4s / 10.9s | deterministic fixture | | recording audio test | 23.1s | 17.5s | no extra analysis containers | Image build timings are shown in validation but should not be treated as the primary test-speedup signal because Docker cache state varies by run. The most comparable signal is the e2e package runtime and the server test critical path. ## Validation Local: - `git diff --check origin/main...HEAD` - `yq '.' .github/workflows/server-test.yaml >/dev/null` - `cd server && go test ./e2e -run 'TestExtensionDownloadObservedRequiresUpdateXMLAndCRX|TestExtensionDownloadLogSince' -count=1` - `cd server && go test ./lib/cdpmonitor -run TestBindingAndTimeline -count=1` - `cd server && go test ./e2e -run TestDoesNotExist -count=0` - `cd server && make test-unit` - `cd server/e2e/playwright && corepack pnpm install --frozen-lockfile` - `cd server/e2e/playwright && corepack pnpm exec tsx index.ts` (usage/parse check) - `~/.agents/skills/autoreview/scripts/autoreview --mode local --prompt-file /tmp/public-e2e-speedups-review-notes.md` clean GitHub Actions on this PR: - run `28245446048`: passed - `build-headful / docker`: 1m15s - `build-headless / docker`: 2m49s - `test-server-unit`: 1m52s - `test-server-e2e`: 4m38s - e2e package runtime: 203.792s - launcher `test`: passed in 21s - scan and Socket: passed Notes: - this package does not include `typescript`, so there is no local `tsc --noEmit` gate to run ## Review notes - test/workflow-only; no runtime image Dockerfiles or production API handlers changed - no test assertions are loosened; sleeps become bounded waits on the same expected outcomes - public image cleanup PRs #274 and #275 already have green checks and Cursor Bugbot passing; they remain blocked only on review/merge policy <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > Test and CI workflow changes only; assertions stay strict and production runtime code is untouched. Main residual risk is heavier parallel e2e load on CI runners, which this PR already exercised successfully. > > **Overview** > Ports public-safe **server e2e speed and stability** work: faster teardown, less fixed sleeping, clearer CI split, and more parallel safe tests—without changing production images or API handlers. > > **CI (`server-test.yaml`)** adds workflow **concurrency** (cancel in-progress), splits the old combined job into **`test-server-unit`** (`make test-unit`, Go cache on) and **`test-server-e2e`** (`make test-e2e` after image builds, explicit **pnpm** PATH + Playwright install, Go cache off). > > **E2e harness** forces Docker container stop with **`StopTimeout(0)`** and logs **`[e2e-timing]`** for start/stop and readiness phases. > > **Flake reduction** replaces fixed sleeps with **bounded polls**: enterprise extension (policy, download logs, `chrome://extensions`), Playwright daemon recovery after Chromium restart (**`WaitDevTools`** + retry), and MV3 service-worker checks in Playwright (**`waitForFunction`**). > > **Other test changes**: **`t.Parallel()`** on selected configure/restart/recording cases; zip transfer bench uses a **deterministic S3 fixture** instead of external sites; recording audio analysis runs **ffmpeg/ffprobe inside the running container**; **`cdpmonitor`** binding/timeline subtests each get an isolated monitor via **`withMonitor`**. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit bda8701. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This is the public-repo equivalent of the private dead-payload cleanup PR, with the additional public-only tracked
server/apibinary removed.What changed:
images/chromium-headful/image-chromium/, including old demo HTML, Streamlit config, static-content files, tint2 config, and legacy startup scripts.COPY images/chromium-headful/image-chromium/ /from the public headful Dockerfile.server/apibinary artifact.server/apito.dockerignoreso local rebuilt API binaries do not get sent in Docker build context.Why
The removed
image-chromiumdirectory was an old payload copied directly into/during the headful image build. The current image no longer uses that legacy startup/demo path, and the broad root copy makes it easy for unrelated files underimage-chromiumto silently land in the final runtime image.The tracked
server/apibinary is a local build artifact. The Dockerfile compiles/copies the runtime API binary from the build stages; it does not need a checked-in executable underserver/api. Keeping it tracked increases repository size and can invalidate Docker build context/layers when the binary changes.Git History / Removal Rationale
entrypoint.sh,start_all.sh,xvfb_startup.sh,mutter_startup.sh,tint2_startup.sh5c71470, PR #13) from the old headful demo image. These scripts manually started Xvfb, tint2, mutter, x11vnc/noVNC, then launched a demo server.http_server.py,index.html,static_content/index.html,.streamlit/config.toml/only preserves stale files that are not part of the current browser/session path..config/tint2/*COPY images/chromium-headful/image-chromium/ /server/api0fba5a0(PR #148), alongside smooth mouse movement source changes. A later PR (9816e34, PR #164) addedserver/apito.gitignore, which strongly suggests it was recognized as a local build artifact but was already tracked..dockerignorecoverage so local rebuilds do not pollute Docker context.Validation
Ran locally after merging current
maininto this branch:git diff --check origin/main...HEADDOCKER_BUILDKIT=1 docker build --check -f images/chromium-headful/Dockerfile .cd server && go test ./e2e -run TestDoesNotExist -count=0~/.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/mainAutoreview result:
GitHub Actions on the current-main refresh:
Unknowns / Final Gates
server/apibinary is removed from git, but local developers can still build a localserver/api;.gitignoreand now.dockerignorekeep it out of commits and Docker build context.Fast Docker Review
This PR follows the fast-build guidance by removing dead build context and a broad root copy from the headful image.
Against the checklist:
image-chromiumroot payload copy is removed, so files cannot silently land in/just because they sit under a legacy directory.server/apibinary is removed and added to.dockerignore, so local binary rebuilds do not invalidate image build context or layers.Note
Low Risk
Dead files and a broad root COPY are removed; the active supervised headful image path is unchanged and CI validated the Dockerfile build.
Overview
Removes legacy Computer Use Demo payload from the public headful image and stops shipping a checked-in API binary.
The entire
images/chromium-headful/image-chromium/tree is deleted (Streamlit/static HTML, tint2 panel config, Xvfb/tint2/mutter startup scripts, and related demo entrypoints). The headful Dockerfile no longer doesCOPY images/chromium-headful/image-chromium/ /; runtime files continue to be copied explicitly (Neko, supervisor, Envoy, built API fromserver-builder, etc.).The tracked
server/apiexecutable is removed from the repo, andserver/apiis added to.dockerignoreso local rebuilds do not bloat Docker build context or invalidate layers.Reviewed by Cursor Bugbot for commit fb7e962. Bugbot is set up for automated code reviews on this repo. Configure here.