Add fork identity wait plumbing #298
| // services use `restart` so the same code path works for boot (start a | ||
| // stopped service) and post-fork (stop+start to force a re-read of | ||
| // refreshed envs). | ||
| if !waitForForkIdentityIfEnabled(startupCtx, forkIdentityWait) { |
Early POST loses payload
High Severity
With fork identity wait enabled, kernel-images-api starts early and can accept and persist a fork identity payload. The wrapper then deletes the payload file, leading to a race where an injected payload is immediately removed, causing injections to fail and boot to stall until retried.
Reviewed by Cursor Bugbot for commit 252875c. Configure here.
| restartAll("kernel-images-api") | ||
| if !forkIdentityWait { | ||
| restartAll("kernel-images-api") | ||
| } |
API keeps pre-identity env
Medium Severity
In fork identity wait mode, kernel-images-api starts before identity is applied and isn't restarted. Since it loads configuration like S2 basin, token, and stream only at process start, fork-injected identity environment variables are not picked up. This can result in stale or disabled S2 telemetry streams.
Reviewed by Cursor Bugbot for commit 252875c. Configure here.
| w.WriteHeader(http.StatusAccepted) | ||
| return | ||
| } | ||
| } |
Stale config without wait mode
Low Severity
When fork-identity wait is disabled, GET /internal/fork-identity/config returns 200 with JSON whenever fork-identity.json exists, without checking the applied marker. A leftover payload from a snapshot or prior run can expose the wrong instance or metro URL to consumers that treat 200 as authoritative.
Reviewed by Cursor Bugbot for commit 252875c. Configure here.
hiroTamada
left a comment
approved — reviewed as opt-in plumbing.
everything is gated behind KERNEL_FORK_IDENTITY_WAIT. with it unset the boot path is unchanged: the if forkIdentityWait branches are skipped, kernel-images-api is still restarted in the identity phase, and the two new internal routes answer 409/404. traced the default path end to end — WaitEnabled() returns (false, nil) on empty env, so there's no error/fatal and no wait. the forkidentity lib is self-contained and unit-tested.
one non-blocking nit:
forkidentity/payload.go WaitAppliedMarker and wrapper/fork_identity.go waitForForkIdentityPayload busy-spin with runtime.Gosched() and no sleep, so they peg a core for up to the 30s timeout. the other wrapper wait loops (waitForSocket, waitForHTTPProbe) use time.Sleep(20ms); worth matching for consistency.
Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
Reviewed by Cursor Bugbot for commit d905fac. Configure here.
| return nil, ctx.Err() | ||
| case <-time.After(20 * time.Millisecond): | ||
| } | ||
| } |
No timeout waiting payload
Medium Severity
waitForForkIdentityPayload polls forever until a payload file appears or startupCtx is canceled. Unlike the POST handler’s 30-second apply wait, a missing or failed host injection leaves the wrapper stuck in startup with no automatic failure.
Reviewed by Cursor Bugbot for commit d905fac. Configure here.


Summary
KERNEL_FORK_IDENTITY_WAIT=true
POST /internal/fork-identity
GET /internal/fork-identity/config
forkidentity payload/env/path helpers

Default Behavior
No behavior changes unless KERNEL_FORK_IDENTITY_WAIT=true is set.
Without that env:

Tests
go test ./cmd/api ./cmd/wrapper ./lib/forkidentity -count=1
git diff --check

Note
Medium Risk
Changes VM boot ordering and accepts host-injected identity secrets (JWT, URLs) on internal HTTP endpoints; risk is mitigated because the feature is off unless KERNEL_FORK_IDENTITY_WAIT is enabled.

Overview
Adds optional fork identity wait mode (KERNEL_FORK_IDENTITY_WAIT=true) so forked snapshot restores can keep Chrome and CDP warm while the control plane injects per-instance identity.

A new forkidentity library coordinates payload files under /run/kernel/, maps JSON fields to process env (including JWT and metro URLs), and exposes extension-facing config. Internal API routes (outside OpenAPI): POST /internal/fork-identity accepts a bounded JSON payload, writes it, and blocks until the wrapper marks identity applied; GET /internal/fork-identity/config returns 202 while waiting and JSON instanceName/metroApiUrl once applied.

The wrapper replaces the prior FORK HOOK placeholder: in wait mode it starts kernel-images-api after Chromium is up, probes public CDP, stops envoy and polls for the payload file, applies env via Setenv/Unsetenv, writes the applied marker, then runs init-envoy without restarting kernel-images-api so CDP sessions stay connected. Shutdown cancels the startup wait via context.

Default behavior is unchanged when the wait env is unset.
Reviewed by Cursor Bugbot for commit d905fac. Bugbot is set up for automated code reviews on this repo. Configure here.