Spike: should the built-in compute drivers move to the external (out-of-process) model?
Spike/investigation spun out of the #1952 discussion. This poses the problem and the
questions to investigate — it deliberately proposes no concrete design. The output of
the spike is a recommendation on whether and in what order to pursue the migration,
which then feeds a design RFC if warranted.
Problem
Today the built-in compute drivers run two different ways:
In-process via the ComputeDriver trait: Docker, Podman, Kubernetes.
Out-of-process over compute_driver.proto: VM (gateway-spawned) and all third-party --compute-driver-socket drivers.
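Maintaining both paths has a recurring cost: because Docker is in-process, it keeps growing hooks that reach into gateway-local state with no proto equivalent, which then have to be unwound for alignment. Two are in flight right now:
gateway_bind_addresses(): injecting gateway listeners (see "refactor: align driver ownership of gateway callback listeners", #1952).
SupervisorReadiness: consulting gateway-local supervisor-session state for readiness (see "refactor: make sandbox readiness gateway-owned across compute drivers", #1951).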
This spike investigates whether migrating the built-in drivers onto the external model (the one VM and third-party drivers already use) is worthwhile, and what it would require.
Why it might be worth doing
Reduce the core team's development/maintenance burden: collapse the two driver integration paths into one, so cross-cutting gateway work is done once instead of twice.
Align with OpenShell Drivers #1051's uniform-driver requirement: third-party drivers must be out-of-process, so out-of-process is already the third-party model; moving the built-ins onto it gives one model for first and third parties.
Explicitly not a reason: deployment footprint (binary size / supply-chain). That is #1943's lane (conditional compilation), achievable at compile time without any of this.
What we already know (context, not conclusions)
VM is the precedent: in-tree, out-of-process, gateway-launched, and needs no gateway listeners — so "out-of-process" does not imply "untrusted" or "needs special networking."
Constraints already surfaced: the driver transport is UDS-only today (no networked transport for a different-host driver); Docker provisioning uses host bind mounts (same-host); Kubernetes is the existing cross-host story.
Questions to investigate
Is the migration worth doing now, or deferring? What concretely triggers it?
Out of scope for this spike
Concrete solution design (how callback reachability is implemented, and any specific mechanism) is deliberately not proposed here. That belongs to a design RFC after this spike concludes the migration is worth pursuing. (An earlier, more detailed exploration, including candidate mechanisms, is preserved separately in docker-external-driver-design-exploration.md for reference; it is not part of this spike.)