Agent Diagnostic
The test-release-canary skill was used to inspect Release Canary run 28430931447 and its failed Ubuntu Snap job.
Only the Ubuntu Snap job failed. The Snap installed successfully and snap services openshell reported openshell.gateway as enabled and active, but the gateway was not listening on 127.0.0.1:17670. Both openshell gateway add and openshell status observed a connection refusal.
The immediately preceding canary passed. The repository diff between that successful run at a5161d0b and the failed run at a2268060 only changes GPU E2E files; it does not modify the gateway, Snap packaging, or release workflow. An earlier canary run, 28265323284, failed with the same Snap connection-refused signature. This indicates an existing startup race rather than a regression in a2268060.
The current sequence installs the Snap, which starts the gateway daemon, and only afterward connects the Docker and other interfaces. The service can therefore start before Docker access is available. The workflow then checks status immediately after the connections complete. An unpublished repository branch contains commit 1cebef25, which restarted openshell.gateway after connecting interfaces and reported that this fixed a locally reproduced connection failure.
Description
Actual behavior: Installing the OpenShell Snap can start openshell.gateway before required Snap interfaces are connected. After the interfaces are connected, the service can remain reported as active while its API endpoint refuses connections, causing installation verification and the release canary to fail intermittently.
Expected behavior: The Snap should recover automatically when a late-connected interface becomes available. Users should only need to install the Snap and connect the interfaces they intend to use; they should not need a separate manual gateway restart or readiness loop.
The fix must not make gateway startup conditional on Docker. OpenShell can use compute drivers other than Docker. A Docker connection hook may restart the gateway when Docker access is added, but normal backend-neutral gateway startup must remain intact.
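A minimal sketch of what such a connection hook could look like. It assumes the plug is named docker (snapd would then look for a hook file named connect-plug-docker) and that the service app is named gateway. The file location and the function name restart_gateway_on_connect are illustrative and do not come from the repository. The hook reacts only to the connection event and leaves the service's normal start command untouched, so startup stays backend-neutral.

```shell
#!/bin/sh
# Hypothetical snap/hooks/connect-plug-docker
# (the file name assumes the plug is called "docker").
# Runs after the Docker interface is connected; it does not gate normal startup.

restart_gateway_on_connect() {
    # SNAP_INSTANCE_NAME is set by snapd when it runs a hook;
    # the "openshell" fallback is only for illustration.
    snap_name="${SNAP_INSTANCE_NAME:-openshell}"
    # Restart only the gateway service so it picks up the newly granted access.
    snapctl restart "${snap_name}.gateway"
}

# In the real hook file the last line would simply call the function:
# restart_gateway_on_connect
```

Keeping the restart in a hook rather than in the service's start command means a system that never connects Docker is unaffected.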
Reproduction Steps
- Install the Docker Snap.
- Install an OpenShell Snap artifact with sudo snap install <artifact>.snap --dangerous.
- Connect openshell:docker, log-observe, system-observe, and ssh-keys as performed by .github/workflows/release-canary.yml.
- Immediately register http://127.0.0.1:17670 and run openshell status.
- Observe that the Snap service may report active while port 17670 returns Connection refused.
The behavior is intermittent; see the linked failed jobs for captured reproductions.
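The steps above can be sketched as a shell function. Several details here are assumptions rather than facts from the workflow: the Docker slot name docker:docker-daemon, the argument form of openshell gateway add, and the function name reproduce_race. Check .github/workflows/release-canary.yml for the exact commands before relying on this.

```shell
#!/bin/sh
# Sketch of the reproduction sequence. The slot name docker:docker-daemon and the
# "openshell gateway add <url>" argument form are assumptions, not workflow facts.
reproduce_race() {
    artifact="$1"   # path to a locally available OpenShell .snap file

    sudo snap install docker
    sudo snap install "$artifact" --dangerous

    # Interfaces are connected only after the install has already started the gateway.
    sudo snap connect openshell:docker docker:docker-daemon
    for plug in log-observe system-observe ssh-keys; do
        sudo snap connect "openshell:${plug}"
    done

    # Query immediately, with no restart and no wait, to expose the race.
    snap services openshell
    openshell gateway add http://127.0.0.1:17670
    openshell status
}
```

Because the failure is intermittent, a single run that succeeds does not show the race is absent.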
Environment
- Runner: GitHub-hosted Ubuntu 24.04.4
- snapd: 2.75.2+ubuntu24.04
- Docker Snap: 29.3.1
- OpenShell: 0.0.73-dev.3+ga2268060
- Failed run commit: a2268060
Logs
Service            Startup  Current  Notes
openshell.gateway  enabled  active   -
Gateway is not reachable at http://127.0.0.1:17670
Error: × client error (Connect)
├─▶ tcp connect error
╰─▶ Connection refused (os error 111)
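For triaging future canary failures, the signature in these logs (service listed as active, client sees a connection refusal) can be checked mechanically. The following is a rough sketch that works on captured text only; the function name has_race_signature is hypothetical and nothing like it is known to exist in the repository.

```shell
#!/bin/sh
# Returns 0 when captured output shows the race signature: the service is listed
# as active, yet the client reported "Connection refused".
# Both arguments are captured text, so this runs without snapd installed.
has_race_signature() {
    services_output="$1"
    status_output="$2"

    # The third column of `snap services` output is the current state.
    printf '%s\n' "$services_output" |
        awk '$1 == "openshell.gateway" && $3 == "active" { found = 1 } END { exit !found }' ||
        return 1

    printf '%s\n' "$status_output" | grep -q 'Connection refused'
}
```

For example, has_race_signature "$(snap services openshell)" "$(openshell status 2>&1)" would succeed for the run captured above.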
Suggested Direction
- Add a Snap interface connection hook for Docker that restarts the gateway after Docker access becomes available.
- Keep default gateway startup backend-neutral; do not require Docker to start the service.
- Consider stopping or reloading the service appropriately when the Docker interface is disconnected.
- Add packaging tests that verify hook presence, executable mode, and expected service command.
- Keep the existing release canary flow unchanged so it validates the user-facing install and connect sequence.
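The packaging test suggested above could be as small as the following sketch. It assumes the hook is a shell script that calls snapctl restart; the function name check_hook and its arguments are illustrative, not an existing test helper.

```shell
#!/bin/sh
# check_hook PATH SERVICE: succeed only if the hook file exists, is executable,
# and contains a "snapctl restart" line that mentions SERVICE.
check_hook() {
    hook_path="$1"
    service="$2"

    [ -f "$hook_path" ] || { echo "missing hook: $hook_path" >&2; return 1; }
    [ -x "$hook_path" ] || { echo "hook not executable: $hook_path" >&2; return 1; }

    # Fixed-string matching avoids treating dots in the service name as wildcards.
    grep -F "snapctl restart" "$hook_path" | grep -Fq "$service" ||
        { echo "hook does not restart ${service}" >&2; return 1; }
}
```

A test would call it with the path of the hook inside the unpacked Snap and the service name, for example check_hook <unpacked>/meta/hooks/connect-plug-docker gateway, where the hook path is again an assumption about the layout.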
Definition of Done
snap restart command.