supervisor v0.0.73 crashes in rootless Podman: drop_capability_bounding_set() EPERM with non-empty bounding set #2069

@waynesun09

Agent Diagnostic

  • Investigated the outage triggered by supervisor :latest re-tagging to v0.0.73 on 2026-06-30
  • Traced the crash to drop_capability_bounding_set() in crates/openshell-supervisor-process/src/process.rs, introduced by PR #2001 (fix(supervisor): drop sandbox child capability bounding set)
  • Ran a diagnostic GitHub Actions workflow on ubuntu-24.04 runners testing supervisor versions 0.0.63, 0.0.72, 0.0.73, and :latest
  • Confirmed apparmor_restrict_unprivileged_userns=1 is active on GHA runners — AppArmor denies CAP_SETPCAP operations inside rootless Podman user namespaces
  • Verified via dmesg audit log: apparmor="DENIED" operation="capable" profile="unprivileged_userns" capability=21 capname="sys_admin"
  • Performed A/B test: supervisor 0.0.63 and 0.0.72 create sandboxes successfully; 0.0.73 crashes during sandbox creation
  • Confirmed the same crash on macOS + Podman machine (Fedora 41 CoreOS aarch64)
  • Reviewed the comments on PR #2001 (fix(supervisor): drop sandbox child capability bounding set): a reviewer explicitly warned that "this still fails closed incorrectly for the Podman path" and that "the regression test skips when CAP_SETPCAP is unavailable, so it would not catch the Podman-relevant failure mode"
  • Did not use the repo's .agents/skills/ skills (investigation was done from a downstream consumer's perspective using GHA diagnostics, gh CLI, and code review of the relevant crates)
  • Agent could not resolve this — the fix requires changes to validate_capability_bounding_set_clear() in the supervisor crate

Description

OpenShell v0.0.73 supervisor crashes during sandbox creation in rootless Podman on hosts where AppArmor restricts unprivileged user namespaces (apparmor_restrict_unprivileged_userns=1, the default on Ubuntu 24.04).

PR #2001 added drop_capability_bounding_set() which calls capctl::caps::bounding::clear(), requiring effective CAP_SETPCAP. PR #2001 also added SETPCAP to the Podman driver's cap_add to provide this capability. However, on Ubuntu 24.04 with apparmor_restrict_unprivileged_userns=1, AppArmor transitions processes entering user namespaces (which rootless Podman creates) into the unprivileged_userns profile. This profile denies capability operations at the kernel level — so bounding::clear() returns EPERM even though Podman granted SETPCAP.
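
To see what the bounding set and effective set look like from inside the container, the capability masks can be read from /proc/self/status. The following is a minimal sketch using only the Rust standard library; the function names (parse_cap_mask, has_cap, current_masks) are hypothetical and are not part of the supervisor crate. Note that these masks only show what the kernel's capability model grants. An LSM denial such as the AppArmor one described above is not visible here, which is consistent with SETPCAP appearing granted while bounding::clear() still fails.

```rust
// Capability numbers from <linux/capability.h>.
const CAP_SETPCAP: u32 = 8;
const CAP_NET_ADMIN: u32 = 12;
const CAP_SYS_ADMIN: u32 = 21;

/// Extract a hex capability mask such as `CapBnd` or `CapEff` from the text
/// of /proc/<pid>/status. Returns None if the field is missing or malformed.
fn parse_cap_mask(status: &str, field: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        let rest = line.strip_prefix(field)?.strip_prefix(':')?;
        u64::from_str_radix(rest.trim(), 16).ok()
    })
}

/// True when capability number `cap` is set in `mask`.
fn has_cap(mask: u64, cap: u32) -> bool {
    mask & (1u64 << cap) != 0
}

/// Read (effective, bounding) masks for the current process. Linux only;
/// returns None when /proc/self/status is unavailable or unparsable.
fn current_masks() -> Option<(u64, u64)> {
    let status = std::fs::read_to_string("/proc/self/status").ok()?;
    Some((
        parse_cap_mask(&status, "CapEff")?,
        parse_cap_mask(&status, "CapBnd")?,
    ))
}
```

A non-zero CapBnd value together with has_cap(eff, CAP_SETPCAP) returning true would match the situation described in this issue: the capability is present on paper, yet the clear operation is denied.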

The fallback in validate_capability_bounding_set_clear() distinguishes three cases:

  • Ok(()) + empty set → success
  • EPERM + empty set → tolerated (set already clear)
  • EPERM + non-empty set → fatal error ← this is the case rootless Podman hits, and it has no non-fatal path

In rootless Podman, the bounding set retains SYS_ADMIN, NET_ADMIN, SETPCAP, etc. from --cap-add, so the third branch fires and the supervisor exits.
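
The three cases listed above can be modelled as a small decision table. This is an illustrative sketch only, not the actual code in the supervisor crate: the types ClearAttempt and Verdict and the function classify are hypothetical, and the handling of "Ok + non-empty set" is an assumption, since the issue does not describe that combination.

```rust
/// What the attempt to clear the bounding set returned (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClearAttempt {
    Ok,
    Eperm,
}

/// How the result is treated (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Cleared,
    AlreadyClear,
    Fatal,
}

/// Classify the outcome from the clear result and the bounding-set mask
/// observed afterwards (0 means the set is empty).
fn classify(attempt: ClearAttempt, bounding_mask: u64) -> Verdict {
    match (attempt, bounding_mask == 0) {
        (ClearAttempt::Ok, true) => Verdict::Cleared,
        (ClearAttempt::Eperm, true) => Verdict::AlreadyClear,
        // The rootless Podman case: EPERM while capabilities remain.
        (ClearAttempt::Eperm, false) => Verdict::Fatal,
        // Assumption: not described in the issue; treated as fatal here
        // because the set was reported cleared but is not empty.
        (ClearAttempt::Ok, false) => Verdict::Fatal,
    }
}
```

With a mask that still contains SETPCAP, NET_ADMIN, and SYS_ADMIN (0x201100), classify(ClearAttempt::Eperm, 0x201100) yields Verdict::Fatal, which corresponds to the supervisor exiting.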

This is distinct from #2068 (:latest pinning). #2068 addresses which version gets pulled; this issue addresses a crash bug in v0.0.73 that must be fixed for the version to work in rootless Podman environments.

Reproduction Steps

On any Ubuntu 24.04 host or GitHub Actions ubuntu-24.04 runner:

  1. Run a sandbox with supervisor v0.0.72 (pre-PR-2001) — succeeds:

    # Configure gateway.toml with:
    # supervisor_image = "ghcr.io/nvidia/openshell/supervisor:0.0.72"
    openshell sandbox create --from base
    # → sandbox Ready

  2. Run a sandbox with supervisor v0.0.73 (post-PR-2001) — crashes:

    # Configure gateway.toml with:
    # supervisor_image = "ghcr.io/nvidia/openshell/supervisor:0.0.73"
    openshell sandbox create --from base
    # → "sandbox is not ready" — supervisor exits with EPERM during drop_privileges()

The crash occurs during sandbox creation when drop_privileges() calls drop_capability_bounding_set() for a child process — not at startup or --version.

Environment

  • GitHub Actions ubuntu-24.04 runner (image 20260622.220.1, kernel 6.17.0-1018-azure)
  • apparmor_restrict_unprivileged_userns = 1 (Ubuntu 24.04 default)
  • Podman 4.9.3 (rootless)
  • Also reproduced on macOS + Podman machine (Fedora 41 CoreOS aarch64)
  • Supervisor image: ghcr.io/nvidia/openshell/supervisor:0.0.73 (= :latest as of 2026-06-30T15:31Z)
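
To check whether a host has the AppArmor restriction enabled, the sysctl can be read through procfs. The sketch below assumes the standard mapping of kernel.apparmor_restrict_unprivileged_userns to a file under /proc/sys/kernel; the function names are hypothetical. On hosts where the file does not exist (for example, kernels without this AppArmor feature), the read returns None rather than a value.

```rust
/// procfs path corresponding to the sysctl
/// kernel.apparmor_restrict_unprivileged_userns.
const SYSCTL_PATH: &str = "/proc/sys/kernel/apparmor_restrict_unprivileged_userns";

/// Interpret the raw file content: "1" means restricted, "0" means not.
/// Anything else is reported as None.
fn parse_sysctl_flag(raw: &str) -> Option<bool> {
    match raw.trim() {
        "0" => Some(false),
        "1" => Some(true),
        _ => None,
    }
}

/// Read the flag for the current host. Returns None when the file is
/// missing, unreadable, or holds an unexpected value.
fn userns_restricted() -> Option<bool> {
    let raw = std::fs::read_to_string(SYSCTL_PATH).ok()?;
    parse_sysctl_flag(&raw)
}
```

On the Ubuntu 24.04 runner described above, userns_restricted() would be expected to return Some(true).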

Logs

# AppArmor audit from dmesg on GHA runner:
audit: type=1400 apparmor="DENIED" operation="capable" class="cap"
  profile="unprivileged_userns" pid=2536 comm="unshare"
  capability=21 capname="sys_admin"

# Supervisor error (from issue #2067 report, same root cause):
WARN openshell_supervisor_network::proxy: host.openshell.internal maps to a non-link-local IP; trusted-gateway SSRF exemption disabled
WARN openshell_supervisor_process::netns: Failed to delete network namespace
Error: × Invalid argument (os error 22)

Suggested Fix

validate_capability_bounding_set_clear() needs a non-fatal path for EPERM + non-empty bounding set. Two options:

  1. Log a warning and continue — the child is already constrained by seccomp + Landlock + the container's own restrictions
  2. Or: probe CAP_SETPCAP effectiveness before calling bounding::clear(), and skip when ineffective
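
Option 1 could look roughly like the sketch below. It is a standalone model, not a patch against the supervisor crate: ClearAttempt, Outcome, and validate_lenient are hypothetical names, the warning text is invented for illustration, and treating "Ok + non-empty set" as an abort is an assumption. Whether continuing is acceptable depends on the seccomp, Landlock, and container restrictions actually being in place.

```rust
/// What the attempt to clear the bounding set returned (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClearAttempt {
    Ok,
    Eperm,
}

/// Decision after validation (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Proceed,
    ProceedWithWarning(String),
    Abort(String),
}

/// Lenient variant: EPERM with a non-empty bounding set produces a warning
/// for the caller to log instead of aborting sandbox creation.
fn validate_lenient(attempt: ClearAttempt, bounding_mask: u64) -> Outcome {
    match (attempt, bounding_mask == 0) {
        // Cleared successfully, or nothing was left to clear.
        (_, true) => Outcome::Proceed,
        (ClearAttempt::Eperm, false) => Outcome::ProceedWithWarning(format!(
            "could not clear capability bounding set (EPERM); remaining mask {:#x}",
            bounding_mask
        )),
        // Assumption: a reported success that leaves capabilities behind
        // is still treated as an error.
        (ClearAttempt::Ok, false) => Outcome::Abort(format!(
            "bounding set reported cleared but mask is {:#x}",
            bounding_mask
        )),
    }
}
```

The caller would log the warning string and continue spawning the child. Returning the message instead of logging inside the function keeps the decision testable without CAP_SETPCAP, which is relevant to the regression-test gap noted by the reviewer.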

The PR #2001 reviewer also noted: "The current regression test skips when CAP_SETPCAP is unavailable, so it would not catch the Podman-relevant failure mode." Adding a rootless Podman CI test target would prevent future regressions.

Workaround

Pin the supervisor image to a pre-v0.0.73 version in gateway.toml:

[podman]
supervisor_image = "ghcr.io/nvidia/openshell/supervisor:0.0.72"

Related

Agent-First Checklist

  • I pointed my agent at the repo and had it investigate this issue
  • I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
  • My agent could not resolve this — the diagnostic above explains why
