test: add LCOWV2 feature flag and v2 LCOW test surface#2747

Open
shreyanshjain7174 wants to merge 2 commits into
microsoft:mainfrom
shreyanshjain7174:test/lcow-v2-feature-flag

Conversation

shreyanshjain7174 commented May 20, 2026

Contributor

What

Adds an opt-in -feature LCOWV2 gate across the functional and cri-containerd test suites so the in-progress v2 LCOW controller (internal/controller/vm + internal/builder/vm/lcow) can be exercised end-to-end without disturbing the existing v1 LCOW pipeline.

Why

The v2 controller landed across recent PRs (#2627, #2629, the VM SCSI/VPMem controllers, etc.) but has no dedicated functional test surface. Existing LCOW tests are tightly coupled to the v1 *uvm.OptionsLCOW type and call testuvm.CreateLCOW / CreateAndStartLCOWFromOpts, so they cannot drive the v2 controller unless every test is refactored. A feature flag lets us:

  • run the v1 pipeline unchanged on every PR;
  • add v2 tests incrementally under TestLCOW_V2_*;
  • let the same LCOWV2 gate select the v2 scenarios both here and in the downstream pipelines that actually run LCOW UVMs (see "Where v2 is actually validated" below).

Changes

Functional suite (test/functional)

  • LCOWV2 implies LCOW in TestMain so featureLCOW-gated tests are reachable; then defaultLCOWOptions calls requireV1Only and every v1 path short-circuits cleanly. Net effect under -feature LCOWV2: only the new TestLCOW_V2_* tests actually run.
  • New helpers_v2_test.go and lcow_v2_test.go covering the v2 surface via internal/builder/vm/lcow + the v2 controller in-process.
  • Export LCOWBootFilesPath from test/pkg/uvm so v2 tests can resolve boot files without going through v1 *uvm.OptionsLCOW.
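
The gating described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `features` map, `applyImplications`, the `skipper` interface, and the `recorder` fake are hypothetical stand-ins, and only the names LCOW, LCOWV2, and requireV1Only come from the description. The real helper presumably takes a *testing.T, whose Skipf also stops the test; the fake here only records the call.

```go
package main

import "fmt"

// skipper is the subset of *testing.T this sketch needs, so it can run
// outside `go test`. *testing.T satisfies it.
type skipper interface {
	Helper()
	Skipf(format string, args ...any)
}

// features is a hypothetical stand-in for the parsed -feature flag.
type features map[string]bool

// applyImplications makes LCOWV2 imply LCOW, so tests gated on the LCOW
// feature remain reachable when only -feature LCOWV2 is passed.
func applyImplications(f features) {
	if f["LCOWV2"] {
		f["LCOW"] = true
	}
}

// requireV1Only requests a skip when the v2 gate is active. Calling it
// from the shared v1 options helper short-circuits every v1 path.
func requireV1Only(t skipper, f features) {
	t.Helper()
	if f["LCOWV2"] {
		t.Skipf("v1-only LCOW path; skipped under -feature LCOWV2")
	}
}

// recorder is a fake test handle that records whether a skip was requested.
type recorder struct{ skipped bool }

func (r *recorder) Helper()              {}
func (r *recorder) Skipf(string, ...any) { r.skipped = true }

func main() {
	f := features{"LCOWV2": true}
	applyImplications(f)

	r := &recorder{}
	requireV1Only(r, f)

	fmt.Println("LCOW reachable:", f["LCOW"], "v1 path skipped:", r.skipped)
}
```

Under plain `-feature LCOW` the same helper is a no-op, which is consistent with the claim that the v1 pipeline is unchanged.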

cri-containerd suite (test/cri-containerd)

  • Mirror the LCOWV2-implies-LCOW pattern in TestMain.
  • Thread RuntimeHandler onto the CRI ImageSpec when pulling LCOW images so containerd selects the windows-lcow snapshotter and linux/amd64 platform (the sandbox-platform label alone is not honored by containerd ≥ 2.0).
  • Add the runhcs-lcow-v2 runtime handler constant and lcow_v2_test.go scaffold. This handler name is the integration contract with the downstream deploy configuration that registers the v2 shim — keeping it identical here and there means the same -feature LCOWV2 test set runs in both places.
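
As a rough sketch of the RuntimeHandler change: the `imageSpec` struct below is a local stand-in that mirrors only two fields of the CRI ImageSpec message so the example stays self-contained, and `lcowHandler` and `lcowImageSpec` are hypothetical helper names. The handler strings are the ones named in this description; the image reference is a placeholder.

```go
package main

import "fmt"

// imageSpec is a local stand-in mirroring two fields of the CRI ImageSpec
// message. The real tests would use the type from the CRI API package.
type imageSpec struct {
	Image          string
	RuntimeHandler string
}

const (
	lcowRuntimeHandler   = "runhcs-lcow"
	lcowV2RuntimeHandler = "runhcs-lcow-v2"
)

// lcowHandler picks the runtime handler for the selected LCOW pipeline.
func lcowHandler(v2 bool) string {
	if v2 {
		return lcowV2RuntimeHandler
	}
	return lcowRuntimeHandler
}

// lcowImageSpec builds the spec for an LCOW image pull with the runtime
// handler set, so the handler travels with the pull request itself.
func lcowImageSpec(image string, v2 bool) imageSpec {
	return imageSpec{Image: image, RuntimeHandler: lcowHandler(v2)}
}

func main() {
	spec := lcowImageSpec("example.test/hypothetical/image:latest", true)
	fmt.Printf("%+v\n", spec)
}
```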

Flag plumbing (test/pkg/flag)

  • Add IncludesExplicit and Include on IncludeExcludeStringSet so test TestMain hooks can implement feature implications safely after flag.Parse without breaking default-when-unset semantics.
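
To show why an explicit-only query matters, here is a simplified sketch. `includeExcludeSet` is a hypothetical stand-in for IncludeExcludeStringSet and assumes the semantics described above (an empty include list enables everything that is not excluded); the real type may differ. The feature name WCOW is used purely for illustration.

```go
package main

import "fmt"

// includeExcludeSet is a simplified, hypothetical stand-in for
// IncludeExcludeStringSet.
type includeExcludeSet struct {
	include map[string]bool
	exclude map[string]bool
}

func newSet(include, exclude []string) *includeExcludeSet {
	s := &includeExcludeSet{include: map[string]bool{}, exclude: map[string]bool{}}
	for _, n := range include {
		s.include[n] = true
	}
	for _, n := range exclude {
		s.exclude[n] = true
	}
	return s
}

// IsSet applies default-when-unset semantics: with no explicit includes,
// every name that is not excluded counts as enabled.
func (s *includeExcludeSet) IsSet(name string) bool {
	if s.exclude[name] {
		return false
	}
	return len(s.include) == 0 || s.include[name]
}

// IncludesExplicit reports only names passed explicitly, ignoring defaults.
func (s *includeExcludeSet) IncludesExplicit(name string) bool {
	return s.include[name]
}

// Include adds name to the explicit include list.
func (s *includeExcludeSet) Include(name string) { s.include[name] = true }

// implyLCOW is what a TestMain hook could run after flag.Parse. Checking
// IsSet("LCOWV2") here instead would fire on an unset flag and narrow the
// default "everything enabled" set down to LCOW alone.
func implyLCOW(s *includeExcludeSet) {
	if s.IncludesExplicit("LCOWV2") {
		s.Include("LCOW")
	}
}

func main() {
	unset := newSet(nil, nil)
	implyLCOW(unset)
	fmt.Println("unset flag, WCOW enabled:", unset.IsSet("WCOW"))

	v2 := newSet([]string{"LCOWV2"}, []string{"LCOWIntegrity"})
	implyLCOW(v2)
	fmt.Println("v2 run, LCOW enabled:", v2.IsSet("LCOW"))
	fmt.Println("v2 run, LCOWIntegrity enabled:", v2.IsSet("LCOWIntegrity"))
}
```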

CI (.github/workflows/ci.yml)

  • New Build and run functional testing binary (LCOWV2) step that invokes functional.test.exe -feature=LCOWV2 -exclude=LCOWIntegrity. Marked continue-on-error: true — see below.
  • Build and upload containerd-shim-lcow-v2.exe as a test artifact so downstream pipelines can consume it.

Where v2 is actually validated (testing architecture)

This is the part reviewers usually ask about, so calling it out explicitly:

  • GitHub CI (this repo) runs the functional suite on GitHub-hosted runners that have no nested virtualization. LCOW UVM tests therefore cannot boot a guest here — the existing v1 functional run already excludes them (-exclude=LCOW,LCOWIntegrity, "Windows uVM tests will be run on 1ES runner pool"). The new LCOWV2 step is in the same boat: it builds the v2 shim and exercises the in-process/host-side surface, but anything requiring a live LCOW UVM is skipped on these runners. That is why it is continue-on-error: true and not a merge gate.
  • Authoritative LCOW UVM validation for both v1 and v2 happens in the downstream ContainerPlat / azcri pipelines, which consume the hcsshim build artifacts (including containerd-shim-lcow-v2.exe from this PR) and run the functional/CloudTest lanes on hosts with nested virt (WS2022 / WS2025 images). Those lanes select the v2 scenarios through the same LCOWV2 feature gate this PR introduces.
  • CloudTest is not triggered from GitHub PRs today. Wiring GitHub PR events to CloudTest is future tooling work; until it lands, the GitHub LCOWV2 step stays a non-blocking smoke build and the real signal comes from the downstream pipelines.

In short: this PR provides the test scaffolding + feature gate; the downstream pipelines provide the execution environment. The two are kept in sync by the shared LCOWV2 flag and the runhcs-lcow-v2 handler name.

Out of scope (follow-up)

  • CRI v2 testing (Test_V2_LCOW_* in test/cri-containerd) requires a CI step that starts containerd with snapshotter = "windows-lcow" set on both the runhcs-lcow AND runhcs-lcow-v2 runtime blocks. That step is deferred to a follow-up PR alongside the integration-tests v2 setup.
  • Growing the TestLCOW_V2_* surface beyond the initial scaffold.
  • Making the GitHub LCOWV2 step a hard gate — only meaningful once GitHub-triggered CloudTest (or a nested-virt runner pool) is available.

Validation

  • Local: 10 v2 functional tests pass against the v2 controller via functional.test.exe -feature=LCOWV2 on a Windows host with nested virt.
  • v1 pipeline: unchanged — existing -feature LCOW invocations route to the same code paths as before.
  • Downstream: the v2 shim artifact and LCOWV2 gate are consumed by the internal ContainerPlat/azcri CloudTest lanes (WS2022/WS2025); that v2 lane is still being stabilized, which is the other reason the GitHub step is non-blocking.
  • CRI v2 tests: do not pass locally against vanilla containerd because the upstream CRI plugin hardcodes platforms.DefaultSpec() in image_pull.go and ignores the runtime_platforms mapping; gating those tests on CI is intentionally deferred.

DCO

Each commit carries a Signed-off-by trailer.

shreyanshjain7174 requested a review from a team as a code owner May 20, 2026 15:48

Introduce a `-feature LCOWV2` gate across the functional and
cri-containerd test suites so the v2 LCOW controller can be exercised
end-to-end without disturbing the existing v1 LCOW pipeline.

Functional suite:
* LCOWV2 implies LCOW in TestMain so featureLCOW-gated tests are
  reachable, then defaultLCOWOptions calls requireV1Only to short-
  circuit every v1 path cleanly. Net effect: only TestLCOW_V2_* runs.
* Add helpers_v2_test.go and lcow_v2_test.go covering the v2 surface
  via internal/builder/vm/lcow + the v2 controller in-process.
* Export LCOWBootFilesPath from test/pkg/uvm so v2 tests can resolve
  boot files without going through v1 *uvm.OptionsLCOW.

cri-containerd suite:
* Mirror the LCOWV2-implies-LCOW pattern; thread RuntimeHandler onto
  the CRI ImageSpec when pulling LCOW images so containerd selects
  the windows-lcow snapshotter and linux/amd64 platform (the sandbox-
  platform label alone is not honored by containerd >=2.0).
* Add lcow_v2_test.go and the runhcs-lcow-v2 runtime handler constant.

Flag plumbing:
* Add IncludesExplicit and Include to IncludeExcludeStringSet so test
  TestMain hooks can implement feature implications safely after
  flag.Parse without breaking default-when-unset semantics.

CI:
* New `Build and run functional testing binary (LCOWV2)` step that
  invokes functional.test.exe -feature=LCOWV2. continue-on-error while
  the v2 surface is being grown. CRI v2 testing is intentionally
  deferred to a follow-up alongside the integration-tests v2 setup.
* Build and upload containerd-shim-lcow-v2.exe as a test artifact.

Signed-off-by: Shreyansh Sancheti <shsancheti@microsoft.com>

v2.1.0 fails to typecheck cross-module imports from test/ that resolve
back into the parent module via a replace directive, producing:

  could not import github.com/Microsoft/hcsshim/internal/... (-: build
  constraints exclude all Go files in ...)

This only affects the (windows, test) lint matrix entry; the same
imports typecheck cleanly under (windows, "") and locally with v2.11.x.
v2.5+ resolves this.

Signed-off-by: Shreyansh Sancheti <shsancheti@microsoft.com>