OPRUN-4392,OPRUN-4393: Add OLMv1 progress deadline QE tests + fixes #755

Open
dtfranz wants to merge 1 commit into openshift:main from dtfranz:ocp-88331-88332-automation_dtfranz
Conversation

@dtfranz

dtfranz commented Jun 23, 2026

Contributor

Automate the ClusterExtension rollout failure coverage for OCP-88331 and OCP-88332 by building in-cluster bundle and catalog images for successful and failing bundle versions.

The new QE specs verify ProgressDeadlineExceeded on an initial failed rollout and ProbeFailure while upgrading to a bad revision under the BoxCutter runtime.

Supersedes #745

Additional changes:

  • The test respects the CRD minimum of 10 minutes for the progress deadline timeout. Upstream we would modify the CRD to allow a 1-minute deadline, but we can't do that downstream.
  • Fixed the httpd script

Test pass shown here: PR, CI run

Summary by CodeRabbit

  • Tests
    • Added test coverage for cluster extension progress deadline behavior, including scenarios for deadline exceeded conditions and rollout failure handling.

Automate the ClusterExtension rollout failure coverage for OCP-88331 and OCP-88332 by building in-cluster bundle and catalog images for successful and failing bundle versions.

The new QE specs verify ProgressDeadlineExceeded on an initial failed rollout and ProbeFailure while upgrading to a bad revision under the BoxCutter runtime.

Signed-off-by: Daniel Franz <dfranz@redhat.com>
Co-authored-by: Bruno Andrade <bruno.balint@gmail.com>
openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Jun 23, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 23, 2026


@dtfranz: This pull request references OPRUN-4392 which is a valid jira issue.

This pull request references OPRUN-4393 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Automate the ClusterExtension rollout failure coverage for OCP-88331 and OCP-88332 by building in-cluster bundle and catalog images for successful and failing bundle versions.

The new QE specs verify ProgressDeadlineExceeded on an initial failed rollout and ProbeFailure while upgrading to a bad revision under the BoxCutter runtime.

Supersedes #745

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026


Walkthrough

Adds a new Ginkgo/Gomega integration test file olmv1_ce_progress_deadline.go with 576 lines. It defines two test scenarios for ClusterExtension progress-deadline behavior under the NewOLMBoxCutterRuntime feature gate, plus helpers for fixture/RBAC setup, OCP image build pipeline (via BuildConfig/ImageStream), catalog/bundle manifest templates, and polling assertion utilities.

Changes

ClusterExtension progress deadline QE test

| Layer | File(s) | Summary |
| --- | --- | --- |
| Test scenarios and fixture setup | openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go | Two It blocks: one asserts ProgressDeadlineExceeded on persistent rollout failure; the other asserts active revisions and probe failure on upgrade. rolloutFailureBundle/rolloutFailureFixture types and newRolloutFailureFixture/newClusterExtension helpers create namespace, RBAC, images, and a ClusterCatalog. |
| Bundle and catalog manifest templates | openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go | Constant string templates for bundle Dockerfile, OLM annotations, properties, script ConfigMap, ClusterServiceVersion (with deployment/probe/security context wiring), and catalog Dockerfile. |
| OCP image build pipeline | openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go | buildImage creates ImageStream/BuildConfig and runs oc start-build with a binary tar archive. createBuildArchive writes file maps to a temp tar. bundleImageFiles and catalogImageFiles assemble build contexts with placeholder substitution via replaceAll. |
| Polling assertion helpers and cleanup | openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go | expectClusterCatalogServing, expectClusterExtensionCondition, expectClusterObjectSetCondition, expectActiveRevisions, and the shared eventually wrapper; deleteObject ignores NotFound errors. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 12 | ❌ 3

❌ Failed checks (3 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Line 264 includes large build output in error message, violating QE guideline: "Don't put large log outputs in error messages." The output variable from oc start-build can contain hundreds of l... | Log the output separately with g.By() or use proper logging instead of including it directly in the Expect error message. |
| Ipv6 And Disconnected Network Test Compatibility | ⚠️ Warning | The test's httpGet health probes (startupProbe, readinessProbe, livenessProbe) omit explicit host specification, defaulting to 127.0.0.1 (IPv4) which may be unavailable in IPv6-only disconnected en... | Add host: localhost or host: "::1" to httpGet probes, or bind python3 http.server to dual-stack properly with schema specification in probes. Alternatively, add [Skipped:Disconnected] if IPv6-only environment support is deferred. |
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically identifies the main change: adding OLMv1 progress deadline QE tests with references to the associated JIRA issues (OPRUN-4392, OPRUN-4393), which aligns with the changeset that introduces a comprehensive test spec file for ClusterExtension progress deadline behavior. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test names are static and deterministic with no dynamic content (timestamps, UUIDs, generated identifiers, pod/namespace/node names). Test titles are descriptive and stable across runs. |
| Microshift Test Compatibility | ✅ Passed | Test is protected from MicroShift execution via exutil.SkipMicroshift(oc) call in BeforeEach (line 40), which skips test with "it does not support microshift" message. Despite using BuildConfig (bu... |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | Test contains no multi-node assumptions: single-replica deployments, no affinity/topology constraints, no node scheduling or failover logic. SNO-compatible code patterns detected. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR adds a test specification file (not operator code) with test fixtures. The deployment manifests in the test are minimal (1 replica, no affinity/topology constraints), appropriate for test i... |
| Ote Binary Stdout Contract | ✅ Passed | File has no process-level stdout writes. All code executing outside test blocks (init, main, BeforeSuite, etc.) is absent. The buildImage function with output parameter is called only within It blo... |
| No-Weak-Crypto | ✅ Passed | File contains no weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or non-constant-time secret comparisons. |
| Container-Privileges | ✅ Passed | Container manifest in the test file properly restricts privileges: runAsNonRoot: true, allowPrivilegeEscalation: false, and all capabilities dropped. |
| No-Sensitive-Data-In-Logs | ✅ Passed | No passwords, tokens, API keys, PII, session IDs, or customer data found in logging statements. Build output contains only standard OpenShift infrastructure hostnames, not sensitive credentials. |

@openshift-ci openshift-ci Bot requested review from fgiudici and perdasilva June 23, 2026 13:07
@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dtfranz
Once this PR has been reviewed and has the lgtm label, please assign perdasilva for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go (1)

405-416: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Consider documenting the 12-minute timeout rationale.

The 12-minute timeout here is notably longer than other assertions (3-5 minutes). While this is correct for waiting beyond the 10-minute progress deadline set in test case 88331, a comment explaining the relationship would improve maintainability.

📝 Optional improvement
```diff
 func expectClusterObjectSetCondition(ctx context.Context, name, conditionType string, status metav1.ConditionStatus, reason string) {
+	// Timeout is 12 minutes to accommodate the 10-minute progress deadline in test case 88331,
+	// plus buffer time for controller processing and status updates.
 	eventually(func(g o.Gomega) {
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` around
lines 405 - 416, Add a comment above the eventually function call in
expectClusterObjectSetCondition to document why the 12-minute timeout is used.
The comment should explain that this timeout is intentionally longer than other
assertions (3-5 minutes) to wait beyond the 10-minute progress deadline
configured in test case 88331, establishing the relationship between the timeout
value and the deadline requirement for maintainability.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2af580ca-f800-4db0-8859-d59b8a61f185

📥 Commits

Reviewing files that changed from the base of the PR and between ecd140b and 1d56a97.

📒 Files selected for processing (2)
  • openshift/tests-extension/.openshift-tests-extension/openshift_payload_olmv1.json
  • openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go

Comment on lines +263 to +264
```go
output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)
```

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Don't include large build output in error messages.

The output variable from oc start-build could contain extensive build logs (hundreds of lines). Including this directly in the o.Expect error message violates the QE guideline: "Don't put large log outputs in error messages (use proper log messages instead of o.Expect with large output)".

Consider logging the output separately with g.By() or truncating it:

♻️ Suggested fix
```diff
 output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
-o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)
+if err != nil {
+    g.By(fmt.Sprintf("Build output for %s:\n%s", name, output))
+    o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s", name)
+}
```

Or truncate the output:

```diff
 output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
-o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)
+truncated := output
+if len(truncated) > 500 {
+    truncated = truncated[:500] + "... (truncated)"
+}
+o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, truncated)
```
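The truncation variant above slices the string by byte index, which can cut a multi-byte UTF-8 character in half if the build output contains non-ASCII text. A small helper avoids that by backing up to the nearest rune boundary. This is a sketch under that assumption, not part of the PR or the review; `truncateOutput` is a hypothetical name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateOutput returns s unchanged if it is at most limit bytes long.
// Otherwise it cuts s at the last rune boundary at or before limit bytes
// and appends a marker so readers know the text was shortened.
func truncateOutput(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) <= limit {
		return s
	}
	cut := limit
	// Step back while the byte at cut is a UTF-8 continuation byte.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + "... (truncated)"
}

func main() {
	fmt.Println(truncateOutput("short output", 500))
	fmt.Println(truncateOutput("abcdef", 3)) // prints abc... (truncated)
}
```

The helper could then wrap `output` in the assertion message, keeping the 500-byte limit from the suggestion.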
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` around
lines 263 - 264, The error message in the o.Expect assertion for the oc
start-build command includes the full output variable, which can contain
extensive build logs violating QE guidelines. Remove the output variable from
the o.Expect error message assertion at the line where start-build is run and
Run method is called, and instead log the output separately using g.By() before
the assertion or truncate the output to a reasonable size if the error needs to
include diagnostic information. This ensures large build logs are handled
through proper logging mechanisms rather than being embedded in the error
expectation message.

Source: Coding guidelines

@dtfranz

dtfranz commented Jun 23, 2026

Contributor Author

/retest

@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Contributor

@dtfranz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-upgrade-ovn-single-node | 1d56a97 | link | false | /test e2e-aws-upgrade-ovn-single-node |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
