NO-JIRA: e2e: improve failure diagnostics by theobarberbany · Pull Request #484 · openshift/cluster-capi-operator

theobarberbany · 2026-03-02T17:30:45Z

Waging more war on flakes in our migration e2es.

Wire trackResource into create helpers so tracked resources (and their sync controller mirrors) are dumped on failure. Add namespace-wide event dumps for both CAPI and MAPI namespaces, and list all AWSMachineTemplates regardless of name. All dump functions are best-effort with panic recovery.
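
As a rough illustration of the tracking and best-effort dumping described above, here is a minimal stdlib-only sketch. It is not the PR's code: `trackResource` and `resourcesUnderTest` are names taken from this description, while `trackedResource`, `bestEffort`, and the `describe` callback are hypothetical stand-ins for the real Kubernetes client calls.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// trackedResource is a hypothetical stand-in for a Kubernetes object key.
type trackedResource struct {
	Kind, Namespace, Name string
}

var (
	mu                 sync.Mutex
	resourcesUnderTest []trackedResource
)

// trackResource registers a resource so it can be dumped if the test fails.
func trackResource(r trackedResource) {
	mu.Lock()
	defer mu.Unlock()
	resourcesUnderTest = append(resourcesUnderTest, r)
}

// bestEffort runs one dump step. A panic inside the step is converted into
// a line of output so it cannot mask the original test failure.
func bestEffort(name string, dump func() string) (out string) {
	defer func() {
		if r := recover(); r != nil {
			out = fmt.Sprintf("[%s] dump panicked: %v\n", name, r)
		}
	}()
	return dump()
}

// dumpTrackedResources describes every tracked resource, one at a time,
// isolating failures so one bad resource does not stop the rest.
func dumpTrackedResources(describe func(trackedResource) string) string {
	mu.Lock()
	snapshot := append([]trackedResource(nil), resourcesUnderTest...)
	mu.Unlock()

	var b strings.Builder
	for _, r := range snapshot {
		b.WriteString(bestEffort(r.Kind+"/"+r.Name, func() string { return describe(r) }))
	}
	return b.String()
}

func main() {
	trackResource(trackedResource{Kind: "Machine", Namespace: "openshift-cluster-api", Name: "machine-a"})
	trackResource(trackedResource{Kind: "AWSMachine", Namespace: "openshift-cluster-api", Name: "machine-a"})

	fmt.Print(dumpTrackedResources(func(r trackedResource) string {
		if r.Kind == "AWSMachine" {
			panic("simulated failure while fetching object")
		}
		return fmt.Sprintf("%s %s/%s\n", r.Kind, r.Namespace, r.Name)
	}))
}
```

Recovering per resource rather than once around the whole dump means a single failing fetch still leaves the output for every other tracked resource intact.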

Remove --junit-report from hack/test.sh for e2e runs since the custom ReportAfterSuite handles JUnit generation with diagnostics inlined into the failure element for Spyglass.

Currently the failure dump includes everything (see below), including resources unrelated to our tests that are already covered by must-gathers. This PR scopes the dump down and makes it more useful: YAML rather than lists, scoped to the failure.

Machine machine-auth-capi-58g4d should reach Running state
Expected success, but got an error:
    <*errors.errorString | 0xc0014051d0>: 
    CAPI Machine machine-auth-capi-58g4d: phase "Pending", want Running (conditions: [Available=False, Ready=False, BootstrapConfigReady=True, InfrastructureReady=False, NodeHealthy=Unknown, NodeReady=Unknown, Paused=False, Deleting=False])
    {
        s: "CAPI Machine machine-auth-capi-58g4d: phase \"Pending\", want Running (conditions: [Available=False, Ready=False, BootstrapConfigReady=True, InfrastructureReady=False, NodeHealthy=Unknown, NodeReady=Unknown, Paused=False, Deleting=False])",
    }
In [BeforeAll] at: /go/src/github.com/openshift/cluster-capi-operator/e2e/machine_migration_capi_authoritative.go:53 @ 03/02/26 09:32:00.957
< Exit [BeforeAll] with spec.authoritativeAPI: ClusterAPI and already existing CAPI Machine with same name - /go/src/github.com/openshift/cluster-capi-operator/e2e/machine_migration_capi_authoritative.go:51 @ 03/02/26 09:32:00.958 (15m0.272s)
> Enter [ReportAfterEach] TOP-LEVEL - /go/src/github.com/openshift/cluster-capi-operator/e2e/e2e_test.go:33 @ 03/02/26 09:32:00.958

=== Cluster State Dump (test failure) ===

[openshift-machine-api] MAPI Machines (6):
  ci-op-w9i1vbs0-c3c99-ndphw-master-0                phase=Running      authAPI=MachineAPI   conditions=[Drainable=False, InstanceExists=True, Paused=False, Synchronized=False, Terminable=True] created=2026-03-02T07:39:54Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-1                phase=Running      authAPI=MachineAPI   conditions=[Drainable=False, InstanceExists=True, Paused=False, Synchronized=False, Terminable=True] created=2026-03-02T07:39:55Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-2                phase=Running      authAPI=MachineAPI   conditions=[Drainable=False, InstanceExists=True, Paused=False, Synchronized=False, Terminable=True] created=2026-03-02T07:39:55Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-28tmd phase=Running      authAPI=MachineAPI   conditions=[Drainable=True, InstanceExists=True, Paused=False, Synchronized=True, Terminable=True] created=2026-03-02T07:48:49Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-ntmvl phase=Running      authAPI=MachineAPI   conditions=[Drainable=True, InstanceExists=True, Paused=False, Synchronized=True, Terminable=True] created=2026-03-02T07:48:49Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-xc2cs phase=Running      authAPI=MachineAPI   conditions=[Drainable=True, InstanceExists=True, Paused=False, Synchronized=True, Terminable=True] created=2026-03-02T07:48:49Z

[openshift-machine-api] MAPI MachineSets (1):
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b       replicas=3/3 authAPI=MachineAPI   conditions=[Paused=False, Synchronized=True]

[openshift-cluster-api] CAPI Machines (7):
  ci-op-w9i1vbs0-c3c99-ndphw-master-0                phase=Running      conditions=[Available=True, Ready=True, BootstrapConfigReady=True, InfrastructureReady=True, NodeReady=True, Paused=True, Deleting=False] created=2026-03-02T08:04:23Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-1                phase=Provisioned  conditions=[Available=False, Ready=False, BootstrapConfigReady=False, InfrastructureReady=True, NodeReady=False, Paused=True, Deleting=False] created=2026-03-02T08:04:23Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-2                phase=Provisioned  conditions=[Available=False, Ready=False, BootstrapConfigReady=False, InfrastructureReady=True, NodeReady=False, Paused=True, Deleting=False] created=2026-03-02T08:04:23Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-28tmd phase=Running      conditions=[Available=True, Ready=True, BootstrapConfigReady=True, InfrastructureReady=True, NodeReady=True, Paused=True, Deleting=False] created=2026-03-02T08:04:23Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-ntmvl phase=Running      conditions=[Available=True, Ready=True, BootstrapConfigReady=True, InfrastructureReady=True, NodeReady=True, Paused=True, Deleting=False] created=2026-03-02T08:04:23Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-xc2cs phase=Running      conditions=[Available=True, Ready=True, BootstrapConfigReady=True, InfrastructureReady=True, NodeReady=True, Paused=True, Deleting=False] created=2026-03-02T08:04:24Z
  machine-auth-capi-58g4d                            phase=Pending      conditions=[Available=False, Ready=False, BootstrapConfigReady=True, InfrastructureReady=False, NodeHealthy=Unknown, NodeReady=Unknown, Paused=False, Deleting=False] created=2026-03-02T09:17:00Z

[openshift-cluster-api] CAPI MachineSets (1):
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b       replicas=3/3 conditions=[Paused=True]

[openshift-cluster-api] Events (last 10min, 1):
  2026-03-02T09:31:33Z AWSMachine/machine-auth-capi-58g4d Warning  FailedGetBootstrapData failed to retrieve bootstrap data secret for AWSMachine openshift-cluster-api/machine-auth-capi-58g4d: Secret "master...

[openshift-cluster-api] AWSMachines (7):
  ci-op-w9i1vbs0-c3c99-ndphw-master-0                instanceType=m6a.xlarge   instanceID=i-0a4ae174bf5ffb56d    providerID=aws:///us-west-2b/i-0a4ae174bf5ffb56d created=2026-03-02T08:04:24Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-1                instanceType=m6a.xlarge   instanceID=i-072a247547d6d5fb9    providerID=aws:///us-west-2b/i-072a247547d6d5fb9 created=2026-03-02T08:04:25Z
  ci-op-w9i1vbs0-c3c99-ndphw-master-2                instanceType=m6a.xlarge   instanceID=i-0a7864fcf10f6ac81    providerID=aws:///us-west-2b/i-0a7864fcf10f6ac81 created=2026-03-02T08:04:25Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-28tmd instanceType=m6a.xlarge   instanceID=i-022e5f35e41bc83f8    providerID=aws:///us-west-2b/i-022e5f35e41bc83f8 created=2026-03-02T08:04:35Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-ntmvl instanceType=m6a.xlarge   instanceID=i-041ee8ba909566c95    providerID=aws:///us-west-2b/i-041ee8ba909566c95 created=2026-03-02T08:04:35Z
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-xc2cs instanceType=m6a.xlarge   instanceID=i-01c729cf9ab4c983b    providerID=aws:///us-west-2b/i-01c729cf9ab4c983b created=2026-03-02T08:04:36Z
  machine-auth-capi-58g4d                            instanceType=m6a.xlarge   instanceID=                       providerID= created=2026-03-02T09:17:00Z

[openshift-cluster-api] AWSMachineTemplates (1):
  ci-op-w9i1vbs0-c3c99-ndphw-worker-us-west-2b-2e687e26 instanceType=m6a.xlarge   created=2026-03-02T08:04:03Z
=== End Cluster State Dump ===

Summary by CodeRabbit

Release Notes

New Features
- Added resource tracking for e2e tests to monitor resources created during test execution
- Introduced JUnit XML report generation for test suite results with captured diagnostic information
Tests
- Enhanced diagnostic output with targeted, YAML-based formatting and improved error reporting
- Improved error handling in diagnostics with panic recovery to prevent test failures during reporting
- Diagnostics now focus on tracked resources rather than full cluster state

openshift-ci-robot · 2026-03-02T17:30:50Z

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

coderabbitai · 2026-03-02T17:30:53Z

Walkthrough

The changes introduce a resource-tracking system for e2e tests that monitors created resources and replaces cluster-wide state dumps with targeted per-resource diagnostics in YAML format. JUnit report generation is moved from the test script to the test suite and triggers only on test failures.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Resource Tracking Infrastructure `e2e/e2e_common.go`, `e2e/e2e_test.go` | Introduces resourcesUnderTest collection and trackResource helper for registering resources. Replaces cluster-wide dumpClusterState with per-resource diagnostics via dumpTrackedResources and dumpSingleResource. Adds dumpAllAWSMachineTemplates and dumpNamespaceEvents with panic recovery. Generates JUnit XML reports in ReportAfterSuite hook when tests fail, embedding diagnostic output as failure descriptions. |
| Resource Tracking Integration `e2e/machine_migration_helpers.go`, `e2e/machineset_migration_helpers.go` | Integrates trackResource calls throughout resource creation workflows to register CAPI/MAPI machines, machine sets, and AWS machine templates for diagnostic visibility and tracking. |
| CI Test Script `hack/test.sh` | Removes automatic junit-report generation from base GINKGO_ARGS and makes it conditional: injected only for non-e2e test directories, allowing the e2e test suite to control its own JUnit report generation. |
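
To make the "diagnostics embedded in the JUnit failure" idea from the summary above concrete, here is a minimal sketch using only Go's `encoding/xml`. It is an assumption-laden illustration, not the PR's implementation: the `result` type, `buildSuite`, and `renderJUnit` are hypothetical names, and the real suite takes its results from Ginkgo's report rather than a hand-built slice.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// result is a hypothetical, simplified view of one spec's outcome.
type result struct {
	Name, FailureMessage, Diagnostics string
}

type junitFailure struct {
	Message string `xml:"message,attr"`
	Body    string `xml:",chardata"`
}

type junitTestCase struct {
	Name    string        `xml:"name,attr"`
	Failure *junitFailure `xml:"failure,omitempty"`
}

type junitTestSuite struct {
	XMLName  xml.Name        `xml:"testsuite"`
	Name     string          `xml:"name,attr"`
	Tests    int             `xml:"tests,attr"`
	Failures int             `xml:"failures,attr"`
	Cases    []junitTestCase `xml:"testcase"`
}

// buildSuite converts results into a JUnit suite. For failed specs the
// diagnostics are appended to the failure body, so a log viewer that only
// reads the <failure> element still shows them.
func buildSuite(name string, results []result) junitTestSuite {
	suite := junitTestSuite{Name: name, Tests: len(results)}
	for _, r := range results {
		tc := junitTestCase{Name: r.Name}
		if r.FailureMessage != "" {
			suite.Failures++
			tc.Failure = &junitFailure{
				Message: r.FailureMessage,
				Body:    r.FailureMessage + "\n\n" + r.Diagnostics,
			}
		}
		suite.Cases = append(suite.Cases, tc)
	}
	return suite
}

// renderJUnit serializes the suite with an XML header.
func renderJUnit(suite junitTestSuite) ([]byte, error) {
	body, err := xml.MarshalIndent(suite, "", "  ")
	if err != nil {
		return nil, err
	}
	return append([]byte(xml.Header), body...), nil
}

func main() {
	suite := buildSuite("cluster-capi-operator e2e", []result{
		{Name: "passing spec"},
		{
			Name:           "Machine should reach Running state",
			FailureMessage: `phase "Pending", want Running`,
			Diagnostics:    "kind: Machine\nmetadata:\n  name: machine-a\n",
		},
	})
	data, err := renderJUnit(suite)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(string(data))
}
```

In a real run the bytes would be written to a file such as `junit_cluster_capi_operator.xml` under the CI artifact directory; the sketch prints them instead to stay self-contained.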

Sequence Diagram

sequenceDiagram
    participant Test as E2E Test
    participant Tracker as Resource Tracker
    participant K8s as Kubernetes API
    participant Diag as Diagnostics Engine
    participant JUnit as JUnit Reporter
    
    Test->>Tracker: trackResource(obj)
    activate Tracker
    Tracker->>Tracker: register in resourcesUnderTest
    deactivate Tracker
    
    Test->>Test: run assertions
    alt Test Fails
        Test->>Diag: dumpTrackedResources()
        activate Diag
        loop for each tracked resource
            Diag->>K8s: fetch live object
            K8s-->>Diag: object + events
            Diag->>Diag: marshal to YAML
            Diag->>Diag: render with cleanups
        end
        Diag->>Diag: capture output
        deactivate Diag
        
        Test->>JUnit: ReportAfterSuite hook
        activate JUnit
        JUnit->>JUnit: read ARTIFACT_DIR
        JUnit->>JUnit: generate junit_cluster_capi_operator.xml
        JUnit->>JUnit: append diagnostic output to failure message
        JUnit->>JUnit: write XML report
        deactivate JUnit
    else Test Passes
        Test->>Tracker: reset resourcesUnderTest
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A resourceful hop through test diagnostics fair,
We track each machine with diagnostic care,
YAML flows gentle, events laid bare,
Failures now spoken with clarity rare,
JUnit reports hop from tests to the air! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main objective of the PR: improving failure diagnostics in e2e tests by implementing focused, resource-tracked diagnostic dumps instead of cluster-wide state dumps. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

openshift-ci · 2026-03-02T17:30:54Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci-robot · 2026-03-02T17:49:39Z

@theobarberbany: This pull request explicitly references no jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

hack/test.sh (1)

37-39: Use token-based e2e detection for JUnit gating.

*"e2e"* can match unintended directory names and skip JUnit for non-e2e runs. Prefer checking TEST_DIRS entries as path tokens.

Proposed patch

-  if [[ "${TEST_DIRS}" != *"e2e"* ]]; then
+  has_e2e=false
+  for d in ${TEST_DIRS}; do
+    if [[ "${d}" == "e2e" || "${d}" == "./e2e" || "${d}" == e2e/* || "${d}" == ./e2e/* ]]; then
+      has_e2e=true
+      break
+    fi
+  done
+  if [[ "${has_e2e}" == "false" ]]; then
     GINKGO_ARGS="${GINKGO_ARGS} --junit-report=junit_cluster_capi_operator.xml"
   fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@hack/test.sh` around lines 37 - 39, The current glob check [[ "${TEST_DIRS}"
!= *"e2e"* ]] can false-match substrings; change the condition to test tokens
instead and only skip adding the JUnit flag when no path token equals "e2e". For
example, replace that condition with a tokenized check that searches TEST_DIRS
for an exact token (e.g., split on whitespace/commas or use: if ! echo
"${TEST_DIRS}" | tr ' ,;' '\n' | grep -xq "e2e"; then ... fi) and keep the
GINKGO_ARGS="${GINKGO_ARGS} --junit-report=..." assignment inside the updated if
block so GINKGO_ARGS is only modified when there is no exact "e2e" token.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/e2e_common.go`:
- Around line 172-173: The event collection currently matches events only by
name (call sites invoking describeObjectEvents(buf, key)), which mixes events
from different kinds (e.g., Machine vs AWSMachine); update the logic to scope
events to the involved object kind/UID/namespace. Change describeObjectEvents
(or add a new helper like describeObjectEventsForObject) to accept the object's
Kind and UID (or full involvedObject reference) and filter Kubernetes events by
involvedObject.kind and involvedObject.uid (and namespace/name) rather than name
alone; update all call sites (the occurrences around the shown call and the
other noted ranges 193-207, 257-258) to pass the object's kind/UID so only
events for that exact object are listed.
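
The scoping suggested in the inline comment above can be sketched as follows. This is a self-contained illustration under stated assumptions: `objectRef` and `event` are local stand-ins that mirror the Kind, Namespace, Name, and UID fields of a Kubernetes event's involved object, and `eventsFor` is a hypothetical helper, not the PR's `describeObjectEvents`.

```go
package main

import "fmt"

// objectRef mirrors the identifying fields of an event's involved object.
type objectRef struct {
	Kind, Namespace, Name, UID string
}

// event is a simplified stand-in for a Kubernetes Event.
type event struct {
	InvolvedObject        objectRef
	Type, Reason, Message string
}

// eventsFor returns only the events whose involved object matches target on
// kind, namespace, and name. The UID is compared only when the target has
// one, so callers without a live object can still filter by kind and name.
func eventsFor(all []event, target objectRef) []event {
	var out []event
	for _, e := range all {
		o := e.InvolvedObject
		if o.Kind != target.Kind || o.Namespace != target.Namespace || o.Name != target.Name {
			continue
		}
		if target.UID != "" && o.UID != target.UID {
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	// A Machine and an AWSMachine share a name, as in the dump above.
	all := []event{
		{InvolvedObject: objectRef{"Machine", "openshift-cluster-api", "machine-a", "uid-1"}, Type: "Normal", Reason: "Created"},
		{InvolvedObject: objectRef{"AWSMachine", "openshift-cluster-api", "machine-a", "uid-2"}, Type: "Warning", Reason: "FailedGetBootstrapData"},
	}

	target := objectRef{Kind: "AWSMachine", Namespace: "openshift-cluster-api", Name: "machine-a", UID: "uid-2"}
	for _, e := range eventsFor(all, target) {
		fmt.Printf("%s/%s %s %s\n", e.InvolvedObject.Kind, e.InvolvedObject.Name, e.Type, e.Reason)
	}
}
```

Matching on name alone would return both events here; adding kind and UID keeps the Machine's events out of the AWSMachine's section.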

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between f21137a and 4041131.

📒 Files selected for processing (5)

e2e/e2e_common.go
e2e/e2e_test.go
e2e/machine_migration_helpers.go
e2e/machineset_migration_helpers.go
hack/test.sh

e2e/e2e_common.go

theobarberbany · 2026-03-02T18:12:19Z

/test e2e-aws-capi-techpreview

JoelSpeed

/lgtm
/approve

openshift-ci-robot · 2026-03-02T18:30:20Z

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

openshift-ci · 2026-03-02T18:30:21Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [JoelSpeed]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

nrb

Overall it makes sense to me, but one question.

e2e/e2e_common.go

theobarberbany · 2026-03-02T22:19:16Z

/pipeline required

openshift-ci-robot · 2026-03-02T22:19:18Z

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-capi-techpreview
/test e2e-aws-ovn
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-techpreview-upgrade
/test e2e-azure-capi-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-techpreview-upgrade
/test e2e-gcp-capi-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal3-capi-techpreview
/test e2e-openstack-capi-techpreview
/test e2e-openstack-ovn-techpreview
/test e2e-vsphere-capi-techpreview
/test regression-clusterinfra-aws-ipi-techpreview-capi

sunzhaohua2 · 2026-03-03T02:05:26Z

/test e2e-aws-ovn-techpreview

theobarberbany · 2026-03-03T08:26:39Z

/retest

theobarberbany · 2026-03-03T10:35:22Z

CI is looking pretty borked :( If we're still failing, I'm inclined to override, given we have good signal on the ci/prow/e2e-aws-capi-techpreview job, where this is most used.

theobarberbany · 2026-03-03T10:35:28Z

/retest

theobarberbany · 2026-03-03T10:56:03Z

/override ci/prow/e2e-aws-ovn-techpreview
/override ci/prow/e2e-openstack-ovn-techpreview
/override ci/prow/e2e-openstack-capi-techpreview
/override ci/prow/e2e-gcp-ovn-techpreview

theobarberbany · 2026-03-03T10:56:28Z

/verified by ci/prow/e2e-aws-capi-techpreview

openshift-ci-robot · 2026-03-03T10:56:39Z

@theobarberbany: This PR has been marked as verified by [ci/prow/e2e-aws-capi-techpreview](https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-capi-operator/484/pull-ci-openshift-cluster-capi-operator-main-e2e-aws-capi-techpreview/2028596049530064896).

openshift-ci · 2026-03-03T10:57:45Z

@theobarberbany: Overrode contexts on behalf of theobarberbany: ci/prow/e2e-aws-ovn-techpreview, ci/prow/e2e-gcp-ovn-techpreview, ci/prow/e2e-openstack-capi-techpreview, ci/prow/e2e-openstack-ovn-techpreview

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

theobarberbany · 2026-03-03T11:54:54Z

/retest

theobarberbany · 2026-03-03T16:03:07Z

/test images

theobarberbany · 2026-03-03T16:51:13Z

/override ci/prow/images

openshift-ci · 2026-03-03T16:52:12Z

@theobarberbany: Overrode contexts on behalf of theobarberbany: ci/prow/images

openshift-ci · 2026-03-03T16:52:15Z

@theobarberbany: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Mar 2, 2026

theobarberbany force-pushed the tb/fix-cluster-state-dump branch 3 times, most recently from d3d2b71 to cc19ef4 Compare March 2, 2026 17:47

theobarberbany changed the title ~~e2e: improve failure diagnostics~~ NO-JIRA: e2e: improve failure diagnostics Mar 2, 2026

openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Mar 2, 2026

theobarberbany force-pushed the tb/fix-cluster-state-dump branch from cc19ef4 to 4041131 Compare March 2, 2026 18:00

theobarberbany marked this pull request as ready for review March 2, 2026 18:00

openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Mar 2, 2026

openshift-ci bot requested review from chrischdi and mdbooth March 2, 2026 18:01

coderabbitai bot reviewed Mar 2, 2026

JoelSpeed reviewed Mar 2, 2026

openshift-ci bot assigned JoelSpeed Mar 2, 2026

openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Mar 2, 2026

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Mar 2, 2026

nrb reviewed Mar 2, 2026

openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) Mar 3, 2026

openshift-merge-bot bot merged commit 05e0ebe into openshift:main Mar 3, 2026
25 checks passed

theobarberbany deleted the tb/fix-cluster-state-dump branch March 3, 2026 17:09
