iankhou
Contributor

@iankhou iankhou commented Oct 1, 2025

Problem

E2E tests are flaky because of unreliable network conditions. Some of the retries currently implemented are not effective.

Recent intermittent E2E test failures:

Changes

Removed ineffective retries. Added a retry for the E2E Notices tests, which have recently failed for ephemeral reasons.

Did not implement retries for the deployment and sandbox tests, which require additional refactoring before they can be retried (they are implemented as GitHub Actions themselves, not as scripts that can be retried).

Validation

Workflows running on this PR, with label run-e2e.

Checklist

  • If this PR includes a functional change to the runtime behavior of the code, I have added or updated automated test coverage for this change.
  • If this PR requires a change to the Project Architecture README, I have included that update in this PR.
  • If this PR requires a docs update, I have linked to that docs PR above.
  • If this PR modifies E2E tests, makes changes to resource provisioning, or makes SDK calls, I have run the PR checks with the run-e2e label set.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@iankhou iankhou self-assigned this Oct 1, 2025
@iankhou iankhou added the run-e2e label (includes E2E tests in the PR checks workflow) Oct 1, 2025

changeset-bot bot commented Oct 1, 2025

⚠️ No Changeset found

Latest commit: ae73d15

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@iankhou iankhou force-pushed the iankhou-retry-fix branch from df3b17c to ae73d15 Compare October 1, 2025 14:50
@iankhou iankhou marked this pull request as ready for review October 1, 2025 16:53
@iankhou iankhou requested a review from a team as a code owner October 1, 2025 16:53
@iankhou iankhou requested a review from Copilot October 1, 2025 16:54

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes ineffective retry mechanisms in E2E health check workflows by removing non-functional retry steps and implementing a working retry for notices tests. The changes address flaky E2E test failures caused by unreliable network conditions.

  • Removed ineffective retry steps that weren't actually retrying the test execution
  • Added functional retry logic to the E2E notices tests using the nick-fields/retry action
  • Maintained existing test execution structure while improving reliability
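
For illustration, a retry of this kind might look roughly like the workflow fragment below. This is a minimal sketch, not the actual diff from this PR: the job name, step names, test command, and the specific attempt and timeout values are all assumptions. Only the use of the nick-fields/retry action comes from the summary above.

```yaml
# Hypothetical workflow fragment. Job name, step names, the npm script,
# and all numeric values are placeholders, not taken from this PR.
jobs:
  e2e_notices:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run E2E notices tests (with retry)
        uses: nick-fields/retry@v3
        with:
          timeout_minutes: 30      # per-attempt timeout (assumed value)
          max_attempts: 3          # total attempts before the step fails (assumed value)
          retry_wait_seconds: 60   # pause between attempts (assumed value)
          command: npm run test:e2e:notices   # placeholder script name
```

Because the action re-runs the shell command given in `command`, this approach only applies to tests that are invoked as scripts. A step that is itself an action (a `uses:` step) cannot be passed as `command`, which is consistent with the PR description's note that the deployment and sandbox tests need refactoring before they can be retried.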


@iankhou
Contributor Author

iankhou commented Oct 2, 2025

Will raise another PR that includes a fix at the action level.

@iankhou iankhou closed this Oct 2, 2025