Skip to content

Conversation

@mryab
Copy link
Member

@mryab mryab commented Apr 19, 2025

Currently, test_averaging_cancel is one of the few flaky tests in our codebase: see e.g. https://github.com/learning-at-home/hivemind/actions/runs/14024261668/job/40777059821, where it failed 3 times out of 5.

Based on my investigation and discussion with @justheuristic, it looks like the problem is that it's very loosely defined: we start 4 averaging peers, cancel 2, and expect that the group size is 2 nonetheless. However, due to target_group_size=None by default, it's possible for the groups to have sizes of 3 and 1, which means that the len(control.result()) == 2 will occasionally fail.

To fix this issue, the PR introduces a separate test case with a fixed target_group_size. To make sure that peers don't average before cancellation, it also uses the averaging triggers to arrange the groups without starting the averaging itself

@codecov
Copy link

codecov bot commented Apr 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.45%. Comparing base (d20e810) to head (ba2e073).
Report is 24 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #650      +/-   ##
==========================================
+ Coverage   85.39%   85.45%   +0.05%     
==========================================
  Files          81       96      +15     
  Lines        8006     8575     +569     
==========================================
+ Hits         6837     7328     +491     
- Misses       1169     1247      +78     

see 35 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mryab mryab requested a review from justheuristic April 19, 2025 14:12
Copy link
Member

@justheuristic justheuristic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! / LGTM

@mryab mryab merged commit 5353328 into master Apr 19, 2025
28 of 31 checks passed
@mryab mryab deleted the fix-test-averaging-cancel branch April 19, 2025 14:44
mryab added a commit that referenced this pull request Apr 20, 2025
* Make test_averaging_cancel more robust

(cherry picked from commit 5353328)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants