-
-
Notifications
You must be signed in to change notification settings - Fork 11.4k
[CI] Fail subprocess tests with root-cause error #23795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a great quality-of-life improvement for debugging tests that run in forked processes. By serializing and re-raising exceptions from the child process, it ensures that the root-cause error is not lost, which will significantly speed up debugging.
I've found one critical bug related to operator precedence that prevents the exception from being correctly re-raised, and one high-severity issue concerning a resource leak where temporary files are not being cleaned up. Addressing these issues will make this a solid and robust enhancement.
Signed-off-by: Nick Hill <[email protected]>
5bc6669 to
00f167a
Compare
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
|
Retrying async tests to see if the timeout is just flaky or caused by this PR. |
Thanks @DarkLight1337. Looks like it hung again at the same place after ~10min. I'll cancel it so that it doesn't timeout after 3hrs again, will investigate tomorrow. |
Signed-off-by: Nick Hill <[email protected]>
# Conflicts: # tests/conftest.py
11214c2 to
1510ff0
Compare
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
1510ff0 to
a5b79e2
Compare
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
Signed-off-by: Nick Hill <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
QOL improvement
We use a
create_new_process_for_each_testdecorator on many tests to avoid cleanup issues. However when such tests fail the pytest error is from the process failure not the actual test failure. This change propagates the actual test exception of interest so that it's reported by pytest as if the test was running inline.Before:

After:
