feat: make (fixed-width) evaluation return a non-zero exitcode on mismatch#1365
feat: make (fixed-width) evaluation return a non-zero exitcode on mismatch#1365alexkeizer wants to merge 2 commits intomainfrom
Conversation
…match This PR changes the evaluation script to return a non-zero exitcode whenever the ripgrep checks find a mismatched number (i.e., whenever it currently prints FAIL to the log). This exitcode should cause the CI to fail, meaning we no longer have to manually check the logs to confirm the evaluation actually got the right numbers. It does mean that now whenever we do expect the numbers to change, we have to change the expected numbers in the script before CI can pass, but forcing us to keep these numbers in sync with reality seems more like a feature than a bug.
|
Alive Statistics: 90 / 93 (3 failed) |
|
This PR is in principle ready for review: I've drafted it because I felt like I should test it properly on CI, to confirm it does actually fail on mismatch, but CI seems to not work at all atm |
|
Alive Statistics: 90 / 93 (3 failed) |
|
bitwuzla and leanSAT provided counterexample for theorem 3 in file gapinthcast_proof.lean |
This PR changes the evaluation script to return a non-zero exitcode whenever the ripgrep checks find a mismatched number (i.e., whenever it currently prints FAIL to the log). This exitcode should cause the CI to fail, meaning we no longer have to manually check the logs to confirm the evaluation actually got the right numbers.
It does mean that now whenever we do expect the numbers to change, we have to change the expected numbers in the script before CI can pass, but forcing us to keep these numbers in sync with reality seems more like a feature than a bug.