Skip to content

Conversation

FabianSchuetze
Copy link
Contributor

@FabianSchuetze FabianSchuetze commented Jul 5, 2025

Fixes #8673

@FabianSchuetze
Copy link
Contributor Author

One thing that I think this PR is still missing is adding a new test that would have failed in the current main branch, but I didn't find an avenue for that yet.

@abadams
Copy link
Member

abadams commented Jul 8, 2025

Maybe something in test/correctness/simd_op_check_x86.cpp ?

@FabianSchuetze
Copy link
Contributor Author

Thanks for the great comments. I added the test (failing under main) and realized I need to broaden the PR's scope a bit.

@abadams abadams requested a review from halidebuildbots July 9, 2025 16:53
@FabianSchuetze
Copy link
Contributor Author

@abadams: Thanks for requesting a review from the buildbots. Sorry for asking, but could you provide some guidance on what I should do next? I guess trying to figure out why some tests fail? I'm not yet really familiar with the CI here.

@abadams
Copy link
Member

abadams commented Jul 17, 2025

It looks like simd_op_check_x86 failed: https://buildbot.halide-lang.org/master/#/builders/145/builds/167/steps/12/logs/correctness_simd_op_check_x86

I don't see a clear error message though. See if you can reproduce that locally. If you can't, let me know and I'll take a look.

@FabianSchuetze
Copy link
Contributor Author

Thanks a lot for the comment!

I reproduced the error on an old Broadwell chip.

It's an illegal instruction error, so the vnni extension is falsely applied. I could, however, not fix the error yet. I'm a bit confused because AVX512 tests pass without possible hardware support. Do you have a guess for the error's root cause? Otherwise, I'll continue to search anyway.

@abadams
Copy link
Member

abadams commented Jul 18, 2025

That test only actually runs code if it detects that it can on the current host CPU. Otherwise it just checks that it compiles successfully. So maybe something is wrong with the host feature detection routine?

@FabianSchuetze
Copy link
Contributor Author

Thanks for the feedback, @abadams ! I have updated the code in the test's base class. The test now runs on the Broadwell machine I used and it seems the CI test passes too. Some test still fail though. I wonder if some failures are spurious, but I also wonder if my intervention caused problems for https://buildbot.halide-lang.org/master/#/builders/107/builds/218 .

@abadams
Copy link
Member

abadams commented Jul 24, 2025

It's possible that's a real error. The notable thing about the windows bot the failures are on is that it's our only Zen 4 bot. I'm not sure Zen 4 supports avx-vnni.

@alexreinking
Copy link
Member

It's possible that's a real error. The notable thing about the windows bot the failures are on is that it's our only Zen 4 bot. I'm not sure Zen 4 supports avx-vnni.

Phoronix says Zen 4 is supposed to support avx512_vnni.

https://www.phoronix.com/review/amd-zen4-avx512

@abadams
Copy link
Member

abadams commented Jul 24, 2025

avx512-vnni is a different extension to avx-vnni (and hilariously, avx-vnni came later)

@FabianSchuetze
Copy link
Contributor Author

Thanks for the helpful comments! I'll try to reproduce that error.

I've two questions in that regard. First, how do you identify the CPU arch of a runner? I looked at the run's output, but couldn't infer that. Second, was there also a Zen4 Linux run? If not, I can try to get hold of a Zen4 Linux machine, hoping the failure is arch and not OS specific.

@alexreinking
Copy link
Member

avx512-vnni is a different extension to avx-vnni (and hilariously, avx-vnni came later)

Endlessly confusing omg

t.set_feature(Target::AVXVNNI);
}
if (t.has_feature(Target::AVX512_Zen4)) {
t.set_feature(Target::AVXVNNI);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try removing this line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No Generation of VNNI instructions on avx2-avxvnni Targets
3 participants