
@vvereschaka
Contributor

Specify the LLVM_PTXAS_EXECUTABLE environment variable with the full path to the 'ptxas' tool from the installed NVIDIA CUDA Toolkit. LIT will automatically detect and use this tool to validate PTX during the tests.
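
For a local run, the setup might look like the sketch below. The helper name `use_ptxas` and the example path are assumptions for illustration; only the LLVM_PTXAS_EXECUTABLE variable comes from this change.

```shell
# Hypothetical helper: export LLVM_PTXAS_EXECUTABLE only if the given
# path points at an executable file, so a mistyped path fails early.
use_ptxas() {
    ptxas_path="$1"
    if [ ! -x "$ptxas_path" ] || [ -d "$ptxas_path" ]; then
        echo "error: '$ptxas_path' is not an executable file" >&2
        return 1
    fi
    export LLVM_PTXAS_EXECUTABLE="$ptxas_path"
}

# Example (assumed install location; adjust to the local CUDA Toolkit):
#   use_ptxas /usr/local/cuda/bin/ptxas && llvm-lit -v llvm/test/CodeGen/NVPTX
```

Checking the path before exporting keeps a wrong setting from being silently ignored by the test run.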

@vvereschaka
Contributor Author

@justinfargnoli thank you very much for the proposed improvements to the NVPTX testing process.

All NVPTX build hosts have the NVIDIA CUDA Toolkit preinstalled, either the latest release or the previous one. They are currently updated to 12.8. We update CUDA manually during host maintenance.

…builders.


@justinfargnoli justinfargnoli left a comment


Thank you for putting up this PR, @vvereschaka! This will be a very useful improvement to the NVPTX tests.

LGTM!

@justinfargnoli

CC @Artem-B

Member

@Artem-B Artem-B left a comment


LGTM.

I'm a bit surprised that I completely missed the addition of those bots. It's good to know that we have them, and running ptxas tests on them would be very useful.

Another possible future improvement we could implement is to figure out a way to run ptxas tests with multiple CUDA versions. E.g. have LLVM_PTXAS_EXECUTABLE accept a list of ptxas locations, and then get lit to run %ptxas test cases per individual ptxas binary.
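
A rough sketch of the idea from outside lit, without changing lit itself: a wrapper loops over a list of ptxas binaries and runs the given command once per binary with LLVM_PTXAS_EXECUTABLE set. The colon-separated list format and the helper name `run_per_ptxas` are assumptions, not an existing feature.

```shell
# Hypothetical wrapper: run a command once per ptxas binary.
# $1 is a colon-separated list of ptxas paths (assumed format);
# the remaining arguments are the command to run.
run_per_ptxas() {
    ptxas_list="$1"
    shift
    status=0
    old_ifs="$IFS"
    IFS=':'
    for ptxas in $ptxas_list; do
        IFS="$old_ifs"
        echo "== ptxas: $ptxas"
        # Set the variable only for this invocation; remember any failure.
        LLVM_PTXAS_EXECUTABLE="$ptxas" "$@" || status=1
    done
    IFS="$old_ifs"
    return "$status"
}

# Example (hypothetical paths):
#   run_per_ptxas "$HOME/local/cuda-a/bin/ptxas:$HOME/local/cuda-b/bin/ptxas" \
#       llvm-lit llvm/test/CodeGen/NVPTX
```

This repeats the whole test run per binary, so test time grows linearly with the number of ptxas versions; running only the %ptxas test cases per binary would need support inside lit.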

@justinfargnoli

Another possible future improvement we could implement is to figure out a way to run ptxas tests with multiple CUDA versions. E.g. have LLVM_PTXAS_EXECUTABLE accept a list of ptxas locations, and then get lit to run %ptxas test cases per individual ptxas binary.

@vvereschaka, to this end, would it be okay to upload ptxas binaries to the Buildbot to test compatibility with older CTKs?

@vvereschaka
Contributor Author

vvereschaka commented Feb 24, 2025

@vvereschaka, to this end, would it be okay to upload ptxas binaries to the Buildbot to test compatibility with older CTKs?

@justinfargnoli it is probably possible. We still keep some previous versions of CUDA on the Windows host, but not on the Linux host. I have noticed some problems with the multiple installations of CUDA on Linux, so I just remove the previous version before upgrading to the latest one.
Anyway, there are many open questions about how it would actually work. How many previous versions of CUDA are required for testing? Which versions: major only, or all major and minor updates? How will they affect test times? How should all these binaries be specified for the lit tests? Will they be detected automatically by lit, or will we need to specify them manually (and update the buildbot configuration on every CUDA upgrade)? And so on.

@Artem-B
Member

Artem-B commented Feb 24, 2025

I have noticed some problems with the multiple installations of CUDA on Linux

A common issue is that the build rules do not always specify the exact path to the CUDA version, so the build ends up using a mix of CUDA SDK bits: some picked up via the explicitly provided path, and others found via PATH or /usr/local/cuda, which may be from a different version.

Not having the default CUDA installation simplifies things, as the build will only pick up the CUDA it's explicitly pointed to, and will fail in a very obvious way otherwise.
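
A quick way to see what a build could pick up implicitly is to probe the default locations mentioned above. This is a minimal sketch; the helper name is made up, and it only checks PATH and one default directory, not every place a build system might look.

```shell
# Hypothetical check: report CUDA tools that a build could find
# implicitly. $1 is the default install directory to probe
# (defaults to /usr/local/cuda).
report_implicit_cuda() {
    default_dir="${1:-/usr/local/cuda}"
    found=0
    for tool in nvcc ptxas; do
        if tool_path=$(command -v "$tool" 2>/dev/null); then
            echo "PATH: $tool -> $tool_path"
            found=1
        fi
    done
    if [ -e "$default_dir" ]; then
        echo "default: $default_dir exists"
        found=1
    fi
    if [ "$found" -eq 0 ]; then
        echo "no implicit CUDA installation found"
    fi
}
```

If the report is empty apart from the last message, the build can only use the CUDA it is explicitly pointed to.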

Multiple CUDA versions can be installed safely from NVIDIA's .run installers. The only "gotcha" is that, by default, the installer attempts to put some shared libraries into a system-wide directory. This can be worked around by passing the --defaultroot= option to the installer.

Here's what I usually use.

$ export CUDA_ROOT=$HOME/local/cuda-x.y.z
$ bash ~/down/cuda_X.Y.Z....run --silent --defaultroot=$CUDA_ROOT --toolkit --toolkitpath=$CUDA_ROOT --tmpdir=$HOME/tmp 

--tmpdir=... is only needed if there's not much space in /tmp. That's where the installer temporarily unpacks the files, and it needs a fair amount of space.

@justinfargnoli

would it be okay to upload ptxas binaries to the Buildbot to test compatibility with older CTKs?

To clarify, I think this should be a future improvement, not something we should block this PR on.


In light of this, is this PR ready to be merged or are there other blockers?

@vvereschaka
Contributor Author

It should be ready. I'm going to merge this PR.

@vvereschaka vvereschaka merged commit c16b052 into llvm:main Feb 26, 2025
2 checks passed
@vvereschaka vvereschaka deleted the fix-nvptx-tests branch February 26, 2025 20:49

3 participants