-
-
Notifications
You must be signed in to change notification settings - Fork 14.2k
Add checks for gpu-kernel calling conv #149991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
rustbot has assigned @WaffleLapkin. Use |
b6cb0c2 to
0a9373c
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
0a9373c to
1e9b1dc
Compare
This comment has been minimized.
This comment has been minimized.
1e9b1dc to
10b32d6
Compare
This comment has been minimized.
This comment has been minimized.
The `gpu-kernel` calling convention has several restrictions that were not enforced by the compiler until now. Add the following restrictions: 1. Cannot be async 2. Cannot be called 3. Cannot return values, return type must be `()` or `!` 4. Arguments should be simple, i.e. passed by value. More complicated types can work when you know what you are doing, but it is rather unintuitive, one needs to know ABI/compiler internals. 5. Export name should be unmangled, either through `no_mangle` or `export_name`. Kernels are searched by name on the CPU side, having a mangled name makes it hard to find and probably almost always unintentional.
10b32d6 to
88ad16c
Compare
|
r? workingjubilee As I'm completely missing context |
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AST-level code looks good.
Some details on messaging here. I'm not committed to a precise message on these, which is why it's a bit "multiple choice" here, just wondering if these could be improved. In one or two cases it is a must-change.
Should we be enforcing a maximum number of arguments, also? Probably not if there's no cross-driver consensus on that, but maybe?
| functional record update syntax requires a struct | ||
| hir_typeck_gpu_kernel_abi_cannot_be_called = | ||
| functions with the "gpu-kernel" ABI cannot be called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we be more specific?
| functions with the "gpu-kernel" ABI cannot be called | |
| functions with the "gpu-kernel" ABI cannot be called from Rust |
| functions with the "gpu-kernel" ABI cannot be called | |
| functions with the "gpu-kernel" ABI cannot be called, even in device code |
| hir_typeck_gpu_kernel_abi_cannot_be_called = | ||
| functions with the "gpu-kernel" ABI cannot be called | ||
| .note = an `extern "gpu-kernel"` function can only be launched on the GPU through an API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure "can" is right? Maybe "should"? And "an API" is somewhat vague, what kind of API? Surely not a standard library one, for instance...
| .note = an `extern "gpu-kernel"` function can only be launched on the GPU through an API | |
| .note = an `extern "gpu-kernel"` function should only be launched on the GPU through a driver's API |
Or "must"?
...I don't know if "kernel loader" is a real term for the thing that the GPU driver "does", it just came up intuitively while trying to grasp for words for the thing.
| .note = an `extern "gpu-kernel"` function can only be launched on the GPU through an API | |
| .note = an `extern "gpu-kernel"` function must only be launched on the GPU by the kernel loader |
| declare_lint! { | ||
| /// The `missing_gpu_kernel_export_name` lint detects `gpu-kernel` functions that have a mangled name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we can't simply use an unmangled name by default because that would be unsafe, right.
|
|
||
| /// `ImproperGpuKernelLint` checks `gpu-kernel` function definitions: | ||
| /// | ||
| /// - `extern "gpu-kernel" fn` arguments should be simple. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid "simple", even in a concise format like this: we mean "primitive types".
| /// - `extern "gpu-kernel" fn` arguments should be simple. | |
| /// - `extern "gpu-kernel" fn` arguments should be primitive types. |
| /// This lint is issued when it detects a probable mistake in a signature. | ||
| IMPROPER_GPU_KERNEL_ARG, | ||
| Warn, | ||
| "simple arguments of gpu-kernel functions" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should capture the reason for the lint, not what it is checking for, so something like
| "simple arguments of gpu-kernel functions" | |
| "GPU kernel entry points have a limited calling convention" |
|
Reminder, once the PR becomes ready for a review, use |
The
gpu-kernelcalling convention has several restrictions that were not enforced by the compiler until now.Add the following restrictions:
()or!no_mangleorexport_name. Kernels are searched by name on the CPU side, having a mangled name makes it hard to find and probably almost always unintentional.Tracking issue: #135467
amdgpu target tracking issue: #135024
@workingjubilee, these should be all the restrictions we talked about a year ago.
cc @RDambrosio016 @kjetilkjeka for nvptx