Why would type annotation affect the performance of matmul kernel? #8457
Unanswered
Fr4nk1inCs asked this question in Q&A
Replies: 1 comment
Config selected for each input shape: Autotune config
Following the official tutorial on Matrix Multiplication, I've written a similar `matmul` operator:

AUTOTUNE_CONFIG

The `_inner_kernel` is the actual implementation of matmul, and `matmul_kernel_with_annotation` / `matmul_kernel_without_annotation` are wrappers of `_inner_kernel`. The only difference is that `matmul_kernel_with_annotation` has annotations for non-constexpr arguments.

However, when I benchmarked both versions of the matmul kernel with the following code, the annotated version performed much worse.
Benchmark result on A40:
Why would type annotation affect the resulting kernel performance?
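For reference, the only source-level difference between the two wrappers described above is the parameter annotations, which any Python front end can read through `inspect.signature`. Below is a minimal, GPU-free sketch of that difference. The function and parameter names (`kernel_with_annotation`, `a_ptr`, `M`, `stride_am`, `BLOCK_M`) are hypothetical stand-ins rather than the original kernels, the `tl.constexpr` annotations are omitted, and the sketch makes no claim about how Triton itself treats annotations.

```python
import inspect


def kernel_without_annotation(a_ptr, M, stride_am, BLOCK_M):
    """Hypothetical stand-in for the wrapper with no annotations."""


def kernel_with_annotation(a_ptr, M: int, stride_am: int, BLOCK_M):
    """Hypothetical stand-in for the wrapper with annotated arguments."""


def annotated_params(fn):
    """Return {name: annotation} for the parameters that carry an annotation."""
    params = inspect.signature(fn).parameters
    return {
        name: p.annotation
        for name, p in params.items()
        if p.annotation is not inspect.Parameter.empty
    }


if __name__ == "__main__":
    # The two signatures differ only in the annotation metadata they expose.
    print(annotated_params(kernel_without_annotation))
    print(annotated_params(kernel_with_annotation))
```

A JIT decorator that inspects signatures could, in principle, take a different compilation path for annotated arguments, so comparing the code generated for each wrapper is one way to narrow down where the difference arises.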