1 parent 3f992aa, commit df50eb4
lmdeploy/pytorch/kernels/cuda/blocked_gemm_fp8.py
@@ -11,7 +11,7 @@
 logger = get_logger('lmdeploy')


-@triton.jit
+@triton.jit(do_not_specialize=['M', 'M_out'])
 def _quant_fp8_kernel(
     a_ptr,
     out_ptr,
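
The commit carries no description, so the following is a hedged reading of the change rather than the author's stated rationale. By default Triton specializes integer kernel arguments on properties such as "equals 1" or "divisible by 16", and a new combination of those properties can trigger a fresh compilation. Listing `M` and `M_out` in `do_not_specialize` asks Triton to treat them as plain runtime integers. The pure-Python sketch below is a toy model of that caching behaviour, not Triton's actual implementation; `specialization_key`, `count_compilations`, and the batch sizes are hypothetical names and values chosen for illustration.

```python
# Toy model only: this is NOT Triton's real cache logic, just an illustration
# of why varying integer arguments can cause repeated compilations.

def specialization_key(args, do_not_specialize=()):
    """Build a hashable key from per-argument hints (hypothetical helper)."""
    key = []
    for name, value in args.items():
        if name in do_not_specialize:
            # Argument is treated as an ordinary runtime integer.
            key.append((name, "generic"))
        elif value == 1:
            key.append((name, "equals_1"))
        elif value % 16 == 0:
            key.append((name, "divisible_by_16"))
        else:
            key.append((name, "generic"))
    return tuple(key)


def count_compilations(batch_sizes, do_not_specialize=()):
    """Count distinct keys, i.e. how many variants the toy JIT would compile."""
    seen = set()
    for m in batch_sizes:
        seen.add(specialization_key({"M": m, "M_out": m}, do_not_specialize))
    return len(seen)


# Assumed workload: the row count M changes from call to call.
batch_sizes = [1, 7, 16, 33, 64]
specialized = count_compilations(batch_sizes)
unspecialized = count_compilations(batch_sizes, do_not_specialize=("M", "M_out"))
print(specialized, unspecialized)
```

In this model the varying row count produces several kernel variants when specialization is on and a single variant when `M` and `M_out` are excluded. The trade-off is that the compiler can no longer exploit divisibility facts about those two arguments.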