[rocm]use aiter triton kernel as triton mha fallback path #809

zhuyuhua-v · 2025-11-14T02:13:52Z

Purpose

Use the aiter Triton kernel, instead of the aiter FMHA kernel, as the fallback path for Triton MHA.

Test Plan

server:

export VLLM_USE_V1=1
export SAFETENSORS_FAST_GPU=1
export VLLM_ROCM_USE_AITER=1
export VLLM_ROCM_USE_AITER_RMSNORM=1
export VLLM_ROCM_USE_AITER_MOE=1
export VLLM_USE_TRITON_FLASH_ATTN=1
export NCCL_DEBUG=WARN
export VLLM_RPC_TIMEOUT=1800000
export VLLM_ROCM_USE_AITER_MHA=0
export VLLM_ROCM_USE_TRITON_ROPE=1
export VLLM_ROCM_USE_AITER_FUSION_SHARED_EXPERTS=1

export VLLM_ROCM_USE_AITER_MLA=0 # triton path

model_path="path_to_model/deepseek-ai/DeepSeek-V3"
vllm serve $model_path \
    --tensor-parallel-size 8 \
    --max-num-batched-tokens 32768 \
    --trust-remote-code \
    --no-enable-prefix-caching \
    --disable-log-requests \
    --gpu_memory_utilization 0.9 \
    --port 6789 \
    --compilation-config '{"cudagraph_mode": "FULL_AND_PIECEWISE"}' \
    --block-size 16 \
    --async-scheduling \
    --enforce-eager
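
The exports above leave AITER enabled (`VLLM_ROCM_USE_AITER=1`) while turning AITER MHA off (`VLLM_ROCM_USE_AITER_MHA=0`), which is presumably what routes attention through the Triton fallback path this PR changes. A minimal pre-flight sketch that checks those switches before launching; `require_env` and `check_triton_fallback_env` are hypothetical helper names and not part of vLLM, and the choice of which three variables to check is an assumption:

```shell
# Hypothetical pre-flight helpers (not part of vLLM): confirm the switches
# from the test plan are set as expected before starting the server.

# require_env NAME EXPECTED: fail if $NAME is unset or differs from EXPECTED.
require_env() {
    eval "actual=\${$1-}"   # indirect lookup of the variable named in $1
    if [ "$actual" != "$2" ]; then
        echo "expected $1=$2, got '${actual}'" >&2
        return 1
    fi
}

# Assumption: these three variables are the ones that select the Triton path.
check_triton_fallback_env() {
    require_env VLLM_ROCM_USE_AITER 1 &&
    require_env VLLM_ROCM_USE_AITER_MHA 0 &&
    require_env VLLM_USE_TRITON_FLASH_ATTN 1
}
```

Calling `check_triton_fallback_env || exit 1` after the exports and before `vllm serve` stops the run early if a switch was overridden elsewhere in the environment.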

accuracy:

model="path_to_model/deepseek-ai/DeepSeek-V3"
lm_eval \
--model local-completions \
--tasks gsm8k \
--seed 123 \
--model_args model=${model},base_url=http://127.0.0.1:6789/v1/completions \
--batch_size 100
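
The `lm_eval` command above presumes the server on port 6789 has finished loading the model. A minimal sketch of a readiness gate to run first, polling the server's `/health` endpoint with `curl`; the function names and the retry budget are assumptions, not values from this PR:

```shell
# Base URL matches --port 6789 from the serve command in the test plan.
BASE_URL="${BASE_URL:-http://127.0.0.1:6789}"

# Single probe: succeeds once GET /health returns a 2xx status.
probe() {
    curl -sf -o /dev/null "${BASE_URL}/health"
}

# wait_for_server [attempts] [delay_seconds]
# Defaults (180 probes, 10 s apart) are assumptions; tune for model load time.
wait_for_server() {
    wfs_attempts="${1:-180}"
    wfs_delay="${2:-10}"
    wfs_i=1
    while [ "$wfs_i" -le "$wfs_attempts" ]; do
        if probe; then
            echo "server ready after ${wfs_i} probe(s)"
            return 0
        fi
        sleep "$wfs_delay"
        wfs_i=$((wfs_i + 1))
    done
    echo "server not ready after ${wfs_attempts} probe(s)" >&2
    return 1
}
```

Running `wait_for_server && lm_eval ...` keeps the evaluation from starting against a server that is still loading weights.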

Test Result

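|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.9477|±  |0.0061|
|     |       |strict-match    |     5|exact_match|↑  |0.9454|±  |0.0063|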

Signed-off-by: zhuyuhua-v <[email protected]>

use aiter triton kernel as triton mha fallback path

31ea5c1

Signed-off-by: zhuyuhua-v <[email protected]>

zhuyuhua-v requested review from kliuae-amd, tjtanaavllm, wuhuikx and zejunchen-zejun as code owners November 14, 2025 02:13

zhuyuhua-v changed the title ~~use aiter triton kernel as triton mha fallback path~~ [rocm]use aiter triton kernel as triton mha fallback path Nov 14, 2025

zhuyuhua-v marked this pull request as draft November 14, 2025 05:29
