Pull requests: ROCm/vllm

Pull requests list

[Triton] Shaoclee/355 wip mha rope kv cache
#723 opened Sep 29, 2025 by k50112113
Add dispatch for different mha backend
#722 opened Sep 29, 2025 by zhuyuhua-v Draft
5 tasks
Fix attn bug in qwen3-8b benchmark test
#721 opened Sep 28, 2025 by PerryZhang01
5 tasks
update aiter fused_moe interface
#720 opened Sep 28, 2025 by zhiding512
[FEAT] Add support for AITER bpreshuffle block scale gemm
#717 opened Sep 27, 2025 by tjtanaavllm
5 tasks
[Perf] refactor attention backend for perf boost
#713 opened Sep 26, 2025 by ganyi1996ppo
5 tasks
add hipblas in Docker build
#708 opened Sep 25, 2025 by dllehr-amd
5 tasks
[ROCm] Add allreduce dispatcher for ROCm device
#704 opened Sep 24, 2025 by zejunchen-zejun
Qwen-next script
#702 opened Sep 24, 2025 by ZhiweiYan-96
5 tasks
support rocblas for rocm_unquantized_gemm
#665 opened Sep 10, 2025 by eliotwang
Add cache config for gpt oss
#656 opened Sep 5, 2025 by cagrikymk Draft
fix flashmla metadata build calls()
#636 opened Aug 19, 2025 by ZJLi2013
[Model] Add GPT-OSS model code and config
#625 opened Aug 7, 2025 by ashishtanwer
add Fused_rms_quant for deepseek_v2 model
#611 opened Jul 29, 2025 by ZJLi2013
add fused fp8 bmm
#604 opened Jul 25, 2025 by k50112113
Update fp8 paged attention
#592 opened Jul 9, 2025 by amd-xiaoyu12 Draft