Releases: ROCm/vllm
v0.9.2rc2
Full Changelog: v0.9.2rc1...v0.9.2rc2
v0.9.2
Full Changelog: v0.6.0...v0.9.2
v0.9.2rc1
Full Changelog: v0.6.0...v0.9.2rc1
v0.9.1+rocm
What's Changed
- Upstream merge 2025 06 03 by @gshtras in #566
- Fix attention fp8 output fusion for split attention path in v1 by @gshtras in #569
- Upstream merge 2025 06 05 by @gshtras in #570
- Rocm 6.4.1 as base by @gshtras in #571
- Upstream merge 2025 06 09 by @gshtras in #575
- Updated README.md for June 10 release by @Mcirino1 in #574
- Upstream merge 2025 06 12 by @gshtras in #577
Full Changelog: v0.9.0.1+rocm...v0.9.1+rocm
v0.9.0.1+rocm
What's Changed
- Aiter mla cherrypick by @gshtras in #543
- Cherry pick skinny gemms by @gshtras in #544
- Upstream merge 2025 05 15 by @gshtras in #547
- Remove gradlib mention from pyproject by @gshtras in #549
- Fix input layer norm mismatch for Eagle Speculative Decoding compatib… by @mmkamani7 in #548
- Updated README.md by @Mcirino1 in #546
- Upstream merge 2025 05 19 by @gshtras in #550
- Reduce diff from upstream by @gshtras in #551
- Transformers 4.52 fix by @gshtras in #552
- Remove the option to compile cython during docker build by @gshtras in #553
- Fixing pre-commit in github. by @gshtras in #554
- Restrict FP8 attn fusion to triton FA by @gshtras in #545
- Upstream merge 2025 05 27 by @gshtras in #557
- Removing redundant parameters from the MIs side and fixing Navi build by @gshtras in #559
- Removing RPD in favor of torch profiler for V1 by @gshtras in #558
- Added benchmark results and commit hash by @Mcirino1 in #556
- Upstream merge 2025 06 02 by @gshtras in #565
New Contributors
- @mmkamani7 made their first contribution in #548
Full Changelog: v0.8.5+rocm...v0.9.0.1+rocm
v0.8.5+rocm
What's Changed
- Bump hipblaslt by @gshtras in #528
- Restrict setuptools version by @gshtras in #529
- Revert aiter commit by @gshtras in #530
- Update README.md by @t-parry in #531
- Restrict ray version due to a breaking release by @gshtras in #533
- Upstream merge 2025 04 29 by @gshtras in #536
- Upstream merge 2025 05 06 by @gshtras in #538
Full Changelog: v0.8.4+rocm...v0.8.5+rocm
v0.8.4+rocm
What's Changed
- Remove duplicate code in config.py by @sstamenk in #494
- In light of the breaking cmake v4 release by @gshtras in #495
- Docs_update_20250327 by @arakowsk-amd in #493
- Upstream merge 2025 03 31 by @gshtras in #497
- Triton MLA parameter tweak for AMD GPU by @qli88 in #498
- Upstream merge 2025 04 02 by @gshtras in #499
- Bump aiter version by @gshtras in #500
- Adding 2stage MoE support separately until it is added upstream by @gshtras in #501
- Fused FP8 conversion in attention for v1 by @gshtras in #502
- Upstream merge 2025 04 07 by @gshtras in #503
- Fix fused moe by @gshtras in #506
- Update moe_tune_script.sh by @divakar-amd in #507
- Doubled size to work around issue and preserve CAR perf by @maleksan85 in #510
- Re-enable custom paged attention for V0 by @charlifu in #511
- Updated README.md with April 10 results by @Mcirino1 in #512
- Update README.md by @faisalgulfam32 in #514
- Updating base image by @charlifu in #515
- Update test-template.j2 to enable building by @Alexei-V-Ivanov-AMD in #517
- Update test-template.j2 to fix new location of run-amd-test.sh by @Alexei-V-Ivanov-AMD in #518
- Rocm 6.4 docker by @gshtras in #519
- Update README.md by @t-parry in #521
- Update README.md by @t-parry in #523
- Upstream merge 2025 04 21 by @gshtras in #522
- Upstream merge 2025 04 25 by @gshtras in #524
New Contributors
- @sstamenk made their first contribution in #494
- @faisalgulfam32 made their first contribution in #514
Full Changelog: v0.8.2+rocm...v0.8.4+rocm
v0.8.2+rocm
What's Changed
- Stable aiter build by @gshtras in #450
- Remove batch padding on ROCm by @gshtras in #451
- Aiter whl fix branch by @gshtras in #452
- tuning adjustment for quantized skinny gemm. by @amd-hhashemi in #444
- Upstream merge 25 03 03 by @gshtras in #458
- Cython compilation fix by @gshtras in #460
- Building the base images for MI and Navi; Using aiter hotfix by @gshtras in #461
- Hipblaslt bench by @gshtras in #462
- Fixing the shape to use in padding calculation by @gshtras in #464
- Upstream merge 25 03 10 by @gshtras in #471
- V1 rocm support by @maleksan85 in #469
- nightly_fixed_aiter_integration_final_20250305 README update by @Mcirino1 in #470
- Updated README.md with config info and header font size by @Mcirino1 in #473
- Bump aiter version by @gshtras in #476
- upstream_merge_25_03_12 by @gshtras in #478
- Rocm vllm ci fix (new design) by @Alexei-V-Ivanov-AMD in #475
- Add @hongxiayang by @gshtras in #481
- use unified fp8 notation in FA by @divakar-amd in #483
- Removing the padding again after it had been overwritten by upstream merge by @gshtras in #485
- Upstream merge 2025 03 24 by @gshtras in #489
- Using temp dir for the intermediate P3L file by @gshtras in #491
- Using torch commit that supports running scaled_mm on Radeon by @gshtras in #492
Full Changelog: v0.7.3+rocm...v0.8.2+rocm
v0.7.3+rocm
What's Changed
- Initial attempt to adjust codeowners to the ROCm fork by @gshtras in #420
- Applying weight padding to deepseek by @gshtras in #421
- [Model] DeepSeek Tunings by @rasmith in #423
- Removing bad config by @gshtras in #425
- Refine codeowners by @gshtras in #427
- Upstream merge 25 02 17 by @gshtras in #430
- Updating PR template to point people to the upstream repo. by @gshtras in #431
- Enabling the ROCm-vLLM CI on MI250 machines by @Alexei-V-Ivanov-AMD in #432
- Optimization for quantized gemm skinny sizes by @amd-hhashemi in #411
- Restricting FP8 wvSplitk to MI300x by @gshtras in #439
- Remove mi300a by @gshtras in #440
- resolve configs diff for mixtral8x7B by @divakar-amd in #437
- Torch version bump to fix tunable ops by @gshtras in #442
- Using AITER branch with fixed whl by @gshtras in #443
- Bump hipblaslt version. Minor fixes to printing the versions by @gshtras in #447
- Bumping the version in the right place by @gshtras in #448
- Upstream merge 25 02 24 by @gshtras in #449
Full Changelog: v0.7.2+rocm...v0.7.3+rocm
v0.7.2+rocm
What's Changed
- 20250127 docs update by @arakowsk-amd in #392
- Faster Custom Paged Attention kernels by @sanyalington in #372
- Improved memory profiling by @gshtras in #394
- Aiter readme by @gshtras in #400
- fix None dict for quark by @hliuca in #402
- Upstream merge 25 02 03 by @gshtras in #403
- Mbatch p3l by @Alexei-V-Ivanov-AMD in #401
- Fix quark fp8 format loading. by @fxmarty-amd in #395
- WARP_SIZE in sgl moe kernel by @gshtras in #406
- Update README.md 20250205_aiter by @arakowsk-amd in #407
- fix rocm get_device name by @divakar-amd in #359
- Fixing the output formatting in P3L by @gshtras in #414
- Add tuned moe config for qwen1.5_moe_A2.7B by @sky0530 in #398
- Update Benchmark Profiling Scripts by @AdrianAbeyta in #417
- updating 20250207 image manifest by @arakowsk-amd in #416
- Upstream merge 25 02 10 by @gshtras in #418
- Aiter base by @gshtras in #419
New Contributors
- @arakowsk-amd made their first contribution in #392
- @fxmarty-amd made their first contribution in #395
- @sky0530 made their first contribution in #398
Full Changelog: v0.7.0+rocm...v0.7.2+rocm