Commit f3d5235

Add PR description for benchmark_moe compatibility fixes
1 parent dc1cd80 commit f3d5235

1 file changed: +20 -0 lines changed

PR_DESCRIPTION.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
Title: Enhance benchmark_moe.py: vLLM Version Compatibility Fixes

Description:
This PR adds compatibility fixes to `benchmarks/kernels/benchmark_moe.py` so that it runs against multiple vLLM versions without import or parameter errors at runtime. It addresses the following issues:

1. ImportError: cannot import name '_get_config_dtype_str'
- Added a multi-level import fallback that searches the possible module locations and class methods for `_get_config_dtype_str`, and provides a fallback implementation when none is available.

2. TypeError: FusedMoEQuantConfig.make() parameter incompatibility
- Implemented `make_quant_config_compatible()`, which tries several parameter combinations (including `quant_dtype`, `dtype`, and with or without `block_quant_shape`) to create a `FusedMoEQuantConfig` across versions.

3. TypeError: fused_experts() parameter incompatibility
- Implemented `fused_experts_compatible()`, which inspects the signature of `fused_experts` and passes only the parameters it supports (`quant_config`, `allow_deep_gemm`, etc.).
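
As an illustration only (this is not the code added to `benchmark_moe.py`), the three compatibility patterns listed above can be sketched with the standard library alone. The helper names `resolve_attr`, `call_with_first_accepted`, and `call_with_supported_kwargs`, the stand-in functions `make_config_old` and `fused_experts_old`, and the `block_shape` parameter are all hypothetical and exist only to make the sketch runnable without vLLM installed.

```python
import importlib
import inspect


def resolve_attr(name, candidate_modules, default):
    """Return the first attribute `name` found in candidate_modules, else `default`."""
    for module_path in candidate_modules:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module absent in this installation; try the next candidate
        if hasattr(module, name):
            return getattr(module, name)
    return default


def call_with_first_accepted(factory, candidate_kwargs):
    """Try each kwargs dict in order; return the first call that does not raise TypeError."""
    last_error = None
    for kwargs in candidate_kwargs:
        try:
            return factory(**kwargs)
        except TypeError as exc:
            last_error = exc  # remember the failure and try the next combination
    raise TypeError("no candidate parameter set was accepted") from last_error


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments that its signature does not declare."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the callee accepts any keyword, so pass all of them.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **supported)


# Hypothetical stand-ins for an older API generation (NOT real vLLM functions).
def make_config_old(dtype, block_shape=None):
    return ("old", dtype, block_shape)


def fused_experts_old(hidden_states, inplace=False):
    return (hidden_states, inplace)


if __name__ == "__main__":
    # Falls through the missing module and finds `sqrt` in `math`.
    sqrt = resolve_attr("sqrt", ["no_such_module_xyz", "math"], default=None)
    print(sqrt(9.0))

    # The first two combinations use names the old stand-in rejects.
    config = call_with_first_accepted(
        make_config_old,
        [
            {"quant_dtype": "fp8", "block_quant_shape": None},
            {"quant_dtype": "fp8"},
            {"dtype": "fp8"},
        ],
    )
    print(config)

    # `quant_config` is silently dropped because the stand-in does not declare it.
    print(call_with_supported_kwargs(fused_experts_old, "x", inplace=True, quant_config=None))
```

One trade-off of the retry pattern is that catching `TypeError` can also hide a genuine `TypeError` raised inside the callee, which is why the sketch re-raises with the last error chained. Signature inspection avoids that problem but only works when the callee exposes an introspectable signature.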

Notes:
- No change to the benchmark algorithm logic.
- All output messages are in English and suitable for production logs.
- These fixes aim to support vLLM releases from 0.6.0 through 0.10.0 and later.

Please review and let me know if you'd like additional cleanups or unit tests included.
