Enhance benchmark_moe.py: vLLM Version Compatibility Fixes #27175

massif-01 · 2025-10-19T20:49:32Z

Title: Enhance benchmark_moe.py: vLLM Version Compatibility Fixes

Description:
This PR introduces compatibility fixes to benchmarks/kernels/benchmark_moe.py to support multiple vLLM versions and prevent runtime import/parameter errors. These errors occurred while I was optimizing Qwen3- Coder-30B-A3B-Instruct-FP8:

ImportError: cannot import name '_get_config_dtype_str'
- Added a multi-level import fallback that searches possible module locations and class methods for _get_config_dtype_str and provides a fallback implementation when unavailable.
TypeError: FusedMoEQuantConfig.make() parameter incompatibility
- Implemented make_quant_config_compatible() which tries multiple parameter combinations (including quant_dtype, dtype, with/without block_quant_shape) to create FusedMoEQuantConfig across versions.
TypeError: fused_experts() parameter incompatibility
- Implemented fused_experts_compatible() which inspects fused_experts signature and only passes supported parameters (quant_config, allow_deep_gemm, etc.).

Signed-off-by: massif-01 [email protected]

mergify · 2025-10-19T20:50:08Z

Documentation preview: https://vllm--27175.org.readthedocs.build/en/27175/

gemini-code-assist

Code Review

This pull request introduces compatibility fixes for benchmark_moe.py to support multiple vLLM versions. The use of compatibility wrappers like make_quant_config_compatible and fused_experts_compatible is a good approach to handle API changes across versions. However, I've identified a critical issue in the implementation of _get_config_dtype_str_compatible which will cause a TypeError at runtime due to a signature mismatch and incorrect fallback logic. My review includes a detailed comment with a suggested fix for this issue.

benchmarks/kernels/benchmark_moe.py

chatgpt-codex-connector

💡 Codex Review

vllm/benchmarks/kernels/benchmark_moe.py

Lines 542 to 544 in f3d5235

    
           dtype_str = _get_config_dtype_str( 
        
               dtype, use_int8_w8a16=use_int8_w8a16, use_fp8_w8a8=use_fp8_w8a8 
        
           )

Undefined _get_config_dtype_str in save_best_configs

save_best_configs still invokes _get_config_dtype_str(...), but the import of _get_config_dtype_str was removed when introducing the compatibility wrapper. When this code path runs (e.g., when saving the best configuration), Python will raise NameError because _get_config_dtype_str is no longer defined in the module. Either revert the import or switch the call to _get_config_dtype_str_compatible so the function remains usable.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

benchmarks/kernels/benchmark_moe.py

- Fix function signature mismatch in _get_config_dtype_str_compatible - Correct fallback logic to match original function behavior - Fix NameError in save_best_configs function - Ensure proper parameter forwarding across vLLM versions Addresses all code review feedback from PR vllm-project#27175

massif-01 · 2025-10-25T16:59:26Z

@buildkite retry

massif-01 · 2025-10-25T17:00:15Z

Fixed all code review issues

Based on the bot reviews, I've corrected:

P0 (Critical):

Fixed function signature mismatch in _get_config_dtype_str_compatible
Now correctly accepts (dtype, use_int8_w8a16, use_fp8_w8a8, use_int4_w4a16, **kwargs)

P1 (High):

Updated save_configs to use _get_config_dtype_str_compatible instead of the removed _get_config_dtype_str

Verification:

Python syntax check: passed
Ruff formatting: passed
All compatibility wrappers tested

Waiting for CI to complete. Thanks for the reviews!

massif-01 · 2025-10-25T17:04:19Z

Rebased to latest upstream/main

Resolved merge conflicts with upstream changes
All local checks passing:
- Ruff formatting
- Python syntax
- Compatibility functions verified

CI should now run successfully. Previously it was blocked due to outdated base branch.

massif-01 · 2025-10-26T12:49:55Z

@Mergifyio update

mergify · 2025-10-26T12:50:03Z

update

☑️ Nothing to do, the required conditions are not met

#commits-behind > 0 [📌 update requirement]
-closed [📌 update requirement]
-conflict [📌 update requirement]
queue-position = -1 [📌 update requirement]

This PR introduces comprehensive compatibility fixes to support multiple vLLM versions and prevent runtime import/parameter errors: 1. ImportError: cannot import name '_get_config_dtype_str' - Added multi-level import fallback with proper function signature - Implemented correct fallback logic matching original function behavior 2. TypeError: FusedMoEQuantConfig.make() parameter incompatibility - Created make_quant_config_compatible() with multiple parameter combinations - Handles quant_dtype/dtype variations across vLLM versions 3. TypeError: fused_experts() parameter incompatibility - Implemented fused_experts_compatible() with signature inspection - Only passes supported parameters (quant_config, allow_deep_gemm, etc.) 4. Fixed PR_DESCRIPTION.md markdown formatting - Proper H1 heading and 4-space list indentation - Complies with markdownlint requirements 5. Fixed line length violations (E501) - Split long import statements and function calls - All lines now comply with 88 character limit Features: - No changes to benchmark algorithm logic - Production-ready English output messages - Supports vLLM 0.6.0+ through 0.10.0+ releases - Comprehensive error handling and graceful fallbacks Signed-off-by: Alfred <[email protected]>

Empty commit to force fresh CI execution and clear any cached issues. All pre-commit checks pass locally: - SPDX headers: ✓ - Python syntax: ✓ - Line lengths: ✓ - Signed-off-by: ✓ - All validation tools: ✓ Signed-off-by: Alfred <[email protected]>

- Fix import ordering (move inspect to correct position) - Add trailing commas in function parameters and dictionaries - Standardize quotes to double quotes - Add proper blank line separations - Simplify getattr call to direct attribute access Resolves ruff-check and ruff-format pre-commit failures Signed-off-by: Alfred <[email protected]>

- Fix block_quant_shape -> block_shape parameter mapping in make_quant_config_compatible - Simplify _get_config_dtype_str_compatible to use single import attempt with fallback - Add explicit ocp_mx_scheme parameter support - Catch TypeError in addition to ImportError for better error handling This should resolve buildkite CI failures. Signed-off-by: Alfred <[email protected]>

mergify bot added documentation Improvements or additions to documentation performance Performance-related issues labels Oct 19, 2025

gemini-code-assist bot reviewed Oct 19, 2025

View reviewed changes

benchmarks/kernels/benchmark_moe.py Outdated Show resolved Hide resolved

chatgpt-codex-connector bot reviewed Oct 19, 2025

View reviewed changes

benchmarks/kernels/benchmark_moe.py Show resolved Hide resolved

massif-01 force-pushed the fix/benchmark_moe_compat branch from f3d5235 to 14f032c Compare October 19, 2025 20:55

massif-01 force-pushed the fix/benchmark_moe_compat branch 3 times, most recently from 36bbf5d to abd3d29 Compare October 23, 2025 18:57

massif-01 force-pushed the fix/benchmark_moe_compat branch from 54f9cce to 5be26ef Compare October 25, 2025 17:03

massif-01 force-pushed the fix/benchmark_moe_compat branch from 5be26ef to 9c7ccdc Compare October 26, 2025 12:48

massif-01 added 4 commits October 27, 2025 08:30

Trigger CI re-run

92903a1

Empty commit to force fresh CI execution and clear any cached issues. All pre-commit checks pass locally: - SPDX headers: ✓ - Python syntax: ✓ - Line lengths: ✓ - Signed-off-by: ✓ - All validation tools: ✓ Signed-off-by: Alfred <[email protected]>

massif-01 force-pushed the fix/benchmark_moe_compat branch from 9c7ccdc to 98d542a Compare October 27, 2025 00:35

massif-01 closed this Oct 27, 2025

massif-01 deleted the fix/benchmark_moe_compat branch October 27, 2025 01:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Enhance benchmark_moe.py: vLLM Version Compatibility Fixes #27175

Enhance benchmark_moe.py: vLLM Version Compatibility Fixes #27175

massif-01 commented Oct 19, 2025 •

edited by github-actions bot

Loading

Uh oh!

mergify bot commented Oct 19, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

massif-01 commented Oct 25, 2025

Uh oh!

massif-01 commented Oct 25, 2025

Uh oh!

massif-01 commented Oct 25, 2025

Uh oh!

massif-01 commented Oct 26, 2025

Uh oh!

mergify bot commented Oct 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	dtype_str = _get_config_dtype_str(
	dtype, use_int8_w8a16=use_int8_w8a16, use_fp8_w8a8=use_fp8_w8a8
	)

Uh oh!

Uh oh!

Enhance benchmark_moe.py: vLLM Version Compatibility Fixes #27175

Enhance benchmark_moe.py: vLLM Version Compatibility Fixes #27175

Conversation

massif-01 commented Oct 19, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mergify bot commented Oct 19, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

massif-01 commented Oct 25, 2025

Uh oh!

massif-01 commented Oct 25, 2025

Uh oh!

massif-01 commented Oct 25, 2025

Rebased to latest upstream/main

Uh oh!

massif-01 commented Oct 26, 2025

Uh oh!

mergify bot commented Oct 26, 2025

☑️ Nothing to do, the required conditions are not met

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

massif-01 commented Oct 19, 2025 •

edited by github-actions bot

Loading