Skip to content

Conversation

@bnellnm
Copy link
Collaborator

@bnellnm bnellnm commented Nov 20, 2025

Purpose

After making select_experts a non-static method (#29067), we can avoid passing most of the arguments to FusedMoEMethodBase.apply and get them from the layer directly.

This doesn't really decrease flexibility, the MoE quantization methods were already fairly tightly coupled with FusedMoE. A further step could be making FusedMoE an abstract base class that simply provides the quant_method parameters. Then the current FusedMoE class would become a subclass of that.

Test Plan

CI

Test Result

distributed-tests-2-gpus - looks like a spurious failure

cc @varun-sundar-rabindranath


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify
Copy link

mergify bot commented Nov 20, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 20, 2025
@bnellnm bnellnm changed the title Simplify apply Remove most arguments to FusedMoEMethodBase.apply Nov 20, 2025
@bnellnm bnellnm changed the title Remove most arguments to FusedMoEMethodBase.apply Remove most arguments to FusedMoEMethodBase.apply Nov 20, 2025
@bnellnm bnellnm marked this pull request as ready for review November 20, 2025 16:22
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@LucasWilkinson
Copy link
Collaborator

Looks like we need to land #29067 first? marking draft until that lands 👍

@LucasWilkinson LucasWilkinson marked this pull request as draft November 21, 2025 19:42
@mergify mergify bot added the nvidia label Nov 21, 2025
@mergify mergify bot removed the needs-rebase label Nov 21, 2025
@bnellnm
Copy link
Collaborator Author

bnellnm commented Nov 21, 2025

  • draft

Yeah, sorry this should have been a draft to start with. It needs the other PR to land first.

@bnellnm bnellnm changed the title Remove most arguments to FusedMoEMethodBase.apply [MoE][Refactor] Remove most arguments to FusedMoEMethodBase.apply Nov 21, 2025
@bnellnm bnellnm marked this pull request as ready for review November 24, 2025 19:38
@mergify
Copy link

mergify bot commented Dec 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 1, 2025
Signed-off-by: Bill Nell <[email protected]>
Signed-off-by: Bill Nell <[email protected]>
Signed-off-by: Bill Nell <[email protected]>
Signed-off-by: Bill Nell <[email protected]>
Signed-off-by: Bill Nell <[email protected]>
Copy link
Collaborator

@LucasWilkinson LucasWilkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks for doing this! this is amazing!

@github-project-automation github-project-automation bot moved this to In review in NVIDIA Dec 3, 2025
@LucasWilkinson LucasWilkinson added ready ONLY add when PR is ready to merge/full CI is needed ready-run-all-tests Trigger CI with all tests for wide-ranging PRs labels Dec 3, 2025
@bnellnm
Copy link
Collaborator Author

bnellnm commented Dec 3, 2025

LGTM; thanks for doing this! this is amazing!

Thanks for the review!

Signed-off-by: Bill Nell <[email protected]>
@LucasWilkinson LucasWilkinson enabled auto-merge (squash) December 4, 2025 03:22
Signed-off-by: Tyler Michael Smith <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

nvidia ready ONLY add when PR is ready to merge/full CI is needed ready-run-all-tests Trigger CI with all tests for wide-ranging PRs

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

4 participants