Add device type meta support for CutlassNvfp4GroupedMmaOp::evaluate
#5695
base: main
Review updated until commit 6b57c2d

Description

| Relevant files |
|---|
| Enhancement |
| Tests |

PR Reviewer Guide
Here are some key observations to aid the review process:
- 🧪 PR contains tests
- ⚡ Recommended focus areas for review

Meta device fast path implementation
getRFactorDeviceDimensionIndex for handling rfactor dimensions is appropriate.

Test failures
- (Medium, 1) CUDA out-of-memory in nvFuser TmaPointwiseTest on H100

| Test Name | H100 | Source |
|---|---|---|
| TmaPointwiseTestF.SplitGridDim2D | ❌ | Link |

!test
No description provided.