Conversation

@harsh-nod
Contributor

No description provided.

@harsh-nod requested a review from @adedespirlet on July 8, 2025, 17:15
@harsh-nod
Contributor Author

@adedespirlet - does this PR look correct to you?

@adedespirlet

Thanks for taking a look at it! Unfortunately, I'm still seeing the same code being emitted and the same incorrect results as described in #981.
The generated code still has multiple threads reading from the same location in the MMA result vector. I think fixing this requires changes to the codegen logic that propagates affine maps across ops with mismatched data layouts: the access pattern for the MMA result probably needs to be recomputed rather than inherited. I'm happy to dig into that after I finish the RMSNorm kernel.
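
To make the failure mode concrete, here is a minimal Python sketch. It is not the project's codegen. The 16x16 tile, the 64 threads with 4 elements each, and both index maps are hypothetical, chosen only to show how a map inherited from an op with a different layout can alias reads, while a map recomputed for the result layout does not.

```python
from collections import defaultdict


def find_aliased_reads(index_map, num_threads, elems_per_thread):
    """Return {location: [threads]} for every location read by more than one thread."""
    readers = defaultdict(set)
    for t in range(num_threads):
        for e in range(elems_per_thread):
            readers[index_map(t, e)].add(t)
    return {loc: sorted(ts) for loc, ts in readers.items() if len(ts) > 1}


# Hypothetical layout recomputed for the MMA result:
# each thread owns 4 consecutive rows of one column of a 16x16 tile.
def recomputed_map(t, e):
    return (4 * (t // 16) + e, t % 16)


# Hypothetical map inherited from an op with a different layout:
# the row offset that depends on t // 16 is lost, so threads collide.
def inherited_map(t, e):
    return (e, t % 16)


if __name__ == "__main__":
    print(len(find_aliased_reads(recomputed_map, 64, 4)))  # no shared locations
    aliased = find_aliased_reads(inherited_map, 64, 4)
    print(len(aliased), aliased[(0, 0)])
```

A check of this kind could serve as a quick sanity test on whatever index expressions the codegen produces for the MMA result, assuming they can be evaluated per thread.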

3 participants