[asm] Add pipelined double-buffering support with SGPR rotation #876
Merged
harsh-nod merged 1 commit into iree-org:main on Feb 17, 2026
Conversation
Implement memref iter_arg handling for pipelined GEMM with g2s in the C++ WaveASM backend. When scf.for carries memref iter_args for double-buffering, the LDS base offsets are now materialized as SGPRs and rotated at the loop tail using s_mov_b32 swap sequences.

Key changes:
- RegionBuilder: detect LDS memref iter_args, resolve to SGPR offsets, propagate through block args, handle cross-swap at yield
- TranslateFromMLIR: use V_ADD_U32 directly with SGPR offsets in vector.load/store (V_MOV_B32 rejects SGPR sources)
- AMDGPUHandlers: handle dynamic SGPR-carried LDS base offsets in gather_to_lds m0 computation, prefer SALU when both operands are SGPRs
- LinearScanPass: fix block arg type propagation to use allocation mapping directly instead of condition iter_arg types (broken for cross-swap patterns)
- AssemblyEmitter: emit SGPR rotation copies at loop tail, detecting independent swap pairs and using 3-instruction swap with temporary

Signed-off-by: Harsh Menon <harsh.menon@amd.com>
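The loop-tail rotation described for AssemblyEmitter can be illustrated with a small standalone sketch. This is not the code from this PR: `SgprCopy`, `emitRotation`, and the `tmp` parameter are hypothetical names chosen for illustration, and the sketch assumes the caller supplies a free temporary SGPR. It treats the yield as a set of parallel copies, lowers each independent swap pair to three `s_mov_b32` instructions through the temporary, and emits every other copy as a single move. It does not reorder dependent copy chains or handle cycles longer than two registers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation of one parallel copy "s<dst> <- s<src>".
struct SgprCopy {
  unsigned dst;
  unsigned src;
};

// Lower parallel SGPR copies to s_mov_b32 text (illustrative only).
// Independent swap pairs (a <- b together with b <- a) become a
// three-instruction swap through the temporary register `tmp`.
std::vector<std::string> emitRotation(const std::vector<SgprCopy>& copies,
                                      unsigned tmp) {
  // Assumption: the temporary must not appear in any copy.
  for (const SgprCopy& c : copies)
    assert(c.dst != tmp && c.src != tmp && "temporary SGPR must be free");

  auto mov = [](unsigned d, unsigned s) {
    return "s_mov_b32 s" + std::to_string(d) + ", s" + std::to_string(s);
  };

  std::vector<std::string> out;
  std::vector<bool> done(copies.size(), false);
  for (std::size_t i = 0; i < copies.size(); ++i) {
    if (done[i]) continue;
    done[i] = true;
    const SgprCopy& a = copies[i];
    if (a.dst == a.src) continue;  // self-copy needs no instruction

    // Look for the reverse copy that makes this a swap pair.
    bool swapped = false;
    for (std::size_t j = i + 1; j < copies.size(); ++j) {
      if (!done[j] && copies[j].dst == a.src && copies[j].src == a.dst) {
        done[j] = true;
        out.push_back(mov(tmp, a.dst));    // save old value of a.dst
        out.push_back(mov(a.dst, a.src));  // a.dst <- a.src
        out.push_back(mov(a.src, tmp));    // a.src <- old a.dst
        swapped = true;
        break;
      }
    }
    if (!swapped) out.push_back(mov(a.dst, a.src));  // plain copy
  }
  return out;
}
```

For example, swapping two buffer offsets held in `s4` and `s5` with `s6` as the temporary would be requested as the copies `{4, 5}` and `{5, 4}`, and the sketch emits one save, one overwrite, and one restore from the temporary.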
panditsa
approved these changes
Feb 16, 2026