Refactor: Move matrix packing outside GEMM kernels #10

shalinib-ibm · 2025-06-05T11:53:50Z

In class `tinyBLAS_PPC`, packing of the input matrices A and B was previously performed on the fly inside each GEMM microkernel. This patch decouples packing from the kernels by introducing a preprocessing step that packs both matrices once, before any kernel is invoked.

Benefits:
- Enables better memory locality and data reuse
- Simplifies the kernel logic by focusing purely on computation
- Improves overall GEMM performance, especially for large matrix sizes

Signed-off-by: Shalini Salomi Bodapati <[email protected]>
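To make the idea of packing once ahead of the kernels concrete, here is a minimal, portable C++ sketch. It is not the code from this patch: the names `pack_a`, `pack_b`, `micro_kernel`, and `gemm_prepacked`, the 4x4 tile size, the row-major input layout, and the zero padding of edge tiles are all assumptions chosen for illustration. The actual `tinyBLAS_PPC` kernels use their own packing layout and PowerPC-specific instructions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile sizes, chosen only for this illustration.
constexpr std::size_t MR = 4;
constexpr std::size_t NR = 4;

// Pack row-major A (m x k) into panels of MR rows. Inside a panel, the MR
// values that share one k index are contiguous. Missing rows stay zero.
std::vector<float> pack_a(const float* A, std::size_t m, std::size_t k) {
    const std::size_t panels = (m + MR - 1) / MR;
    std::vector<float> out(panels * MR * k, 0.0f);
    for (std::size_t p = 0; p < panels; ++p)
        for (std::size_t l = 0; l < k; ++l)
            for (std::size_t i = 0; i < MR; ++i) {
                const std::size_t row = p * MR + i;
                if (row < m) out[(p * k + l) * MR + i] = A[row * k + l];
            }
    return out;
}

// Pack row-major B (k x n) into panels of NR columns, same scheme as pack_a.
std::vector<float> pack_b(const float* B, std::size_t k, std::size_t n) {
    const std::size_t panels = (n + NR - 1) / NR;
    std::vector<float> out(panels * NR * k, 0.0f);
    for (std::size_t p = 0; p < panels; ++p)
        for (std::size_t l = 0; l < k; ++l)
            for (std::size_t j = 0; j < NR; ++j) {
                const std::size_t col = p * NR + j;
                if (col < n) out[(p * k + l) * NR + j] = B[l * n + col];
            }
    return out;
}

// The microkernel only computes: it reads two already-packed panels and
// accumulates one MR x NR tile. No packing logic lives here.
void micro_kernel(const float* pa, const float* pb, std::size_t k,
                  float acc[MR][NR]) {
    for (std::size_t i = 0; i < MR; ++i)
        for (std::size_t j = 0; j < NR; ++j) acc[i][j] = 0.0f;
    for (std::size_t l = 0; l < k; ++l)
        for (std::size_t i = 0; i < MR; ++i)
            for (std::size_t j = 0; j < NR; ++j)
                acc[i][j] += pa[l * MR + i] * pb[l * NR + j];
}

// Driver: pack A and B exactly once, then invoke the kernel per tile.
// Returns row-major C (m x n) = A * B.
std::vector<float> gemm_prepacked(const float* A, const float* B,
                                  std::size_t m, std::size_t n, std::size_t k) {
    const std::vector<float> pa = pack_a(A, m, k);
    const std::vector<float> pb = pack_b(B, k, n);
    std::vector<float> C(m * n, 0.0f);
    for (std::size_t ip = 0; ip * MR < m; ++ip)
        for (std::size_t jp = 0; jp * NR < n; ++jp) {
            float acc[MR][NR];
            micro_kernel(pa.data() + ip * k * MR, pb.data() + jp * k * NR, k, acc);
            // Copy back only the part of the tile that lies inside C.
            for (std::size_t i = 0; i < MR; ++i)
                for (std::size_t j = 0; j < NR; ++j) {
                    const std::size_t row = ip * MR + i, col = jp * NR + j;
                    if (row < m && col < n) C[row * n + col] = acc[i][j];
                }
        }
    return C;
}
```

In this sketch each panel of A is reused across every column panel of B without being repacked, which is the kind of data reuse the description refers to. With on-the-fly packing, the same rows of A would be repacked for every tile that touches them.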
