[CK_TILE] Stream-K Gemm Example for fp8 and bf8 #3041
Need to consolidate new fp8 tests into smoke tests with existing test harness.
LGTM! Great job on refactoring the example and tests :)
This looks great! Approved pending CI!
Pull Request Overview
This PR extends Stream-K GEMM support to fp8 and bf8 data types, building on the existing fp16 and bf16 implementation. The changes introduce type configuration headers, test suite infrastructure, and example code updates to handle the new 8-bit floating-point formats with atomic reduction support.
- Adds fp8 and bf8 type definitions and layout configurations
- Implements smoke tests covering multiple layout combinations (RRR, RCR, CRR, CCR) for both data types
- Updates the basic example to support fp8 and bf8 with appropriate command-line interface changes
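The layout codes above can be illustrated with a small CPU reference GEMM. This is a minimal sketch, not code from the PR: it assumes the three letters name the storage order of A (M x K), B (K x N), and C (M x N) in that order, with `R` for row-major and `C` for column-major, and it uses `float` in place of fp8/bf8. The helpers `offset` and `reference_gemm` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of CK Tile): linear offset of element
// (row, col) in a densely packed rows x cols matrix stored in
// row-major ('R') or column-major ('C') order.
inline std::size_t offset(char layout, std::size_t row, std::size_t col,
                          std::size_t rows, std::size_t cols)
{
    assert(layout == 'R' || layout == 'C');
    return layout == 'R' ? row * cols + col : col * rows + row;
}

// Reference GEMM C = A * B on the CPU.
// Assumption: layouts[0], layouts[1], layouts[2] give the storage order
// of A (M x K), B (K x N) and C (M x N) respectively, e.g. "RCR".
inline std::vector<float> reference_gemm(const std::string& layouts,
                                         const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         std::size_t M, std::size_t N, std::size_t K)
{
    assert(layouts.size() == 3);
    std::vector<float> c(M * N, 0.0f);
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            float acc = 0.0f;
            for(std::size_t k = 0; k < K; ++k)
            {
                acc += a[offset(layouts[0], m, k, M, K)] *
                       b[offset(layouts[1], k, n, K, N)];
            }
            c[offset(layouts[2], m, n, M, N)] = acc;
        }
    }
    return c;
}
```

Under that assumption, calling `reference_gemm("RCR", a, b, M, N, K)` with B stored column-major yields the same logical product as `"RRR"` with B stored row-major; only the memory order of the inputs differs.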
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/ck_tile/gemm_streamk/test_gemm_streamk_types_fp8.hpp | Defines F8Layouts struct and macros for fp8 test configurations |
| test/ck_tile/gemm_streamk/test_gemm_streamk_types_bf8.hpp | Defines BF8Layouts struct and macros for bf8 test configurations |
| test/ck_tile/gemm_streamk/test_gemm_streamk_types.hpp | Adds F8 and BF8 type aliases and includes new type headers |
| test/ck_tile/gemm_streamk/smoke_tests/*.cpp | Eight new smoke test files covering fp8/bf8 with different layouts |
| test/ck_tile/gemm_streamk/CMakeLists.txt | Adds compile options for OCP FP8 support and registers new test files |
| example/ck_tile/40_streamk_gemm/streamk_gemm_basic.cpp | Adds fp8 and bf8 case handling in the example runner |
| example/ck_tile/40_streamk_gemm/gemm_utils.hpp | Adds DataTypeTraits specializations and updates documentation |
| example/ck_tile/40_streamk_gemm/README.md | Updates documentation to reflect fp8/bf8 support |
| example/ck_tile/40_streamk_gemm/CMakeLists.txt | Adds OCP FP8 compile options for the example |
| example/ck_tile/CMakeLists.txt | Adds trailing newline |
Refactored the fp8/bf8 unit tests to use the existing test harness. Implemented smoke tests for both data types with the layouts CCR, CRR, RCR, and RRR. The tests use a 128x128x32 tile configuration, because other configurations revealed implementation gaps that are currently being documented.
lgtm!
Proposed changes
This PR expands the data types supported by Stream-K GEMM by adding examples for fp8 and bf8. Examples for fp16 and bf16 were added previously in an earlier PR. Currently, these examples support only atomic reduction. Unit tests for fp8 and bf8 have been added through a test suite.
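To make "atomic reduction" concrete, here is a simplified CPU sketch of the general Stream-K idea; it is not the CK Tile kernel. The flattened iteration space (output element times K step) is split into contiguous, near-equal ranges, one per worker, so an output element may be shared between workers, and each worker adds its partial sum to C with an atomic add. Assumptions made for illustration: tiles are 1x1x1, matrices are row-major `float`, the workers run one after another rather than concurrently, and the names `atomic_add`, `streamk_worker`, and `streamk_gemm` are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Atomic float add via a compare-exchange loop (portable before C++20).
inline void atomic_add(std::atomic<float>& target, float value)
{
    float expected = target.load();
    while(!target.compare_exchange_weak(expected, expected + value)) {}
}

// One worker handles iterations [begin, end) of the flattened space,
// where iteration i maps to output element i / K and K step i % K.
inline void streamk_worker(std::size_t begin, std::size_t end,
                           const std::vector<float>& a, const std::vector<float>& b,
                           std::vector<std::atomic<float>>& c,
                           std::size_t N, std::size_t K)
{
    std::size_t it = begin;
    while(it < end)
    {
        const std::size_t tile    = it / K; // index of the output element
        const std::size_t k_begin = it % K;
        const std::size_t k_end   = std::min(K, k_begin + (end - it));
        const std::size_t m = tile / N, n = tile % N;
        float partial = 0.0f;
        for(std::size_t k = k_begin; k < k_end; ++k)
            partial += a[m * K + k] * b[k * N + n];
        // Another worker may own the rest of this K range, so accumulate atomically.
        atomic_add(c[tile], partial);
        it += k_end - k_begin;
    }
}

// Splits M*N*K iterations evenly across `workers` (must be > 0) and reduces into C.
inline std::vector<float> streamk_gemm(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::size_t M, std::size_t N, std::size_t K,
                                       std::size_t workers)
{
    std::vector<std::atomic<float>> c(M * N);
    for(auto& x : c) x.store(0.0f); // C must start at zero for accumulation
    const std::size_t total = M * N * K;
    const std::size_t per   = (total + workers - 1) / workers;
    for(std::size_t w = 0; w < workers; ++w)
    {
        const std::size_t begin = std::min(total, w * per);
        const std::size_t end   = std::min(total, begin + per);
        streamk_worker(begin, end, a, b, c, N, K); // sequential stand-in for a GPU workgroup
    }
    std::vector<float> out(M * N);
    for(std::size_t i = 0; i < out.size(); ++i) out[i] = c[i].load();
    return out;
}
```

With three workers on a 2x2x2 problem, the second output element is computed half by the first worker and half by the second, which is the case where the atomic add matters. On a GPU the workers would run concurrently, so the order of the floating-point additions, and therefore the rounding, can vary between runs.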
Checklist
Please put an `x` into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
- `clang-format` on all changed files