[CI/Build][AMD] Use ROCM_ATTN instead of FLASH_ATTN test for test_register_kv_caches for ROCm and update test for TRITON_ATTN #29985
Conversation
Signed-off-by: Randall Smith <[email protected]>
Code Review
This pull request updates the test_register_kv_caches test to correctly handle differences in KV cache shapes between FLASH_ATTN and TRITON_ATTN backends. The changes adjust expected values for tensor sizes, base addresses, and block lengths based on the attention backend being tested. The logic seems correct and addresses the test failure described. I have one suggestion to make the test more robust and readable.
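For instance, one way to keep these per-backend expectations robust (a sketch only, assuming the classic four-argument get_kv_cache_shape signature) is to derive them from the backend class rather than hard-coding constants:

```python
# Sketch: derive the expected layout from the backend rather than hard-coding
# it. Assumes get_kv_cache_shape(num_blocks, block_size, num_kv_heads,
# head_size); adjust if the backend takes extra arguments.
def expected_layout(backend_cls, num_blocks, block_size, num_kv_heads, head_size):
    shape = backend_cls.get_kv_cache_shape(
        num_blocks, block_size, num_kv_heads, head_size
    )
    # Blocks-first layouts start with num_blocks (e.g. TRITON_ATTN);
    # split-first layouts start with 2 (one K plane and one V plane).
    blocks_first = shape[0] == num_blocks
    return shape, blocks_first
```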
Signed-off-by: Randall Smith <[email protected]>
| "FLASH_ATTN", | ||
| marks=pytest.mark.skipif( | ||
| current_platform.is_rocm(), | ||
| reason="Attention backend FLASH_ATTN is not supported on ROCm", |
Can you check whether ROCM_AITER_FA or ROCM_ATTN is a suitable replacement for FLASH_ATTN?
FLASH_ATTN, ROCM_AITER_FA, and ROCM_ATTN have the same KV cache layout.
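A minimal sketch of that parametrization (the skip reasons and marks are illustrative, not the actual test code):

```python
import pytest
from vllm.platforms import current_platform

BACKENDS = [
    pytest.param(
        "FLASH_ATTN",
        marks=pytest.mark.skipif(
            current_platform.is_rocm(),
            reason="FLASH_ATTN is not supported on ROCm",
        ),
    ),
    pytest.param(
        "ROCM_ATTN",  # same KV cache layout as FLASH_ATTN
        marks=pytest.mark.skipif(
            not current_platform.is_rocm(),
            reason="ROCM_ATTN is only available on ROCm",
        ),
    ),
    "TRITON_ATTN",  # runs on both platforms
]

@pytest.mark.parametrize("attn_backend", BACKENDS)
def test_register_kv_caches(attn_backend):
    ...
```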
@tjtanaa So, I was able to do this and it worked! However, it turns out that get_attn_backend was using _cached_get_attn_backend, which was returning the backend class from the previous run, so FLASH_ATTN was being retested during the TRITON_ATTN test run. I mocked get_attn_backend, and now all tests pass and test against the correct backend.
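Roughly like this (the patch target is an assumption about where the backend is resolved; wanted_backend_cls and run_test_body are placeholders):

```python
from unittest.mock import patch

# The patch target is illustrative: it must point at the module that actually
# calls get_attn_backend, so the cached lookup is bypassed entirely.
with patch("vllm.attention.get_attn_backend", return_value=wanted_backend_cls):
    run_test_body()  # stands in for the body of test_register_kv_caches
```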
This PR skips FLASH_ATTN for test_register_kv_caches since it is not supported on ROCm. It also updates test_register_kv_caches for TRITON_ATTN, which was failing with the following error:
This is because FLASH_ATTN and TRITON_ATTN use different shapes for the KV cache, according to
get_kv_cache_shape. In particular, TritonAttentionBackend.get_kv_cache_shape returns something of the form [num_blocks, 2, H, N, D], which causes TpKVTopology to set self._is_kv_layout_blocks_first to True for TRITON_ATTN, but False for FLASH_ATTN. I adjusted the expected outputs in the test to reflect these differences.
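To make the layout difference concrete, an illustrative comparison (the FLASH_ATTN shape shown is an assumption based on the split-first layout described above):

```python
import torch

num_blocks, H, N, D = 4, 8, 16, 64  # H = KV heads, N = block size, D = head dim

# Blocks-first (TRITON_ATTN, per get_kv_cache_shape above): K and V for each
# block are interleaved, so one contiguous region covers both.
triton_kv = torch.empty(num_blocks, 2, H, N, D)
triton_block_len = 2 * H * N * D * triton_kv.element_size()
triton_base = triton_kv.data_ptr()  # a single base address

# Split-first (assumed FLASH_ATTN layout): separate K and V planes, so two
# base addresses and half the per-block length.
flash_kv = torch.empty(2, num_blocks, N, H, D)
flash_block_len = N * H * D * flash_kv.element_size()
flash_bases = (flash_kv[0].data_ptr(), flash_kv[1].data_ptr())
```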
=================
I found a second problem when using ROCM_ATTN instead of FLASH_ATTN on ROCm. When the test ran with a second backend, the _cached_get_attn_backend function returned the backend class from the previous test run (FLASH_ATTN on upstream CI, since it runs before the TRITON_ATTN test), so the test was simply re-testing the previous backend.
So, I mocked the get_attn_backend function to return the backend that we want to test.
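An alternative, assuming _cached_get_attn_backend is an lru_cache-wrapped function importable from vllm.attention.selector, would be clearing the cache between parametrized runs:

```python
import pytest
from vllm.attention.selector import _cached_get_attn_backend  # assumed import path

@pytest.fixture(autouse=True)
def _reset_attn_backend_cache():
    # lru_cache-wrapped functions expose cache_clear(); this assumes
    # _cached_get_attn_backend is such a wrapper.
    _cached_get_attn_backend.cache_clear()
    yield
```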
=================
All tests pass now.