[BugFix] Eagerly abort cancelled final-step requests #29987
Currently, when a request is cancelled while its final step is executing, completion of the request is subsequently handled via normal stop processing (e.g. a length limit or stop token), so the abort effectively has no effect.
This is usually harmless, since the final output would be discarded in this case anyway. When a KV connector is involved, however, the connector believes the request completed successfully rather than being aborted.
This has proven problematic for disaggregated prefill, which frees the KV cache blocks if a request was aborted but not if it believes the request completed successfully. Since the top-level request was cancelled, it is never sent to the decode side, so its KV cache blocks remain pinned unnecessarily until the fallback timeout expires.
The problem is exacerbated when a large number of requests are cancelled at once and/or when large prefills make the forward pass slow, since the window for this race is wider.
This PR fixes the problem by explicitly processing any pending aborts immediately before processing the model output on each step. Only the aborts are processed, not new requests, since for latency reasons it is still preferable to process model outputs before new incoming requests.
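The ordering change can be sketched as follows. This is a minimal illustrative model of the scheduler step, not vLLM's actual implementation; the `Scheduler`, `_process_aborts`, and `step` names and the request-dict shape are all hypothetical, chosen only to show aborts being drained before the model output is interpreted.

```python
class Scheduler:
    """Toy scheduler illustrating eager abort processing (hypothetical API)."""

    def __init__(self):
        self.running = {}          # req_id -> mutable request state
        self.pending_aborts = []   # req_ids cancelled by clients

    def abort(self, req_id):
        # Called asynchronously when a client cancels a request; the
        # abort is queued and drained at the start of the next step.
        self.pending_aborts.append(req_id)

    def _process_aborts(self):
        # Eagerly drop cancelled requests *before* interpreting the model
        # output, so a request cancelled during its final step is recorded
        # as aborted (letting a KV connector free its blocks) rather than
        # falling through to normal stop processing as "completed".
        for req_id in self.pending_aborts:
            req = self.running.pop(req_id, None)
            if req is not None:
                req["status"] = "aborted"
        self.pending_aborts.clear()

    def step(self, model_output):
        # Fixed ordering: pending aborts first, then model output.
        # New incoming requests would still be admitted only afterwards,
        # keeping output-processing latency low.
        self._process_aborts()
        finished = []
        for req_id, token in model_output.items():
            req = self.running.get(req_id)
            if req is None:
                continue  # aborted above; its final output is discarded
            req["tokens"].append(token)
            if token == "<eos>":  # stand-in for normal stop processing
                req["status"] = "completed"
                finished.append(self.running.pop(req_id))
        return finished
```

With this ordering, a request cancelled while its final forward pass is in flight ends the step with status `"aborted"` instead of `"completed"`, which is the observable difference a KV connector relies on.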
Fixes #26400.