Commit 6e4cea1

decrement server_load on listen for disconnect (vllm-project#18784)
Signed-off-by: Daniel Salib <[email protected]>
1 parent 435fa95 commit 6e4cea1

1 file changed (+5, -0)
vllm/entrypoints/utils.py

Lines changed: 5 additions & 0 deletions
@@ -26,6 +26,11 @@ async def listen_for_disconnect(request: Request) -> None:
     while True:
         message = await request.receive()
         if message["type"] == "http.disconnect":
+            if request.app.state.enable_server_load_tracking:
+                # on timeout/cancellation the BackgroundTask in load_aware_call
+                # cannot decrement the server load metrics.
+                # Must be decremented by with_cancellation instead.
+                request.app.state.server_load_metrics -= 1
             break
 
 
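
The effect of the added lines can be illustrated with a minimal, self-contained sketch. It is not vLLM's code: `FakeRequest` and `run_demo` are hypothetical stand-ins invented for illustration, and the function body mirrors only what the hunk above shows. It assumes one request is in flight when the client disconnects.

```python
import asyncio
from types import SimpleNamespace


class FakeRequest:
    """Hypothetical stand-in for the real Request object (illustration only)."""

    def __init__(self, app, messages):
        self.app = app
        self._messages = list(messages)

    async def receive(self):
        # Hand back the queued ASGI-style messages one at a time.
        return self._messages.pop(0)


async def listen_for_disconnect(request) -> None:
    # Mirrors the patched loop shown in the diff.
    while True:
        message = await request.receive()
        if message["type"] == "http.disconnect":
            if request.app.state.enable_server_load_tracking:
                # The disconnect path decrements the counter itself.
                request.app.state.server_load_metrics -= 1
            break


def run_demo(tracking_enabled: bool) -> int:
    """Hypothetical helper: simulate a disconnect and return the final counter."""
    state = SimpleNamespace(
        enable_server_load_tracking=tracking_enabled,
        server_load_metrics=1,  # assumption: one request is in flight
    )
    request = FakeRequest(
        SimpleNamespace(state=state),
        [{"type": "http.request"}, {"type": "http.disconnect"}],
    )
    asyncio.run(listen_for_disconnect(request))
    return state.server_load_metrics
```

With tracking enabled the counter drops back to zero on disconnect; with tracking disabled it is left untouched.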
