Describe the bug
On a RHEL 9.4 Intel Sapphire Rapids server running the container image vllm-cpu-release-repo:v0.12.0 (or vllm-cpu-release-repo:v0.11.2), starting the following GuideLLM workload causes vLLM to shut down during step 1 (sync):
pi-28# guidellm benchmark --target http://localhost:8000 \
  --processor "$PWD/Models/Llama-3.2-1B-Instruct" \
  --rate-type sweep \
  --data "prompt_tokens=32,output_tokens=16"
GuideLLM hangs on the first test (sync) and the vLLM server shuts down with a 500 Internal Server Error on POST /v1/chat/completions (full output under Errors below).
Expected behavior
I don't believe a GuideLLM workload should be able to shut down a vLLM server.
If the workload cannot be completed, it would be acceptable for GuideLLM to time out instead.
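A possible stopgap while the crash is investigated: the GuideLLM README documents a --max-seconds option that caps each benchmark stage, which should at least bound how long the client runs (assuming the option behaves this way in 0.4.0; not verified here):
guidellm benchmark --target http://localhost:8000 \
  --processor "$PWD/Models/Llama-3.2-1B-Instruct" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=32,output_tokens=16"
This only limits the client side; it would not prevent the server-side shutdown reported above.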
Environment
- OS: Red Hat Enterprise Linux 9.4 (Plow)
- Kernel: 5.14.0-427.13.1.el9_4.x86_64
- Python version: 3.11.7
- guidellm version: 0.4.0
- Podman version: 4.9.4-rhel
To Reproduce
Exact steps to reproduce the behavior:
console1# podman run --name vllm-cpu --rm --privileged=true --shm-size=4g -p 8000:8000 \
  -e VLLM_CPU_KVCACHE_SPACE=40 -v $PWD/Models:/model \
  public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.12.0 \
  --model "/model/Llama-3.2-1B-Instruct" --dtype=bfloat16
console2# guidellm benchmark --target http://localhost:8000 \
  --processor "$PWD/Models/Llama-3.2-1B-Instruct" \
  --rate-type sweep \
  --data "prompt_tokens=32,output_tokens=16"
Errors
console1
GuideLLM hangs on the first test (sync) and the vLLM server shuts down:
(APIServer pid=1) INFO: 10.88.0.1:39996 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: Shutting down
console2 just hangs in the GuideLLM output window, stuck on step 1 (sync).
Additional context
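The traceback that precedes the 500 in the server log is probably the interesting part. Because the container is started with --rm, its logs disappear once vLLM exits, so capturing them while the container is still running (plain podman, nothing vLLM-specific) preserves them for triage:
console1# podman logs -f vllm-cpu 2>&1 | tee vllm-cpu.log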