Skip to content

GuideLLM v0.4.0 'sweep' can shutdown vLLM-cpu Mode server #499

@jharriga

Description

@jharriga

Describe the bug
On a RHEL 9.4 Intel Sapphire Rapids server with container image 'vllm-cpu-release-repo:v0.12.0' -or- 'vllm-cpu-release-repo:v0.11.2' running, starting the following GuideLLM Workload will cause vllm to shutdown during step 1 (sync):

pi-28# guidellm benchmark --target http://localhost:8000
--processor "$PWD/Models/Llama-3.2-1B-Instruct"
--rate-type sweep
--data "prompt_tokens=32,output_tokens=16"
GuideLLM hangs on first test (sync) & causes vLLM server to shutdown
(APIServer pid=1) INFO: 10.88.0.1:39996 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: Shutting down

Expected behavior
I don't believe GuideLLM workload should be able to shutdown a vllm server.
It would be acceptable, if the workload cannot be completed, that Guidellm timeout

Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 20.04]: Red Hat Enterprise Linux 9.4 (Plow)
  2. kernel: 5.14.0-427.13.1.el9_4.x86_64
  3. Python version [e.g. 3.12.2]: Python 3.11.7
  4. guidellm version: 0.4.0
  5. podman version 4.9.4-rhel

To Reproduce
Exact steps to reproduce the behavior:
console1# podman run --name vllm-cpu --rm --privileged=true --shm-size=4g -p 8000:8000 -e VLLM_CPU_KVCACHE_SPACE=40 -v $PWD/Models:/model public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.12.0 --model "/model/Llama-3.2-1B-Instruct" --dtype=bfloat16

console2# guidellm benchmark --target http://localhost:8000
--processor "$PWD/Models/Llama-3.2-1B-Instruct"
--rate-type sweep
--data "prompt_tokens=32,output_tokens=16"

Errors
console1
GuideLLM hangs on first test (sync) & causes vLLM server to shutdown
(APIServer pid=1) INFO: 10.88.0.1:39996 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: Shutting down

console2 just hangs in GuideLLM output window, stuck on step 1 (sync)

Additional context
Add any other context about the problem here. Also include any relevant files.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions