OVMS error when serving Qwen 2.5 VL with GPU #3635

**Describe the bug**

OVMS logs an error during LLM node initialization and then exits.

```
[2025-09-05 10:48:02.793][60609][serving][error][servable_initializer.cpp:145] Error during llm node initialization for models_path: /model-repo/Qwen2.5-VL-7B-Instruct-int4-npu-ov/./ exception: Exception from src/inference/src/cpp/core.cpp:126:
Exception from src/inference/src/dev/plugin.cpp:58:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program_builder.cpp:163:
[GPU] ProgramBuilder build failed!
Program build failed(0_part_0):
```
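For triage, the nested exception chain in the excerpt above can be flattened into a list of source locations. The following is only a minimal sketch: the `exception_chain` helper and its regular expression are illustrative assumptions, not part of OVMS or OpenVINO, and they rely on the "Exception from" / "failed at" wording seen in this particular log.

```python
import re

# Log excerpt copied verbatim from this report.
LOG = """[2025-09-05 10:48:02.793][60609][serving][error][servable_initializer.cpp:145] Error during llm node initialization for models_path: /model-repo/Qwen2.5-VL-7B-Instruct-int4-npu-ov/./ exception: Exception from src/inference/src/cpp/core.cpp:126:
Exception from src/inference/src/dev/plugin.cpp:58:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program_builder.cpp:163:
[GPU] ProgramBuilder build failed!
Program build failed(0_part_0):
"""

# Matches "from <path>:<line>" or "at <path>:<line>" for C++ source paths.
# Assumption: this wording is taken from the excerpt above and may differ
# in other OpenVINO versions.
LOCATION = re.compile(
    r"\b(?:at|from) ((?:[\w.+-]+/)+[\w.+-]+\.(?:cpp|hpp|h)):(\d+)"
)


def exception_chain(log_text):
    """Return (source_path, line_number) pairs in outermost-to-innermost order."""
    return [(path, int(line)) for path, line in LOCATION.findall(log_text)]


if __name__ == "__main__":
    for depth, (path, line) in enumerate(exception_chain(LOG)):
        print(f"{'  ' * depth}{path}:{line}")
```

The last entry is the innermost frame. Here it points at the GPU plugin's `program_builder.cpp`, which is as far as the log goes: the text after `Program build failed(0_part_0):` is empty, so the actual build error is not visible.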

**To Reproduce**
Steps to reproduce the behavior:
1. Export Qwen 2.5 VL: `optimum-cli export openvino --weight-format int4 --sym --group-size -1 --model Qwen/Qwen2.5-VL-7B-Instruct Qwen2.5-VL-7B-Instruct-int4-npu-ov`
1. OVMS launch command: `ovms --rest_port 8080 --rest_bind_address 127.0.0.1 --source_model Qwen2.5-VL-7B-Instruct-int4-npu-ov --model_repository_path /model-repo --target_device GPU --log_level DEBUG --task text_generation`
1. See error

**Expected behavior**
OVMS should report the underlying cause of the failure. The current error is vague and does not say what went wrong. If the problem is an unsupported GPU, OVMS should report that.

**Logs**

[omvs-qwen-logs.txt](https://github.com/user-attachments/files/22168555/omvs-qwen-logs.txt)

**Configuration**
1. OVMS version: 2025.3.0.6e2e910de
2. OVMS config.json file: none
3. CPU, accelerator's versions if applicable: n/a
4. Model repository directory structure
```
$ tree model-repo
model-repo
├── Qwen2.5-VL-7B-Instruct-int4-npu-ov
│   ├── added_tokens.json
│   ├── chat_template.jinja
│   ├── config.json
│   ├── generation_config.json
│   ├── merges.txt
│   ├── openvino_config.json
│   ├── openvino_detokenizer.bin
│   ├── openvino_detokenizer.xml
│   ├── openvino_language_model.bin
│   ├── openvino_language_model.xml
│   ├── openvino_text_embeddings_model.bin
│   ├── openvino_text_embeddings_model.xml
│   ├── openvino_tokenizer.bin
│   ├── openvino_tokenizer.xml
│   ├── openvino_vision_embeddings_merger_model.bin
│   ├── openvino_vision_embeddings_merger_model.xml
│   ├── openvino_vision_embeddings_model.bin
│   ├── openvino_vision_embeddings_model.xml
│   ├── preprocessor_config.json
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   ├── tokenizer.json
│   └── vocab.json
└── README.md

2 directories, 25 files
```

5. Model or publicly available similar model that reproduces the issue: see step 1 under To Reproduce

**Additional context**

The same model and `ovms` command work on an Arc A580 GPU. Running them on an older iGPU produces the error reported here.

