Open
Labels
backend::sglang (Relates to the sglang backend), enhancement (New feature or request), multimodal
Feature request
Description
Add --mm-prompt-template argument to SGLang to achieve feature parity with vLLM
Describe the problem you're encountering
Problem
Currently, the vLLM backend (components/src/dynamo/vllm/args.py:165) supports --mm-prompt-template for customizing multimodal prompts, but the SGLang backend does not. This means:
- SGLang users cannot customize how images and text are combined in prompts
- Users migrating from vLLM to SGLang lose this functionality
- Different vision models may require different prompt formats
Proposed Solution
Add --mm-prompt-template argument to SGLang that:
- Accepts a template string with <prompt> and <image> placeholders
- Replaces <prompt> with user text
- Replaces <image> with the model's image token
- Has a sensible default: "USER: <image>\n<prompt> ASSISTANT:"
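The substitution described above can be sketched roughly as follows. This is only an illustration of the proposed behavior, not the existing vLLM code or a final SGLang implementation: the names `render_mm_prompt` and `DEFAULT_MM_PROMPT_TEMPLATE` and the `<IMG>` token are hypothetical, and the real image token depends on the model.

```python
import re

# Default proposed in this issue.
DEFAULT_MM_PROMPT_TEMPLATE = "USER: <image>\n<prompt> ASSISTANT:"

# Matches the two placeholders the template is allowed to contain.
_PLACEHOLDER = re.compile(r"<(prompt|image)>")


def render_mm_prompt(
    user_text: str,
    image_token: str,
    template: str = DEFAULT_MM_PROMPT_TEMPLATE,
) -> str:
    """Fill <prompt> with the user text and <image> with the image token.

    Hypothetical helper. A single regex pass is used so that substituted
    values are never rescanned for placeholders.
    """
    values = {"prompt": user_text, "image": image_token}
    return _PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


# "<IMG>" stands in for whatever image token the model actually uses.
print(render_mm_prompt("What is in this picture?", "<IMG>"))
```

Doing the replacement in one pass means a user message that happens to contain the literal text `<image>` is left untouched, which chained `str.replace` calls would not guarantee.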
Example Usage
python -m dynamo.sglang --model Qwen/Qwen2.5-VL-7B-Instruct \
  --multimodal-processor \
  --mm-prompt-template "Analyze <image>\n\nUser: <prompt>\n\nAssistant:"
Implementation
This requires changes to:
- components/src/dynamo/sglang/args.py - Add argument definition
- components/src/dynamo/sglang/main.py - Pass to handler
- components/src/dynamo/sglang/request_handlers/multimodal/processor_handler.py - Accept template
- components/src/dynamo/sglang/multimodal_utils/multimodal_chat_processor.py - Use template
- components/src/dynamo/sglang/tests/test_sglang_unit.py - Add tests
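As a rough idea of the first step, the argument definition might look like the sketch below. It assumes a plain `argparse` parser; the actual structure of `args.py` in the SGLang component may differ, and `build_parser`, the help strings, and the `store_true` treatment of `--multimodal-processor` are assumptions made for illustration.

```python
import argparse

# Default proposed in this issue.
DEFAULT_MM_PROMPT_TEMPLATE = "USER: <image>\n<prompt> ASSISTANT:"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the flags relevant to this issue."""
    parser = argparse.ArgumentParser(prog="dynamo.sglang")
    parser.add_argument(
        "--multimodal-processor",
        action="store_true",
        help="Run as the multimodal processor component.",
    )
    parser.add_argument(
        "--mm-prompt-template",
        type=str,
        default=DEFAULT_MM_PROMPT_TEMPLATE,
        help="Prompt template containing <prompt> and <image> placeholders.",
    )
    return parser


args = build_parser().parse_args(
    ["--multimodal-processor", "--mm-prompt-template", "Analyze <image>\n\nUser: <prompt>"]
)
# argparse exposes --mm-prompt-template as args.mm_prompt_template.
print(args.mm_prompt_template)
```

The parsed value would then be passed from `main.py` to the processor handler and on to the chat processor, following the file list above.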
References
- vLLM implementation: components/src/dynamo/vllm/args.py:165-176
- TensorRT-LLM implementation: components/src/dynamo/trtllm/utils/trtllm_utils.py:278
I Can Work On This
I'm ready to implement this feature and submit a PR if approved.
Describe alternatives you've tried
No response