Fix MiniCPM-o-2_6 #2847
Pull Request Overview
This pull request fixes a bug in the WWB implementation where `_OVMiniCPMVForCausalLM.preprocess_inputs()` failed to apply chat templates for MiniCPM-o-2_6 and MiniCPM-V-2_6 models. The fix changes the image tag format from `(<image>./</image>)\n` to `<image>./</image>\n` across documentation and implementation files.
Key Changes
- Updated image tag format for MiniCPM models from `(<image>./</image>)\n` to `<image>./</image>\n`
- Added support for new MiniCPM-o-2_6 model variant
- Fixed related algorithm implementations in the MiniCPM vision encoder
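The tag change above can be illustrated with a minimal Python sketch. The helper names `build_prompt` and `migrate_prompt` are hypothetical and not part of this PR or of any library. Placing one tag per image in front of the question is an assumption made for illustration. Only the two tag strings come from the PR itself.

```python
# Hypothetical helpers illustrating the tag format change; not library API.
OLD_TAG = "(<image>./</image>)\n"  # format used before this PR
NEW_TAG = "<image>./</image>\n"    # format used after this PR


def build_prompt(question: str, num_images: int = 1) -> str:
    """Prefix the question with one new-style image tag per image.

    Assumption: tags are placed before the question text.
    """
    return NEW_TAG * num_images + question


def migrate_prompt(prompt: str) -> str:
    """Rewrite old parenthesized image tags to the new format."""
    return prompt.replace(OLD_TAG, NEW_TAG)


if __name__ == "__main__":
    print(repr(build_prompt("What is in the picture?")))
    print(repr(migrate_prompt("(<image>./</image>)\nWhat is in the picture?")))
```

Prompts written for the old format could be passed through a helper like `migrate_prompt` before use. A prompt that already uses the new format is returned unchanged.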
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
tests/python_tests/test_vlm_pipeline.py | Updated test to use correct image tag format without parentheses |
src/python/py_vlm_pipeline.cpp | Updated documentation strings to show correct image tag format for both MiniCPM variants |
src/python/openvino_genai/py_openvino_genai.pyi | Updated type stub documentation to reflect correct image tag format |
src/cpp/src/visual_language/minicpm/classes.hpp | Updated resample method signature to use single target size with padding parameter |
src/cpp/src/visual_language/minicpm/classes.cpp | Refactored implementation with improved algorithms and removed parentheses from image tag |
src/cpp/src/visual_language/inputs_embedder.hpp | Updated comment to reflect correct image tag format |
src/cpp/src/debug_utils.hpp | Added support for boolean numpy array type |
src/cpp/include/openvino/genai/visual_language/pipeline.hpp | Updated API documentation with correct image tag format |
site/docs/use-cases/image-processing/_sections/_usage_options/index.mdx | Updated documentation to separate MiniCPM-o-2_6 and MiniCPM-V-2_6 with correct tags |
site/docs/supported-models/index.mdx | Added notes about MiniCPM-o compatibility issues |
site/docs/supported-models/_components/vlm-models-table/models.ts | Added MiniCPM-o-2_6 model to supported models table |
samples/cpp/visual_language_chat/README.md | Removed specific model reference to make documentation more generic |
939001f
Description
WWB uses `_OVMiniCPMVForCausalLM.preprocess_inputs()` for `optimum-intel` and `--hf`. That function fails to apply chat_template for `MiniCPM-o-2_6` and `MiniCPM-V-2_6`. A fix is required for `optimum-intel`. It will take time to validate both of the models with that fix. This PR addresses other identified issues. The original implementation applies chat_template: https://huggingface.co/openbmb/MiniCPM-o-2_6/blob/main/modeling_minicpmo.py#L959 but it's not used in WWB.
WWB results for `--weights-format fp16` and unpublished fix for `optimum-intel` chat_template:
Docs: https://wovchena.github.io/openvino.genai-public/docs/supported-models/#visual-language-models-vlms
That should be enough for this release, but follow-up PRs are needed. The ideal solution includes:
`sin()` and `cos()` with `numpy` with hardcoding commit `sin()` and `cos()` values
CVS-170987
Checklist: