Closed
Labels: enhancement (New feature or request)
Description
Thanks a lot for making this package, I am keen to switch to this for a production app.
Many of the language model providers that AnyLanguageModel already supports have strong multimodal capabilities:
MLX: supports vision-language models like Qwen2-VL and other multimodal models
OpenAI: GPT-4o, GPT-4o-mini, and GPT-4-turbo all support vision
Currently, users cannot leverage these vision capabilities through AnyLanguageModel, which limits the library's usefulness for multimodal applications.
Use Cases:
- Image captioning and description
- OCR and document understanding
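To make the request a bit more concrete, here is a rough sketch of what a multimodal prompt could look like from the caller's side. Every name in it (`PromptSegment`, `MultimodalResponding`, `respond(to:)`) is made up for illustration and is not part of AnyLanguageModel's existing API; the only point is that text and image content would need to travel together in a single request.

```swift
import Foundation

// Hypothetical sketch only: these types are illustrative, not the library's API.

/// A prompt piece that is either text or raw image data.
enum PromptSegment {
    case text(String)
    case image(Data, mimeType: String)
}

/// Stand-in for a vision-capable session (e.g. GPT-4o, or Qwen2-VL via MLX).
protocol MultimodalResponding {
    func respond(to segments: [PromptSegment]) async throws -> String
}

/// Example use case from the list above: image captioning.
func caption(imageAt url: URL, using session: any MultimodalResponding) async throws -> String {
    let imageData = try Data(contentsOf: url)
    return try await session.respond(to: [
        .text("Describe the contents of this image in one sentence."),
        .image(imageData, mimeType: "image/jpeg")
    ])
}
```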