-
Notifications
You must be signed in to change notification settings - Fork 130
feat: provide unified thinking config among anthropic and gemini models #1461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3968526 to
0ebf149
Compare
0ebf149 to
cf3362d
Compare
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (50.72%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #1461 +/- ##
==========================================
- Coverage 84.22% 84.03% -0.20%
==========================================
Files 141 141
Lines 12978 13039 +61
==========================================
+ Hits 10931 10957 +26
- Misses 1428 1460 +32
- Partials 619 622 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/retest |
|
so if i understand correctly, this goes completely beyond the original @sukumargaonkar 's proposal https://github.com/envoyproxy/ai-gateway/blob/main/docs/proposals/004-vendor-specific-fields/proposal.md This is trying to define "AI Gateway Schema", and the scope and impact is huge. Could you write up a design doc and have more comments from community here? At the very least, I would request for very detailed documents and future planning on this direction as well as this kind of thing requires user facing documentation in the website, not just code. sg? |
|
Please note that breaking change regarding this is extremely impactful than the control plane API. Once this is exposed, you cannot assume that clients will migrate properly, so we have to be very careful |
|
@mathetake Sure, thanks a lot for the comments! One key motivation is that we do not want to break the existing codes internally. Would prepare a doc also. Thanks! |
@mathetake It is not against @sukumargaonkar's original proposal for vendor specific fields. However for thinking we found out that different providers have very similar looking definitions, so it is worth unifying at the gateway level. @hustxiayang will provider a detailed writeup. |
agreed, I think we can still keep the GCP thinking vendor field but recommend the new unified thinking API shape. |
|
ah sorry i should clarify that my comment about breaking change is about this new "unified api portion", not about vendor stuff already in place, and that's why I am saying we should not take this addition lightly without any very careful consideration. |
|
basically API added here in this PRs will never be reverted without very long migration period (think about openai/anthropic, or any other public API's deprecation window) |
Adding an agenda tomorrow to discuss the deprecation policy. |
|
Premature abstractions can come at a significant maintenance cost, so we need proper, informed decisions before pushing for them.
This PR description is quite vague and doesn't give a good understanding on what will be covered in reality and the real user benefits. Please edit the description and add a detailed breakdown of the providers that are supported today and how their APIs look like. Let's include all all the currently supported providers, so we can get a good understanding on what this change would really cover (because not all of them support the same features through the OpenAI comapt APIs. For example, Cohere does not support the That would help. lot understand what this really covers and what it's optimizing for. |
|
yeah so i won't be able to make it to the meeting tomorrow (sorry 🙏) but as long as the deprecation policy is agreed upon in a sane way as well as the API design is reviewed by multiple people given the prior art research (not limited to litellm), then all good to me. of course the user facing documentation is a must but that's after the agreement we can do that |
|
@mathetake @yuzisun @nacx It's been a while after the discussion. Any plans/concerns to merge this PR? |
#1554) **Description** Some new features were introduced in gemini3: **1 thinking_level:** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#thinking_level This is similar to reasoning_effort of openai, thus, unified them. **2 media_resolution** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#media_resolution This is similar to detail in openai, thus, unified them. The difference is that openai does not provide a global config of media_resolution. Thus, added it as gcp specific, but still use detail to make the name consistent. **Some related PRs:** thinking_budget is in #1461 thinking_level and thinking_budget are both supported, but can not use them together. Other features under review: **1 web search:** #1526 **2 parse the thought summary:** #1521 --------- Signed-off-by: yxia216 <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: Sukumar Gaonkar <[email protected]> Co-authored-by: Alexa Griffith <[email protected]> Co-authored-by: Sukumar Gaonkar <[email protected]> Co-authored-by: Ignasi Barrera <[email protected]>
Signed-off-by: yxia216 <[email protected]>
Signed-off-by: yxia216 <[email protected]>
Signed-off-by: yxia216 <[email protected]>
Signed-off-by: yxia216 <[email protected]>
Signed-off-by: yxia216 <[email protected]>
ab84e26 to
fb92a6d
Compare
Signed-off-by: yxia216 <[email protected]>
|
@yuzisun @hustxiayang this needs the end user documentation change https://aigateway.envoyproxy.io/docs/capabilities/llm-integrations/vendor-specific-fields |
Updated #1590 |
…lude thought is true (#1521) **Description** When `includeThought` is true, Gemini would also generate summary of thinking process. We need to parse out this kind of data to users. Otherwise, we would return thought process together with output to users. Depends/base or replace #1461 --------- Signed-off-by: yxia216 <[email protected]>
Description
thinking_configamong anthropic and gemini models are similar, for example:thinkingandthinking_config;budget_tokensandthinkingBudget. Our users do not need to set up different thinking configs for different models as we can provide a unified interface among different providers.Related Issues/PRs (if applicable)
Related to #1463