Description
Problem Statement
Currently, the configuration for different model providers is inconsistent across the SDK. For example, AnthropicModel requires max_tokens as a top-level parameter in its constructor, while other providers like OpenAIModel expect it to be passed inside a generic params dictionary. This inconsistency is confusing for users: developers cannot switch between models without changing the structure of their code. It also increases long-term maintenance overhead as new providers are added.
Proposed Solution
My proposed solution is to introduce a standardized ModelConfig base class using Pydantic, creating a single source of truth for all common model parameters like temperature, max_tokens, and top_p. To avoid over-engineering and maintain flexibility, this design also includes a provider_params dictionary, which acts as an "escape hatch" for any unique, provider-specific settings, such as Gemini's safety_settings. This gives us the best of both worlds: consistency for common parameters and flexibility for unique ones.
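As a rough illustration of the shape described above, here is a minimal sketch of such a config object. It uses a standard-library dataclass as a stand-in for the proposed Pydantic model so that it runs without extra dependencies; the validation rules and the common_params helper are assumptions made for illustration and are not taken from the proof-of-concept.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ModelConfig:
    """Hypothetical shared config; field names follow the proposal above."""

    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    top_p: Optional[float] = None
    # Escape hatch for provider-specific settings, e.g. Gemini's safety_settings.
    provider_params: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Hand-written checks standing in for Pydantic field constraints (assumed rules).
        if self.max_tokens is not None and self.max_tokens <= 0:
            raise ValueError("max_tokens must be a positive integer")
        if self.top_p is not None and not 0.0 <= self.top_p <= 1.0:
            raise ValueError("top_p must be between 0 and 1")

    def common_params(self) -> dict[str, Any]:
        """Return only the common parameters that were explicitly set."""
        values = {
            "temperature": self.temperature,
            "max_tokens": self.max_tokens,
            "top_p": self.top_p,
        }
        return {name: value for name, value in values.items() if value is not None}


# Common settings are typed fields; anything provider-specific goes in provider_params.
config = ModelConfig(
    temperature=0.2,
    max_tokens=1024,
    provider_params={"safety_settings": {"example_category": "example_threshold"}},
)
```

With this shape, a misspelled common parameter (for example max_token) fails at construction time instead of being silently forwarded to the provider, which is the main safety benefit the proposal is after.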
To make this more concrete, I have already completed a proof-of-concept by refactoring AnthropicModel. Here is a summary of the change to its __init__ method, which becomes much cleaner and safer:
Before:
class AnthropicConfig(TypedDict, total=False):
    max_tokens: Required[int]
    model_id: Required[str]
    params: Optional[dict[str, Any]]

def __init__(self, *, client_args: Optional[dict[str, Any]] = None, **model_config: Unpack[AnthropicConfig]):
    validate_config_keys(model_config, self.AnthropicConfig)
    self.config = AnthropicModel.AnthropicConfig(**model_config)
    # ...
After:
# No more local AnthropicConfig class needed
def __init__(self, model_id: str, client_args: Optional[dict[str, Any]] = None, **kwargs: Any):
    self.model_id = model_id
    # Uses the new central ModelConfig object for validation and consistency
    self.config = ModelConfig(**kwargs)
    # ...
My proposed rollout plan is to submit a single pull request that includes the new _base_config.py file, the completed refactoring for AnthropicModel, and a similar refactoring for OpenAIModel. This PR will establish the pattern for the change. Once it is approved, the same pattern can be applied to the remaining model providers.
Use Case
A developer should be able to switch from OpenAIModel to AnthropicModel by changing only the model class, without having to rewrite their configuration code for common parameters like temperature or max_tokens.
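The use case can be sketched as follows. The two classes here are hypothetical stubs that only mimic the proposed constructor shape (model_id plus common keyword arguments); they are not the SDK's real provider classes, and the model IDs are placeholders.

```python
from typing import Any


class _StubModel:
    """Hypothetical stand-in for a provider class; not the SDK's real code."""

    def __init__(self, model_id: str, **kwargs: Any) -> None:
        self.model_id = model_id
        # In the proposal this would be ModelConfig(**kwargs); a plain dict keeps the stub short.
        self.config = dict(kwargs)


class StubOpenAIModel(_StubModel):
    pass


class StubAnthropicModel(_StubModel):
    pass


# One set of common settings, written once and shared by every provider.
COMMON_SETTINGS = {"temperature": 0.2, "max_tokens": 1024}

openai_model = StubOpenAIModel(model_id="example-openai-model", **COMMON_SETTINGS)
anthropic_model = StubAnthropicModel(model_id="example-anthropic-model", **COMMON_SETTINGS)
```

Only the class name and the model ID differ between the two calls; the configuration for common parameters is identical.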
Alternative Solutions
No response
Additional Context
No response