-
Notifications
You must be signed in to change notification settings - Fork 58
add validation checking for azure deployment config #243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add validation checking for azure deployment config #243
Conversation
@jerichosiahaya can you use the poetry env to run the formatter/linter? other details outlined here: https://github.com/redis/redis-vl-python/blob/main/CONTRIBUTING.md#linting-and-tests |
We want to bring this in, but more updates are needed. I will port your commits to another branch and use that. Stay tuned. |
Alright, what do I need to update? |
@tylerhutcherson did this get implemented in another branch? |
Hi @jerichosiahaya, Thank you so much for taking the time to contribute to RedisVL! We really appreciate community members who notice areas where our documentation could be clearer and take the initiative to help improve the user experience. Why we can't merge this PR After careful review, rebasing against main, and extensive testing, I've identified several fundamental issues with this approach that prevent us from merging it:
The core problem is that the PR extracts azure_deployment from api_config and validates its presence, but then completely discards the value. Looking at the code: azure_deployment = (
api_config.pop("azure_deployment")
if api_config
else os.getenv("AZURE_OPENAI_DEPLOYMENT")
)
if not azure_deployment:
raise ValueError(...) After this validation, azure_deployment is never passed to any client constructor, never stored as an instance variable, and never used anywhere. The variable just disappears.
The Azure OpenAI deployment name isn't a parameter to the AzureOpenAI client constructor - it's passed as the model parameter to the embeddings.create() method calls. Our current implementation
The deployment name is already a required parameter - just not where this PR looks for it.
This change would require users to provide the deployment name in two places:
This would break all existing code using the vectorizer correctly, including our own test fixtures: # From tests/integration/test_vectorizers.py:66-69
AzureOpenAITextVectorizer(
model=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "text-embedding-ada-002")
) This would fail with the new validation if AZURE_OPENAI_DEPLOYMENT isn't set, even though the deployment name is correctly provided via model.
All tests pass locally (and likely in your environment), but this is misleading. The Azure OpenAI vectorizer tests are marked with @pytest.mark.requires_api_keys and are automatically skipped when
The PR updates the docstring to mention "Azure deployment" in api_config, but this creates confusion because:
What the real problem might be If users are getting confused about the Azure deployment configuration, the root cause is likely:
Better alternatives to consider If you'd like to continue contributing in this area, here are some ideas that would genuinely help users:
I'd be happy to discuss any of these approaches with you if you're interested in pursuing them! Summary The issue isn't the intent (making deployment configuration clearer) but the implementation (validating an unused parameter that creates breaking changes). The current code already requires and Thanks again for your contribution and for caring about user experience. We hope you'll continue contributing to RedisVL! |
Add a more clear validation on AzureOpenAITextVectorizer config for azure_deployment, so it won't make people confused since it doesn't state that azure_deployment is a required field.