[OCT] Alexsander PERF improvements #15617
Merged
- Update to 2 instance baseline (1035 RPS @ 200ms median)
- Add 4 instance results (1170 RPS @ 100ms median)
- Update machine specs to 4 CPU / 8GB RAM
- Update Locust settings to 1000 users
Replace time.time() with more appropriate timing functions for better performance and reliability:
- Use time.perf_counter() for duration measurements in acompletion(), _acompletion(), and async_get_available_deployment()
- Use time.monotonic() for timeout calculations in scheduler methods (schedule_acompletion and _schedule_factory)
Benefits:
- 30-40% faster timing calls (~300ns savings per call)
- time.monotonic() provides reliable timeouts unaffected by system clock changes (NTP adjustments, manual time changes)
- time.perf_counter() offers highest resolution for performance metrics
- Follows Python best practices for timing operations
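The split between the two clocks can be sketched roughly as follows. This is a minimal illustration, not the router's code: `timed_call` and `wait_until` are hypothetical helper names introduced only for this example.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Hypothetical helper: measure how long fn takes using perf_counter()."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def wait_until(predicate, timeout, poll_interval=0.01):
    """Hypothetical helper: poll predicate until it is true or timeout passes.

    The deadline is computed from time.monotonic(), so adjustments to the
    system wall clock cannot shorten or lengthen the wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


result, elapsed = timed_call(sum, [1, 2, 3])
print(result, elapsed >= 0)
```

Both clocks return values that are only meaningful as differences between two readings, which is why neither is suitable for producing timestamps.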
Add `ssl_ecdh_curve` setting to configure TLS key exchange curve. Allows disabling PQC on OpenSSL 3.x for better performance. Configurable via SDK (litellm.ssl_ecdh_curve), YAML (litellm_settings), or env var (SSL_ECDH_CURVE). Common curves: X25519, prime256v1, secp384r1.
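How the setting is wired into LiteLLM is not shown here, but the underlying mechanism can be sketched with the standard library's `ssl.SSLContext.set_ecdh_curve()`. The helper name `build_ssl_context` is an assumption made for this example, and reading `SSL_ECDH_CURVE` as a fallback only mirrors the env var named in the commit message.

```python
import os
import ssl


def build_ssl_context(curve=None):
    """Hypothetical helper: create a client SSL context pinned to one ECDH curve.

    Falls back to the SSL_ECDH_CURVE environment variable when no curve
    is passed explicitly; leaves the OpenSSL default untouched otherwise.
    """
    curve = curve or os.environ.get("SSL_ECDH_CURVE")
    ctx = ssl.create_default_context()
    if curve:
        # Restricts ECDH key exchange to the named curve.
        ctx.set_ecdh_curve(curve)
    return ctx


ctx = build_ssl_context("prime256v1")
print(type(ctx).__name__)
```

Which curve names are accepted depends on the OpenSSL build that Python is linked against.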
* fix: add google_routes to llm_api_routes
* test: test_virtual_key_llm_api_routes_allows_google_routes
…dize timing - Use model_id_to_deployment_index_map and model_name_to_deployment_indices for O(1) lookups in get_model_info, get_deployment_by_model_group_name, and get_model_ids
Replace string concatenation in loop with list append + join pattern. This improves time complexity from O(n²) to O(n) and avoids creating many temporary string objects during hash ID generation.
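A minimal sketch of the pattern, assuming the ID is a SHA-256 digest of the concatenated parts (the actual hashing scheme is not shown in this commit message; `hash_id_concat` and `hash_id_join` are hypothetical names):

```python
import hashlib


def hash_id_concat(parts):
    """Before: repeated += may copy the growing string on each iteration."""
    combined = ""
    for part in parts:
        combined += part
    return hashlib.sha256(combined.encode()).hexdigest()


def hash_id_join(parts):
    """After: collect the pieces in a list and join them once at the end."""
    pieces = []
    for part in parts:
        pieces.append(part)
    return hashlib.sha256("".join(pieces).encode()).hexdigest()


parts = ["model_group", "api_base", "api_version"]
print(hash_id_concat(parts) == hash_id_join(parts))
```

CPython can sometimes extend a string in place during `+=`, but that is an implementation detail rather than a guarantee, so the join form is the more predictable choice.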
…mization Update test_generate_model_id_with_deployment_model_name to accept the new error message format that results from the list+join optimization. The function still correctly rejects None values with a TypeError, but the error message changed from 'unsupported operand type(s) for +=' to 'expected str instance, NoneType found' due to the implementation change from string concatenation to list joining.
Replace copy.deepcopy() with dict.copy() in _get_all_deployments when creating model aliases.
Safe because:
1. Only modifies top-level 'model_name' field (isolated by shallow copy)
2. Nested dicts (litellm_params, model_info) are never modified after return
3. When model_alias=None, returns original dict with NO copy, proving callers expect nested structures to be read-only
4. Key insight: Code that needs to modify nested structures does deepcopy FIRST. This proves the contract is: "treat returned deployments as read-only for nested fields." (see line 6894: copy.deepcopy before modifying litellm_params)
Performance: ~10-100x faster than deepcopy on nested dict structures.
Tests: All 114 router unit tests pass.
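The isolation argument can be illustrated with a small sketch. The deployment dict and the `with_alias` helper are made up for this example and are not taken from the router:

```python
def with_alias(deployment, alias):
    """Hypothetical helper: return a copy of deployment renamed to alias.

    dict.copy() is shallow: the top-level keys are independent, while the
    nested dicts are shared with the original and must stay read-only.
    """
    aliased = deployment.copy()
    aliased["model_name"] = alias
    return aliased


deployment = {
    "model_name": "gpt-4",
    "litellm_params": {"model": "azure/gpt-4"},
    "model_info": {"id": "abc"},
}
aliased = with_alias(deployment, "my-alias")

print(deployment["model_name"])  # the original name is unchanged
print(aliased["litellm_params"] is deployment["litellm_params"])  # shared object
```

The trade-off is that any caller that mutates `litellm_params` or `model_info` on the returned dict would also change the original, which is why the read-only contract matters.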
Replace O(n) scans with index map lookups and convert model_names to set. Standardize timing calls.
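A rough sketch of the index-map idea follows. `TinyRouter` is a hypothetical stand-in class; only the two map attribute names come from the commit messages, and the way the real router builds and maintains them may differ.

```python
from collections import defaultdict


class TinyRouter:
    """Hypothetical stand-in showing index maps built over model_list."""

    def __init__(self, model_list):
        self.model_list = model_list
        self.model_id_to_deployment_index_map = {}
        self.model_name_to_deployment_indices = defaultdict(list)
        for idx, deployment in enumerate(model_list):
            model_id = deployment["model_info"]["id"]
            self.model_id_to_deployment_index_map[model_id] = idx
            self.model_name_to_deployment_indices[deployment["model_name"]].append(idx)
        # A set gives O(1) membership checks on model names.
        self.model_names = set(self.model_name_to_deployment_indices)

    def get_model_info(self, model_id):
        """Look up one deployment by ID without scanning model_list."""
        idx = self.model_id_to_deployment_index_map.get(model_id)
        return None if idx is None else self.model_list[idx]

    def get_deployments_by_name(self, model_name):
        """Return all deployments registered under model_name."""
        indices = self.model_name_to_deployment_indices.get(model_name, [])
        return [self.model_list[i] for i in indices]


router = TinyRouter([
    {"model_name": "gpt-4", "model_info": {"id": "a"}},
    {"model_name": "gpt-4", "model_info": {"id": "b"}},
    {"model_name": "claude", "model_info": {"id": "c"}},
])
print(router.get_model_info("b")["model_name"])
print(len(router.get_deployments_by_name("gpt-4")))
```

The maps have to be kept in sync whenever a deployment is added or removed, which is the cost paid for the faster lookups.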
perf(router): use shallow copy instead of deepcopy for model aliases
…operation perf(router): optimize string concatenation in hash generation
perf(router): optimize model lookups with O(1) data structures
Add AST-based test to detect 'for ... in self.model_list' anti-pattern. Enforces use of index maps (model_id_to_deployment_index_map and model_name_to_deployment_indices) for O(1) lookups instead of O(n) iteration.
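One way such a check could look, as a sketch using the standard `ast` module. The function name `find_model_list_loops` and the sample source are assumptions for illustration; the test added in the PR may be structured differently.

```python
import ast


def find_model_list_loops(source):
    """Return line numbers where code iterates directly over self.model_list.

    Covers for loops, async for loops, and comprehensions.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor, ast.comprehension)):
            target = node.iter
            if (
                isinstance(target, ast.Attribute)
                and target.attr == "model_list"
                and isinstance(target.value, ast.Name)
                and target.value.id == "self"
            ):
                hits.append(target.lineno)
    return hits


SAMPLE = '''
class R:
    def bad(self):
        for m in self.model_list:
            pass
    def ok(self):
        return self.model_list[0]
'''

print(find_model_list_loops(SAMPLE))
```

A test built on this would typically read the router source file, call the function, and fail if the returned list is non-empty.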
perf(router): optimize model lookups with O(1) index maps
Fixes mypy type error where router_model_names (a set) was incompatible with List[str] parameter type. Changed check_file_size_under_limit to accept Collection[str] instead of List[str] since the parameter is only used for membership checks, making it more flexible and avoiding unnecessary type conversions. Resolves: proxy/proxy_server.py:5107 arg-type error
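The typing change can be sketched as follows. `is_known_model` is a hypothetical function standing in for the membership check; the real signature of check_file_size_under_limit is not shown here.

```python
from typing import Collection


def is_known_model(model: str, router_model_names: Collection[str]) -> bool:
    """Hypothetical helper: only needs `in`, so any Collection[str] works."""
    return model in router_model_names


# Lists, sets, and tuples all satisfy Collection[str] for a type checker.
print(is_known_model("gpt-4", ["gpt-4", "claude"]))
print(is_known_model("gpt-4", {"gpt-4", "claude"}))
```

Annotating with the narrowest protocol the function actually uses lets callers pass a set without converting it back to a list.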
[Fix] CI/CD - Missing env key & Linter type error
…com/BerriAI/litellm into litellm_october_alexsander_stanging
Title
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- tests/litellm/ directory: adding at least 1 test is a hard requirement (see details)
- make test-unit
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes