[OCT] Alexsander PERF improvements #15617

ishaan-jaff · 2025-10-16T22:09:28Z

Title

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
I have added a screenshot of my new test passing locally
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

…nsformation.py

- Update to 2 instance baseline (1035 RPS @ 200ms median)
- Add 4 instance results (1170 RPS @ 100ms median)
- Update machine specs to 4 CPU / 8GB RAM
- Update Locust settings to 1000 users

Replace time.time() with more appropriate timing functions for better performance and reliability:
- Use time.perf_counter() for duration measurements in acompletion(), _acompletion(), and async_get_available_deployment()
- Use time.monotonic() for timeout calculations in scheduler methods (schedule_acompletion and _schedule_factory)

Benefits:
- 30-40% faster timing calls (~300ns savings per call)
- time.monotonic() provides reliable timeouts unaffected by system clock changes (NTP adjustments, manual time changes)
- time.perf_counter() offers the highest available resolution for performance metrics
- Follows Python best practices for timing operations
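A minimal sketch of the two timing patterns described above, using only the standard library. The helper names `measure_duration` and `wait_until` are hypothetical and are not taken from the LiteLLM router code.

```python
import time
from typing import Any, Callable, Tuple


def measure_duration(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_seconds) using a high-resolution clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def wait_until(predicate: Callable[[], bool], timeout: float, poll_interval: float = 0.01) -> bool:
    """Poll predicate until it returns True or the timeout (in seconds) expires.

    The deadline is computed with time.monotonic(), so adjustments to the
    wall clock cannot shorten or extend the wait.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_interval)
    return False
```

Both clocks return values that are only meaningful as differences, so they suit durations and deadlines but not timestamps that need to be logged or compared across processes.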

Add `ssl_ecdh_curve` setting to configure TLS key exchange curve. Allows disabling PQC on OpenSSL 3.x for better performance. Configurable via SDK (litellm.ssl_ecdh_curve), YAML (litellm_settings), or env var (SSL_ECDH_CURVE). Common curves: X25519, prime256v1, secp384r1.
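For illustration only, the sketch below shows how a curve name could be applied to a standard-library `ssl.SSLContext`. It is not LiteLLM's implementation: the helper `build_ssl_context` is hypothetical, and only the `SSL_ECDH_CURVE` environment variable name comes from the description above. Which curve names are accepted depends on the Python and OpenSSL build in use.

```python
import os
import ssl
from typing import Optional


def build_ssl_context(curve: Optional[str] = None) -> ssl.SSLContext:
    """Create a default client context, optionally pinning the ECDH curve.

    Hypothetical helper: an explicit argument wins, otherwise the
    SSL_ECDH_CURVE environment variable is consulted.
    """
    ctx = ssl.create_default_context()
    curve = curve or os.environ.get("SSL_ECDH_CURVE")
    if curve:
        # Raises ValueError for a name the linked OpenSSL does not know.
        ctx.set_ecdh_curve(curve)
    return ctx


ctx = build_ssl_context("prime256v1")
```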

* fix: add google_routes to llm_api_routes
* test: test_virtual_key_llm_api_routes_allows_google_routes

…dize timing

- Use model_id_to_deployment_index_map and model_name_to_deployment_indices for O(1) lookups in get_model_info, get_deployment_by_model_group_name, and get_model_ids
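The index-map idea can be sketched as follows. The two map names match the ones in the commit message; the class `DeploymentIndex`, its methods, and the deployment dict shape (an `id` under `model_info`) are assumptions made for this example rather than the router's actual code.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class DeploymentIndex:
    """Hypothetical holder that keeps a model list plus two lookup maps."""

    def __init__(self, model_list: List[Dict[str, Any]]) -> None:
        self.model_list = model_list
        self.model_id_to_deployment_index_map: Dict[str, int] = {}
        self.model_name_to_deployment_indices: Dict[str, List[int]] = defaultdict(list)
        # Build both maps in a single pass over the list.
        for idx, deployment in enumerate(model_list):
            model_id = deployment["model_info"]["id"]
            self.model_id_to_deployment_index_map[model_id] = idx
            self.model_name_to_deployment_indices[deployment["model_name"]].append(idx)

    def get_by_id(self, model_id: str) -> Optional[Dict[str, Any]]:
        """Dictionary lookup followed by list indexing, no scan of model_list."""
        idx = self.model_id_to_deployment_index_map.get(model_id)
        return None if idx is None else self.model_list[idx]

    def get_by_name(self, model_name: str) -> List[Dict[str, Any]]:
        """Return every deployment registered under a model group name."""
        indices = self.model_name_to_deployment_indices.get(model_name, [])
        return [self.model_list[i] for i in indices]
```

The maps must be rebuilt or updated whenever a deployment is added or removed, which is the usual cost of trading memory and bookkeeping for constant-time lookups.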

Replace string concatenation in loop with list append + join pattern. This improves time complexity from O(n²) to O(n) and avoids creating many temporary string objects during hash ID generation.
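As a rough illustration of the change, the two functions below build the same hash input, first with repeated concatenation and then with list append + join. The function names and the key/value layout are hypothetical and do not reproduce the router's actual ID format.

```python
import hashlib
from typing import Any, Dict


def generate_id_concat(model_group: str, params: Dict[str, Any]) -> str:
    """Old pattern: each += may copy the whole string built so far."""
    concat_str = model_group
    for key, value in params.items():
        concat_str += key + str(value)
    return hashlib.sha256(concat_str.encode()).hexdigest()


def generate_id_join(model_group: str, params: Dict[str, Any]) -> str:
    """New pattern: collect the pieces, then join them once."""
    parts = [model_group]
    for key, value in params.items():
        parts.append(key)
        parts.append(str(value))
    return hashlib.sha256("".join(parts).encode()).hexdigest()
```

CPython can sometimes extend a string in place, so the quadratic cost of the first version is a worst case rather than a guarantee; the join version is linear regardless.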

…mization

Update test_generate_model_id_with_deployment_model_name to accept the new error message format that results from the list+join optimization. The function still correctly rejects None values with a TypeError, but the error message changed from 'unsupported operand type(s) for +=' to 'expected str instance, NoneType found' due to the implementation change from string concatenation to list joining.
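The difference in error text can be reproduced in isolation. This sketch assumes the None value ends up as the first piece of the string, which is what the quoted messages suggest; the helper names are hypothetical.

```python
from typing import Callable, Optional


def type_error_message(fn: Callable[[], None]) -> Optional[str]:
    """Return the TypeError message raised by fn, or None if nothing is raised."""
    try:
        fn()
    except TypeError as exc:
        return str(exc)
    return None


def concat_style() -> None:
    text = None
    text += "suffix"  # augmented assignment on None


def join_style() -> None:
    "".join([None, "suffix"])  # join rejects non-str items


concat_message = type_error_message(concat_style)
join_message = type_error_message(join_style)
```

Both paths raise TypeError, so a test that asserts on the exception type rather than the message text would survive this kind of refactor unchanged.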

Replace copy.deepcopy() with dict.copy() in _get_all_deployments when creating model aliases.

Safe because:
1. Only modifies top-level 'model_name' field (isolated by shallow copy)
2. Nested dicts (litellm_params, model_info) are never modified after return
3. When model_alias=None, returns original dict with NO copy, proving callers expect nested structures to be read-only
4. Key insight: Code that needs to modify nested structures does deepcopy FIRST. This proves the contract is: "treat returned deployments as read-only for nested fields." (see line 6894: copy.deepcopy before modifying litellm_params)

Performance: ~10-100x faster than deepcopy on nested dict structures.

Tests: All 114 router unit tests pass.
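The isolation argument above can be checked with a small example. The helper `with_alias` and the sample deployment are hypothetical; they only mirror the shape named in the commit message (a top-level `model_name` plus nested `litellm_params`).

```python
import copy
from typing import Any, Dict


def with_alias(deployment: Dict[str, Any], alias: str) -> Dict[str, Any]:
    """Return a shallow copy whose top-level model_name is replaced by alias."""
    aliased = deployment.copy()  # new outer dict, nested dicts are shared
    aliased["model_name"] = alias
    return aliased


original = {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4"}}
aliased = with_alias(original, "my-alias")
deep = copy.deepcopy(original)  # for comparison: nested dicts are duplicated
```

The shallow copy is safe only while callers treat the nested dicts as read-only, because `aliased["litellm_params"]` and `original["litellm_params"]` are the same object.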

Replace O(n) scans with index map lookups and convert model_names to set. Standardize timing calls.

perf(router): use shallow copy instead of deepcopy for model aliases

…operation

perf(router): optimize string concatenation in hash generation

perf(router): optimize model lookups with O(1) data structures

Add AST-based test to detect 'for ... in self.model_list' anti-pattern. Enforces use of index maps (model_id_to_deployment_index_map and model_name_to_deployment_indices) for O(1) lookups instead of O(n) iteration.

…r_index_change

perf(router): optimize model lookups with O(1) index maps

Fixes mypy type error where router_model_names (a set) was incompatible with List[str] parameter type. Changed check_file_size_under_limit to accept Collection[str] instead of List[str] since the parameter is only used for membership checks, making it more flexible and avoiding unnecessary type conversions. Resolves: proxy/proxy_server.py:5107 arg-type error
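The typing change can be illustrated with a stand-in function. `is_known_model` is hypothetical and much simpler than check_file_size_under_limit; it only shows why `Collection[str]` accepts both a set and a list when the parameter is used for membership checks.

```python
from typing import Collection


def is_known_model(model: str, router_model_names: Collection[str]) -> bool:
    """Membership check that works for any sized, iterable container of str."""
    return model in router_model_names


names_as_set = {"gpt-4", "claude"}
names_as_list = ["gpt-4", "claude"]
```

With a `List[str]` annotation, passing `names_as_set` would be reported by mypy as an arg-type error even though the call works at runtime.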

[Fix] CI/CD - Missing env key & Linter type error

…com/BerriAI/litellm into litellm_october_alexsander_stanging

vercel · 2025-10-16T22:09:34Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
litellm	Ready	Preview	Comment	Oct 16, 2025 10:09pm

huangyafei and others added 30 commits October 14, 2025 13:57

Add anthropic/claude-sonnet-4.5 to OpenRouter cost map

9a980f3

add ecs to docs

6424480

fix(ollama/chat): 'think' param handling

4f33bc4

fix: add 'think' parameter handling in ollama_chat.py

e30b716

fix: update 'think' parameter assignment to use provided value in tra…

9557a37

…nsformation.py

fix: only use think level for gpt-oss model

6896c60

fix comment

92bd238

docs: update benchmark results with improved infrastructure

6df051e

- Update to 2 instance baseline (1035 RPS @ 200ms median)
- Add 4 instance results (1170 RPS @ 100ms median)
- Update machine specs to 4 CPU / 8GB RAM
- Update Locust settings to 1000 users

update benchmarks

2e57d19

direct cost calculation from openrouter

09662b5

add tests for openrouter cost tracking

b57406e

docs: fix doc

cb29e33

docs(index.md): bump rc

b0d963c

[Fix] GEMINI - CLI - add google_routes to llm_api_routes (#15500)

65163c7

* fix: add google_routes to llm_api_routes
* test: test_virtual_key_llm_api_routes_allows_google_routes

add: unit test

d094a33

perf(router): optimize model lookups with O(1) index maps and standar…

36c9066

…dize timing

- Use model_id_to_deployment_index_map and model_name_to_deployment_indices for O(1) lookups in get_model_info, get_deployment_by_model_group_name, and get_model_ids

perf(router): optimize string concatenation in hash generation

c98a30a

Replace string concatenation in loop with list append + join pattern. This improves time complexity from O(n²) to O(n) and avoids creating many temporary string objects during hash ID generation.

perf(router): optimize model lookups with O(1) data structures

6e4da1d

Replace O(n) scans with index map lookups and convert model_names to set. Standardize timing calls.

Merge pull request #15576 from BerriAI/litellm_remove_deepcopy

1705190

perf(router): use shallow copy instead of deepcopy for model aliases

Merge pull request #15575 from BerriAI/litellm_remove_constly_string_…

4e84937

…operation

perf(router): optimize string concatenation in hash generation

test: ensure model_names is a O(1) datastructure

56838e2

Merge pull request #15578 from BerriAI/litellm_remove_list_lookup

a17f3a7

perf(router): optimize model lookups with O(1) data structures

test: add static analysis to prevent O(n) linear scans in router

5e929da

Add AST-based test to detect 'for ... in self.model_list' anti-pattern. Enforces use of index maps (model_id_to_deployment_index_map and model_name_to_deployment_indices) for O(1) lookups instead of O(n) iteration.

Merge branch 'litellm_october_alexsander_stanging' into litellm_route…

1846363

…r_index_change

Merge pull request #15574 from BerriAI/litellm_router_index_change

d07e4f4

perf(router): optimize model lookups with O(1) index maps

Add missing env key

18c5ec8

AlexsanderHamir added 12 commits October 16, 2025 11:33

Merge pull request #15606 from BerriAI/litellm_fix_staging_branch

2ef3a76

[Fix] CI/CD - Missing env key & Linter type error

perf(router): optimize model lookups with O(1) index maps and standar…

254f50a

…dize timing

- Use model_id_to_deployment_index_map and model_name_to_deployment_indices for O(1) lookups in get_model_info, get_deployment_by_model_group_name, and get_model_ids

test: add static analysis to prevent O(n) linear scans in router

35de05d

Add AST-based test to detect 'for ... in self.model_list' anti-pattern. Enforces use of index maps (model_id_to_deployment_index_map and model_name_to_deployment_indices) for O(1) lookups instead of O(n) iteration.

perf(router): optimize string concatenation in hash generation

f842b2a

Replace string concatenation in loop with list append + join pattern. This improves time complexity from O(n²) to O(n) and avoids creating many temporary string objects during hash ID generation.

perf(router): optimize model lookups with O(1) data structures

3b14866

Replace O(n) scans with index map lookups and convert model_names to set. Standardize timing calls.

test: ensure model_names is a O(1) datastructure

e154ab6

Add missing env key

220fa3f

Merge branch 'litellm_october_alexsander_stanging' of https://github.…

bdca4db

…com/BerriAI/litellm into litellm_october_alexsander_stanging

ishaan-jaff merged commit f98f299 into main Oct 16, 2025
41 of 49 checks passed
