
Commit 850e6e9

Milvus-doc-bot authored and committed
Translate blogs
1 parent 946ba87 commit 850e6e9

25 files changed: +3192 -1 lines changed

@@ -0,0 +1 @@
1+
{"codeList":["wget https://github.com/Milvus-io/Milvus/releases/download/v2.5.12/Milvus-standalone-docker-compose.yml -O docker-compose.yml\n","docker-compose up -d\n","docker-compose ps -a\n","git clone https://github.com/vllm-project/semantic-router.git\n","cd semantic-router\nmake download-models\n","vim config.yaml\n","semantic_cache:\n enabled: true\n backend_type: \"milvus\" # Options: \"memory\" or \"milvus\"\n backend_config_path: \"config/cache/milvus.yaml\"\n similarity_threshold: 0.8\n max_entries: 1000 # Only applies to memory backend\n ttl_seconds: 3600\n eviction_policy: \"fifo\"\n","vim milvus.yaml\n","# Milvus connection settings\nconnection:\n # Milvus server host (change for production deployment)\n host: \"192.168.7.xxx\" # For production: use your Milvus cluster endpoint\n # Milvus server port\n port: 19530 # Standard Milvus port\n # Database name (optional, defaults to \"default\")\n database: \"default\"\n # Connection timeout in seconds\n timeout: 30\n # Authentication (enable for production)\n auth:\n enabled: false # Set to true for production\n username: \"\" # Your Milvus username\n password: \"\" # Your Milvus password\n # TLS/SSL configuration (recommended for production)\n tls:\n enabled: false # Set to true for secure connections\n cert_file: \"\" # Path to client certificate\n key_file: \"\" # Path to client private key\n ca_file: \"\" # Path to CA certificate\n# Collection settings\ncollection:\n # Name of the collection to store cache entries\n name: \"semantic_cache\"\n # Description of the collection\n description: \"Semantic cache for LLM request-response pairs\"\n # Vector field configuration\n vector_field:\n # Name of the vector field\n name: \"embedding\"\n # Dimension of the embeddings (auto-detected from model at runtime)\n dimension: 384 # This value is ignored - dimension is auto-detected from the embedding model\n # Metric type for similarity calculation\n metric_type: \"IP\" # Inner Product (cosine similarity for 
normalized vectors)\n # Index configuration for the vector field\n index:\n # Index type (HNSW is recommended for most use cases)\n type: \"HNSW\"\n # Index parameters\n params:\n M: 16 # Number of bi-directional links for each node\n efConstruction: 64 # Search scope during index construction\n","docker compose --profile testing up --build\n","echo \"=== 第一次请求(无缓存状态) ===\" && \\\ntime curl -X POST http://localhost:8801/v1/chat/completions \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer test-token\" \\\n -d '{\n \"model\": \"auto\",\n \"messages\": [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"What are the main renewable energy sources?\"}\n ],\n \"temperature\": 0.7\n }' | jq .\n","real 0m16.546s\nuser 0m0.116s\nsys 0m0.033s\n","echo \"=== 第二次请求(缓存状态) ===\" && \\\ntime curl -X POST http://localhost:8801/v1/chat/completions \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer test-token\" \\\n -d '{\n \"model\": \"auto\",\n \"messages\": [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"What are the main renewable energy sources?\"}\n ],\n \"temperature\": 0.7\n }' | jq .\n","real 0m2.393s\nuser 0m0.116s\nsys 0m0.021s\n"],"headingContent":"","anchorList":[{"label":"ما هو الموجه الدلالي؟","href":"What-is-a-Semantic-Router","type":2,"isActive":false},{"label":"كيف يستخدم المطورون الموجه الدلالي + Milvus في الإنتاج","href":"How-Developers-Are-Using-Semantic-Router-+-Milvus-in-Production","type":2,"isActive":false},{"label":"كيفية اختبار التخزين المؤقت الدلالي بسرعة في الموجه الدلالي","href":"How-to-Quickly-Test-the-Semantic-Caching-in-the-Semantic-Router","type":2,"isActive":false},{"label":"الخلاصة","href":"Conclusion","type":2,"isActive":false}]}

localization/blog/ar/vllm-semantic-router-milvus-how-semantic-routing-and-caching-scale-ai-systems-the-smart-way.md

Lines changed: 265 additions & 0 deletions
Large diffs are not rendered by default.
@@ -0,0 +1 @@
1+
{"codeList":["wget https://github.com/Milvus-io/Milvus/releases/download/v2.5.12/Milvus-standalone-docker-compose.yml -O docker-compose.yml\n","docker-compose up -d\n","docker-compose ps -a\n","git clone https://github.com/vllm-project/semantic-router.git\n","cd semantic-router\nmake download-models\n","vim config.yaml\n","semantic_cache:\n enabled: true\n backend_type: \"milvus\" # Options: \"memory\" or \"milvus\"\n backend_config_path: \"config/cache/milvus.yaml\"\n similarity_threshold: 0.8\n max_entries: 1000 # Only applies to memory backend\n ttl_seconds: 3600\n eviction_policy: \"fifo\"\n","vim milvus.yaml\n","# Milvus connection settings\nconnection:\n # Milvus server host (change for production deployment)\n host: \"192.168.7.xxx\" # For production: use your Milvus cluster endpoint\n # Milvus server port\n port: 19530 # Standard Milvus port\n # Database name (optional, defaults to \"default\")\n database: \"default\"\n # Connection timeout in seconds\n timeout: 30\n # Authentication (enable for production)\n auth:\n enabled: false # Set to true for production\n username: \"\" # Your Milvus username\n password: \"\" # Your Milvus password\n # TLS/SSL configuration (recommended for production)\n tls:\n enabled: false # Set to true for secure connections\n cert_file: \"\" # Path to client certificate\n key_file: \"\" # Path to client private key\n ca_file: \"\" # Path to CA certificate\n# Collection settings\ncollection:\n # Name of the collection to store cache entries\n name: \"semantic_cache\"\n # Description of the collection\n description: \"Semantic cache for LLM request-response pairs\"\n # Vector field configuration\n vector_field:\n # Name of the vector field\n name: \"embedding\"\n # Dimension of the embeddings (auto-detected from model at runtime)\n dimension: 384 # This value is ignored - dimension is auto-detected from the embedding model\n # Metric type for similarity calculation\n metric_type: \"IP\" # Inner Product (cosine similarity for 
normalized vectors)\n # Index configuration for the vector field\n index:\n # Index type (HNSW is recommended for most use cases)\n type: \"HNSW\"\n # Index parameters\n params:\n M: 16 # Number of bi-directional links for each node\n efConstruction: 64 # Search scope during index construction\n","docker compose --profile testing up --build\n","echo \"=== 第一次请求(无缓存状态) ===\" && \\\ntime curl -X POST http://localhost:8801/v1/chat/completions \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer test-token\" \\\n -d '{\n \"model\": \"auto\",\n \"messages\": [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"What are the main renewable energy sources?\"}\n ],\n \"temperature\": 0.7\n }' | jq .\n","real 0m16.546s\nuser 0m0.116s\nsys 0m0.033s\n","echo \"=== 第二次请求(缓存状态) ===\" && \\\ntime curl -X POST http://localhost:8801/v1/chat/completions \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer test-token\" \\\n -d '{\n \"model\": \"auto\",\n \"messages\": [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"What are the main renewable energy sources?\"}\n ],\n \"temperature\": 0.7\n }' | jq .\n","real 0m2.393s\nuser 0m0.116s\nsys 0m0.021s\n"],"headingContent":"","anchorList":[{"label":"Was ist ein semantischer Router?","href":"What-is-a-Semantic-Router","type":2,"isActive":false},{"label":"Wie Entwickler Semantic Router + Milvus in der Produktion einsetzen","href":"How-Developers-Are-Using-Semantic-Router-+-Milvus-in-Production","type":2,"isActive":false},{"label":"Schnelles Testen des semantischen Cachings im Semantic Router","href":"How-to-Quickly-Test-the-Semantic-Caching-in-the-Semantic-Router","type":2,"isActive":false},{"label":"Fazit","href":"Conclusion","type":2,"isActive":false}]}
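
Note on the cache settings carried in the codeList above: the config pairs metric_type "IP" with similarity_threshold 0.8, and its comment says inner product acts as cosine similarity for normalized vectors. The sketch below only illustrates that relationship. The helper names (normalize, inner_product, cosine, is_cache_hit) are hypothetical, and the use of `>=` for the threshold test is an assumption, not something this commit confirms about the router.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def inner_product(a, b):
    """Plain dot product, the quantity the "IP" metric scores."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


def is_cache_hit(query_vec, cached_vec, threshold=0.8):
    """Hypothetical gate: treat a score at or above the threshold as a hit.

    0.8 mirrors similarity_threshold in the config; the >= comparison
    is an assumption made for this illustration.
    """
    score = inner_product(normalize(query_vec), normalize(cached_vec))
    return score >= threshold
```

With unit-length vectors the two scores coincide, so a threshold written with cosine similarity in mind can be applied to an IP score, provided the embeddings are actually normalized before they are stored and searched.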
