@@ -693,7 +693,8 @@ A WebUI developed based on Gradio, with a simple interface and only core parsing
 <sup>2</sup> Linux supports only distributions released in 2019 or later.
 <sup>3</sup> MLX requires macOS 13.5 or later; version 14.0 or higher is recommended.
 <sup>4</sup> Windows vLLM support via WSL2 (Windows Subsystem for Linux).
-<sup>5</sup> Servers compatible with the OpenAI API, such as local or remote model services deployed via inference frameworks like `vLLM`, `SGLang`, or `LMDeploy`.
+<sup>5</sup> On Windows, LMDeploy can only use the `turbomind` backend, which is slightly slower than the `pytorch` backend. If performance is critical, it is recommended to run it via WSL2.
+<sup>6</sup> Servers compatible with the OpenAI API, such as local or remote model services deployed via inference frameworks like `vLLM`, `SGLang`, or `LMDeploy`.
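For context on footnote 6: any such server can be called with the official `openai` Python client. A minimal sketch (not part of the PR), assuming a server like the `mineru-lmdeploy-server` service defined below is listening on `localhost:30000`; the served model id is discovered at runtime rather than hard-coded:

```python
# Sketch only: talks to any OpenAI-compatible server (vLLM, SGLang,
# LMDeploy, ...). Assumes the compose service below is up on port 30000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="dummy")

# Most OpenAI-compatible servers list their served models under /v1/models,
# so query the endpoint instead of assuming a model name.
model_id = client.models.list().data[0].id

resp = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```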
docker/compose.yaml (32 additions, 3 deletions)
@@ -1,6 +1,6 @@
 services:
   mineru-vllm-server:
-    image: mineru-vllm:latest
+    image: mineru:latest
     container_name: mineru-vllm-server
     restart: always
     profiles: ["vllm-server"]
@@ -28,8 +28,37 @@ services:
           device_ids: ["0"]
           capabilities: [gpu]

+  mineru-lmdeploy-server:
+    image: mineru:latest
+    container_name: mineru-lmdeploy-server
+    restart: always
+    profiles: ["lmdeploy-server"]
+    ports:
+      - 30000:30000
+    environment:
+      MINERU_MODEL_SOURCE: local
+    entrypoint: mineru-lmdeploy-server
+    command:
+      --host 0.0.0.0
+      --port 30000
+      # --dp 2  # If using multiple GPUs, increase throughput using lmdeploy's multi-GPU parallel mode
+      # --cache-max-entry-count 0.5  # If running on a single GPU and VRAM runs short, reduce the KV cache size with this parameter; if VRAM issues persist, lower it further to `0.4` or below
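Once this service is defined, it can be started through its Compose profile in the usual way, e.g. `docker compose --profile lmdeploy-server up -d` (standard Compose profile usage, not shown in the diff).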