Commit 11b9726

bump version to v0.10.0 (#3933)
* bump version to v0.10.0
* update readme
* update supported models
* note about 0.10.0 on cuda12.8
1 parent fd68602 commit 11b9726

File tree: 8 files changed, +13 −11 lines

README.md

Lines changed: 3 additions & 1 deletion

@@ -26,6 +26,7 @@
 <details open>
 <summary><b>2025</b></summary>

+- \[2025/09\] TurboMind supports MXFP4 on NVIDIA GPUs starting from V100, achieving 1.5x the performance of vLLM on H800 for openai gpt-oss models!
 - \[2025/06\] Comprehensive inference optimization for FP8 MoE Models
 - \[2025/06\] DeepSeek PD Disaggregation deployment is now supported through integration with [DLSlime](https://github.com/DeepLink-org/DLSlime) and [Mooncake](https://github.com/kvcache-ai/Mooncake). Huge thanks to both teams!
 - \[2025/04\] Enhanced DeepSeek inference performance by integrating deepseek-ai techniques: FlashMLA, DeepGemm, DeepEP, MicroBatch, and eplb

@@ -149,6 +150,7 @@ LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by
 <li>Phi-3.5-MoE (16x3.8B)</li>
 <li>Phi-4-mini (3.8B)</li>
 <li>MiniCPM3 (4B)</li>
+<li>gpt-oss (20B, 120B)</li>
 </ul>
 </td>
 <td>

@@ -204,7 +206,7 @@ conda activate lmdeploy
 pip install lmdeploy
 ```

-The default prebuilt package is compiled on **CUDA 12** since v0.3.0.
+The default prebuilt package is compiled on **CUDA 12.8** since v0.10.0.
 For more information on installing on CUDA 11+ platform, or for instructions on building from source, please refer to the [installation guide](docs/en/get_started/installation.md).

 ## Offline Batch Inference
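For orientation on the Offline Batch Inference section the diff ends at, here is a minimal sketch of that workflow, assuming the `pipeline` API is unchanged in v0.10.0 and using `internlm/internlm2_5-7b-chat` purely as an example model ID:

```python
# Minimal offline batch inference sketch (assumes lmdeploy>=0.10.0 is
# installed and the example model below is available locally or on the hub).
from lmdeploy import pipeline

# Build an inference pipeline; the engine (TurboMind or PyTorch) is chosen
# automatically based on the model and the installed backends.
pipe = pipeline('internlm/internlm2_5-7b-chat')

# Batch inference: pass a list of prompts, get a list of responses back.
responses = pipe(['Hi, please introduce yourself.', 'Shanghai is'])
for r in responses:
    print(r.text)
```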

README_ja.md

Lines changed: 0 additions & 4 deletions

@@ -23,10 +23,6 @@

 ## Latest News 🎉

-<details open>
-<summary><b>2025</b></summary>
-</details>
-
 <details close>
 <summary><b>2024</b></summary>

README_zh-CN.md (translated from Chinese)

Lines changed: 3 additions & 1 deletion

@@ -27,6 +27,7 @@
 <summary><b>2025</b></summary>
 </details>

+- [2025/09] The TurboMind engine supports MXFP4 on NVIDIA V100 and later GPUs, reaching 1.5x the performance of vLLM when serving openai gpt-oss models on H800!
 - [2025/06] Deep inference optimization for FP8 MoE models
 - [2025/06] Integrated [DLSlime](https://github.com/DeepLink-org/DLSlime) and [Mooncake](https://github.com/kvcache-ai/Mooncake) to enable DeepSeek PD disaggregated deployment. Sincere thanks to both teams!
 - [2025/04] Integrated the deepseek-ai components FlashMLA, DeepGemm, DeepEP, MicroBatch, and eplb to improve DeepSeek inference performance

@@ -150,6 +151,7 @@ The LMDeploy TurboMind engine delivers excellent inference capability across models of all sizes
 <li>Phi-3.5-MoE (16x3.8B)</li>
 <li>Phi-4-mini (3.8B)</li>
 <li>MiniCPM3 (4B)</li>
+<li>gpt-oss (20B, 120B)</li>
 </ul>
 </td>
 <td>

@@ -205,7 +207,7 @@ conda activate lmdeploy
 pip install lmdeploy
 ```

-Since v0.3.0, LMDeploy prebuilt packages are compiled on CUDA 12 by default. To install LMDeploy on CUDA 11+, or to build it from source, please refer to the [installation guide](docs/zh_cn/get_started/installation.md)
+Since v0.10.0, LMDeploy prebuilt packages are compiled on CUDA 12.8 by default. To install LMDeploy on CUDA 11+, or to build it from source, please refer to the [installation guide](docs/zh_cn/get_started/installation.md)

 ## Offline Batch Inference

docs/en/get_started/installation.md

Lines changed: 2 additions & 2 deletions

@@ -23,7 +23,7 @@
 The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy by:

 ```shell
-export LMDEPLOY_VERSION=0.9.2
+export LMDEPLOY_VERSION=0.10.0
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```

@@ -51,7 +51,7 @@ DISABLE_TURBOMIND=1 pip install git+https://github.com/InternLM/lmdeploy.git
 If you prefer a specific version instead of the `main` branch of LMDeploy, you can specify it in your command:

 ```shell
-pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.9.2.zip
+pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.0.zip
 ```

 If you want to build LMDeploy with support for Ascend, Cambricon, or MACA, install LMDeploy with the corresponding `LMDEPLOY_TARGET_DEVICE` environment variable.
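After installing with either command, a quick check can confirm that the wheel and the local CUDA toolchain line up. A hedged sketch, assuming PyTorch is present (lmdeploy pulls it in as a dependency):

```python
# Post-install sanity check: confirm the installed lmdeploy version and the
# CUDA version the local PyTorch build targets (12.x for the default wheel,
# 11.8 for the cu118 wheel installed above).
import torch
import lmdeploy

print('lmdeploy:', lmdeploy.__version__)   # expect '0.10.0'
print('torch CUDA:', torch.version.cuda)   # e.g. '12.8' or '11.8'
print('GPU available:', torch.cuda.is_available())
```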

docs/en/supported_models/supported_models.md

Lines changed: 1 addition & 0 deletions

@@ -47,6 +47,7 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
 | GLM4 | 9B | LLM | Yes | Yes | Yes | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - |
 | Molmo | 7B-D,72B | MLLM | Yes | Yes | Yes | No |
+| gpt-oss | 20B,120B | LLM | Yes | Yes | Yes | Yes |

 "-" means not verified yet.
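Since this table adds gpt-oss to the TurboMind matrix, here is a hedged sketch of running the newly listed model; `openai/gpt-oss-20b` is an assumed model ID and `session_len` an example setting, not values taken from this commit:

```python
# Sketch: run the newly supported gpt-oss model on the TurboMind engine.
# 'openai/gpt-oss-20b' is an assumed model ID; point it at your checkpoint.
from lmdeploy import pipeline, TurbomindEngineConfig

pipe = pipeline(
    'openai/gpt-oss-20b',
    backend_config=TurbomindEngineConfig(session_len=8192),  # example setting
)
print(pipe(['Summarize what MXFP4 quantization is.'])[0].text)
```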

docs/zh_cn/get_started/installation.md (translated from Chinese)

Lines changed: 2 additions & 2 deletions

@@ -23,7 +23,7 @@
 The default prebuilt package is compiled on **CUDA 12**. If you need CUDA 11+ (>=11.3), you can install lmdeploy with:

 ```shell
-export LMDEPLOY_VERSION=0.9.2
+export LMDEPLOY_VERSION=0.10.0
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```

@@ -51,7 +51,7 @@ DISABLE_TURBOMIND=1 pip install git+https://github.com/InternLM/lmdeploy.git
 If you prefer a specific version instead of LMDeploy's `main` branch, you can specify it in the command:

 ```shell
-pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.9.2.zip
+pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.0.zip
 ```

 If you want to build LMDeploy with support for Ascend, Cambricon, or MACA, install it with the corresponding `LMDEPLOY_TARGET_DEVICE` environment variable.

docs/zh_cn/supported_models/supported_models.md (translated from Chinese)

Lines changed: 1 addition & 0 deletions

@@ -47,6 +47,7 @@
 | GLM4 | 9B | LLM | Yes | Yes | Yes | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - |
 | Molmo | 7B-D,72B | MLLM | Yes | Yes | Yes | No |
+| gpt-oss | 20B,120B | LLM | Yes | Yes | Yes | Yes |

 "-" means not verified yet.

lmdeploy/version.py

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 # Copyright (c) OpenMMLab. All rights reserved.
 from typing import Tuple

-__version__ = '0.9.2'
+__version__ = '0.10.0'
 short_version = __version__
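The `from typing import Tuple` import suggests this file also exposes the version as a tuple, but the rest of the module is not shown in the diff. The following is therefore a hypothetical sketch of such a helper, not the module's actual code:

```python
# Hypothetical sketch of a version-tuple helper consistent with the imports
# shown above; the real lmdeploy/version.py may differ.
from typing import Tuple

__version__ = '0.10.0'
short_version = __version__

def parse_version_info(version_str: str) -> Tuple[int, int, int]:
    """Parse a 'major.minor.patch' string, e.g. '0.10.0' -> (0, 10, 0)."""
    major, minor, patch = (int(p) for p in version_str.split('.')[:3])
    return (major, minor, patch)

version_info = parse_version_info(__version__)  # (0, 10, 0)
```

A tuple form makes version comparisons safe: `(0, 10, 0) > (0, 9, 2)` holds, whereas the string comparison `'0.10.0' > '0.9.2'` would be False.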
