Merged

Changes from 35 commits (36 commits in total):
372b2c5  AUTOTEST: add device type for ascend (littlegy, Aug 27, 2025)
b9ad171  Merge branch 'InternLM:main' into test (littlegy, Aug 28, 2025)
28907ca  AOTOTEST: fix lint (littlegy, Aug 29, 2025)
ecee43a  Merge branch 'InternLM:main' into test (littlegy, Aug 29, 2025)
fb57ada  AUTOTEST: add pipeline test timeout (littlegy, Aug 29, 2025)
4869f64  AUTOTEST: fix lint flake8 (littlegy, Aug 29, 2025)
d8f85d4  Create api_eva.yml (littlegy, Aug 29, 2025)
82188f7  WORKFLOW: add ascend workflow (littlegy, Sep 4, 2025)
ddaa36c  WORKFLOW: update ascend runner (littlegy, Sep 4, 2025)
2cda270  fix yaml (littlegy, Sep 4, 2025)
755cee6  fix yaml ii (littlegy, Sep 4, 2025)
28f4df6  fix yaml ii (littlegy, Sep 4, 2025)
0ef80d6  fix yaml ii (littlegy, Sep 4, 2025)
c6c6183  fix yaml ii (littlegy, Sep 4, 2025)
39b6dc3  fix yaml ii (littlegy, Sep 4, 2025)
88c8461  fix yaml ii (littlegy, Sep 4, 2025)
82cec68  update ascend (littlegy, Sep 5, 2025)
49b3d59  update ascend (littlegy, Sep 5, 2025)
4b591d4  update ascend (littlegy, Sep 5, 2025)
ee49e02  AUTOTEST: update hw yml (littlegy, Sep 10, 2025)
85bac1e  AUTOTEST: fix hw yml (littlegy, Sep 10, 2025)
06568e2  AUTOTEST: add ascend device (littlegy, Sep 10, 2025)
c1ae0b1  CI: fix yml (littlegy, Sep 10, 2025)
55b2f76  CI: add pip cache (littlegy, Sep 10, 2025)
b012c5a  CI: add pip cache (littlegy, Sep 10, 2025)
43e6666  CI: add pip cache (littlegy, Sep 10, 2025)
b3dcf40  CI: add pip cache (littlegy, Sep 10, 2025)
b8ace1d  Merge branch 'InternLM:main' into hw_runner (littlegy, Sep 11, 2025)
dd8bac7  TEST: update ascend test (littlegy, Sep 11, 2025)
9bda185  TEST: rm api eval (littlegy, Sep 11, 2025)
9ce6864  TEST: update chat test (littlegy, Sep 11, 2025)
4e62a33  TEST: fix tmp dir (littlegy, Sep 11, 2025)
ff2ed1d  Merge branch 'main' into hw_runner (littlegy, Sep 12, 2025)
858cb5a  TEST: fix lint (littlegy, Sep 12, 2025)
a786543  TEST: update ascend config (littlegy, Sep 15, 2025)
5e06c9f  TEST: rm ascend config (littlegy, Sep 16, 2025)
171 changes: 171 additions & 0 deletions .github/workflows/daily_ete_test_ascend.yml
@@ -0,0 +1,171 @@
name: daily_ete_test_ascend

on:
  workflow_dispatch:
    inputs:
      repo_org:
        required: false
        description: 'Tested repository organization name. Default is InternLM'
        type: string
        default: 'InternLM/lmdeploy'
      repo_ref:
        required: false
        description: 'Set branch or tag or commit id. Default is "main"'
        type: string
        default: 'main'
      backend:
        required: true
        description: 'Set backend testcase filter: turbomind or pytorch or turbomind, pytorch. Default is "["turbomind", "pytorch"]"'
        type: string
        default: "['pytorch']"
      model:
        required: true
        description: 'Set testcase module filter: llm, vllm. Default contains all models'
        type: string
        default: "['llm']"
      function:
        required: true
        description: 'Set testcase function filter: chat, restful, pipeline. Default contains all functions'
        type: string
        default: '["pipeline", "restful", "chat"]'
      regression_func:
        required: true
        description: 'regression functions'
        type: string
        default: "['tools']"

env:
  REPORT_DIR: /test/test-reports/${{ github.run_id }}
  COV_PARAM: --cov /usr/local/python3.10.17/lib/python3.10/site-packages/lmdeploy
  FAIL_CONFIG: ${{ github.event_name == 'push' && github.run_attempt != 1 && '--lf --lfnf none' || '--lf' }}
  TEST_CODE_PATH: /test/test_pkg/lmdeploy/${{ github.run_id }}
  LOG_PATH: /test/log/${{ github.run_id }}
  TMPDIR: /mnt/deeplink/docker-tmp/qa_tmp
  RAY_TMPDIR: /mnt/deeplink/docker-tmp/qa_tmp/ray

jobs:
  download_pkgs:
    if: ${{ !cancelled() }}
    runs-on: [self-hosted, ascend-013]
    timeout-minutes: 50
    container:
      image: crpi-4crprmm5baj1v8iv.cn-hangzhou.personal.cr.aliyuncs.com/lmdeploy_dlinfer/ascend:910b-latest
      options: "--device=/dev/davinci0 --device=/dev/davinci1 --device=/dev/davinci2 --device=/dev/davinci3 --device=/dev/davinci4 --device=/dev/davinci5 --device=/dev/davinci6 --device=/dev/davinci7 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -e PIP_CACHE_DIR=/root/.cache/pip --shm-size=150g --pull never"
      volumes:
        - /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
        - /usr/local/sbin:/usr/local/sbin:ro
        - /var/log/npu/slog:/var/log/npu/slog
        - /var/log/npu/profiling:/var/log/npu/profiling
        - /var/log/npu/dump:/var/log/npu/dump
        - /var/log/npu:/usr/slog
        - /etc/hccn.conf:/etc/hccn.conf:ro
        - /root/qa_test:/test
        - /mnt:/mnt
        - /root/.cache/pip:/root/.cache/pip
    steps:
      - name: Clone repository
        uses: actions/checkout@v2
        if: ${{ !cancelled() }}
        with:
          repository: ${{ github.event.inputs.repo_org || 'InternLM/lmdeploy' }}
          ref: ${{ github.event.inputs.repo_ref || 'main' }}
      - name: Copy repository
        if: ${{ !cancelled() }}
        run: rm -rf ${{ env.TEST_CODE_PATH }} && mkdir ${{ env.TEST_CODE_PATH }} && cp -r . ${{ env.TEST_CODE_PATH }}

  test_tools:
    if: ${{ !cancelled() && contains(fromJSON(github.event.inputs.regression_func), 'tools') }}
    runs-on: [self-hosted, ascend-013]
    needs: download_pkgs
    timeout-minutes: 300
    strategy:
      fail-fast: false
      matrix:
        backend: ${{ fromJSON(inputs.backend || '["turbomind", "pytorch"]') }}
        model: ${{ fromJSON(inputs.model || '["llm", "mllm"]') }}
        function: ${{ fromJSON(inputs.function || '["pipeline","restful","chat"]') }}
        exclude:
          - backend: turbomind
            model: mllm
            function: chat
          - backend: pytorch
            model: mllm
            function: chat
    container:
      image: crpi-4crprmm5baj1v8iv.cn-hangzhou.personal.cr.aliyuncs.com/lmdeploy_dlinfer/ascend:910b-latest
      options: "--device=/dev/davinci0 --device=/dev/davinci1 --device=/dev/davinci2 --device=/dev/davinci3 --device=/dev/davinci4 --device=/dev/davinci5 --device=/dev/davinci6 --device=/dev/davinci7 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -e PIP_CACHE_DIR=/root/.cache/pip --shm-size=150g --pull never"
      volumes:
        - /usr/local/Ascend/driver:/usr/local/Ascend/driver
        - /usr/local/sbin:/usr/local/sbin
        - /var/log/npu/slog:/var/log/npu/slog
        - /var/log/npu/profiling:/var/log/npu/profiling
        - /var/log/npu/dump:/var/log/npu/dump
        - /var/log/npu:/usr/slog
        - /etc/hccn.conf:/etc/hccn.conf
        - /root/qa_test:/test
        - /mnt:/mnt
        - /root/.cache/pip:/root/.cache/pip
    steps:
      - name: Copy repository and Artifacts
        run: |
          cp -r ${{ env.TEST_CODE_PATH }}/. .
      - name: Install lmdeploy - offline
        run: |
          python3 -m pip install -r requirements_ascend.txt -i https://mirrors.aliyun.com/pypi/simple/
      - name: Install lmdeploy - test
        run: |
          python3 -m pip install -r requirements/test.txt -i https://mirrors.aliyun.com/pypi/simple/
      - name: Check env
        run: |
          python3 -m pip list
          lmdeploy check_env
          rm -rf allure-results
          # remove tmp log in testcase
          rm -rf ${{ env.LOG_PATH }}/*
          mkdir ${{ env.REPORT_DIR }}/.pytest_cache -p
          ln -s ${{ env.REPORT_DIR }}/.pytest_cache autotest
      - name: Test lmdeploy - chat
        continue-on-error: true
        if: ${{ (matrix.backend == 'pytorch' || matrix.backend == 'turbomind') && matrix.model == 'llm' && matrix.function == 'chat' }}
        run: |
          pytest autotest/tools/chat/test_command_chat_hf_${{ matrix.backend }}.py -m 'gpu_num_1 and test_ascend' -n 8 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S') || true
          pytest autotest/tools/chat/test_command_chat_hf_${{ matrix.backend }}.py -m 'gpu_num_2 and test_ascend' -n 4 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/chat/test_command_chat_hf_${{ matrix.backend }}.py -m 'gpu_num_4 and test_ascend' -n 2 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/chat/test_command_chat_hf_${{ matrix.backend }}.py -m 'gpu_num_8 and test_ascend' --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
      - name: Test lmdeploy - pipeline
        continue-on-error: true
        if: ${{ matrix.function == 'pipeline' }}
        run: |
          pytest autotest/tools/pipeline/test_pipeline_chat_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_1 and test_ascend' -n 8 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S') || true
          pytest autotest/tools/pipeline/test_pipeline_chat_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_2 and test_ascend' -n 4 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/pipeline/test_pipeline_chat_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_4 and test_ascend' -n 2 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/pipeline/test_pipeline_chat_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_8 and test_ascend' --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
      - name: Test lmdeploy - restful
        continue-on-error: true
        if: ${{ matrix.function == 'restful' }}
        run: |
          pytest autotest/tools/restful/test_restful_chat_hf_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_1 and test_ascend' -n 8 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S') || true
          pytest autotest/tools/restful/test_restful_chat_hf_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_2 and test_ascend' -n 4 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/restful/test_restful_chat_hf_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_4 and test_ascend' -n 2 --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
          pytest autotest/tools/restful/test_restful_chat_hf_${{ matrix.backend }}_${{ matrix.model }}.py -m 'gpu_num_8 and test_ascend' --device ascend --alluredir=${{ env.REPORT_DIR }} ${{ env.COV_PARAM }} || true
          mv .coverage ${{ env.REPORT_DIR }}/.coverage.$(date +'%Y%m%d%H%M%S')
      - name: Clear workfile
        if: always()
        run: |
          chmod -R 777 $REPORT_DIR
          export workdir=$(pwd)
          cd ..
          rm -rf $workdir
          mkdir $workdir
          chmod -R 777 $workdir
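For reference, the backend, model, and function inputs above are JSON-encoded lists that fromJSON expands into the test_tools matrix before the exclude entries are applied. A minimal Python sketch of that expansion, with plain lists standing in for the parsed inputs (illustrative only, not lmdeploy code):

from itertools import product

# Matrix axes as the workflow defaults parse, e.g. fromJSON("['pytorch']").
backends = ['pytorch']
models = ['llm']
functions = ['pipeline', 'restful', 'chat']

# The two exclude entries from the test_tools job.
excludes = [
    {'backend': 'turbomind', 'model': 'mllm', 'function': 'chat'},
    {'backend': 'pytorch', 'model': 'mllm', 'function': 'chat'},
]

combos = [{'backend': b, 'model': m, 'function': f}
          for b, m, f in product(backends, models, functions)]
# A combination is dropped when it matches every key of an exclude entry.
jobs = [c for c in combos
        if not any(all(c[k] == v for k, v in ex.items()) for ex in excludes)]
print(jobs)  # with the defaults: three pytorch/llm jobs (pipeline, restful, chat)

Note also how each test step pairs the xdist worker count with the per-test card count (gpu_num_1 with -n 8, gpu_num_2 with -n 4, gpu_num_4 with -n 2, gpu_num_8 serial), so every tier can use all eight mounted davinci devices.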
10 changes: 8 additions & 2 deletions autotest/benchmark/test_throughput_performance.py
@@ -1,3 +1,5 @@
+import os
+
 import pytest
 from utils.benchmark_utils import throughput_test
 from utils.config_utils import get_benchmark_model_list, get_cuda_id_by_workerid, get_cuda_prefix_by_workerid
@@ -92,11 +94,15 @@ def test_throughput_func_tp2(config, run_id, run_config, worker_id):
         'tp_num': 1
     }])
 def test_throughput_prtest_tp1(config, run_id, run_config, worker_id):
+    device_type = os.environ.get('DEVICE', 'cuda')
+    if device_type == 'ascend':
+        env_var = 'ASCEND_RT_VISIBLE_DEVICES='
+    else:
+        env_var = 'CUDA_VISIBLE_DEVICES='
     result, msg = throughput_test(config,
                                   run_id,
                                   run_config,
-                                  cuda_prefix='CUDA_VISIBLE_DEVICES=' +
-                                  str(int(get_cuda_id_by_workerid(worker_id)) + 5),
+                                  cuda_prefix=f'{env_var}' + str(int(get_cuda_id_by_workerid(worker_id)) + 5),
                                   worker_id=worker_id,
                                   is_smoke=True)

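The effect of the change is easiest to see with concrete values. A minimal sketch, assuming get_cuda_id_by_workerid('gw0') returns '0' for pytest-xdist worker gw0 (that return value is an assumption for illustration); the + 5 offset moves the smoke test onto the upper cards:

import os

# Assume the run was started with `pytest --device ascend`, so conftest
# has exported DEVICE=ascend.
os.environ['DEVICE'] = 'ascend'

device_type = os.environ.get('DEVICE', 'cuda')
env_var = ('ASCEND_RT_VISIBLE_DEVICES=' if device_type == 'ascend'
           else 'CUDA_VISIBLE_DEVICES=')

worker_card = 0  # stand-in for int(get_cuda_id_by_workerid('gw0'))
prefix = env_var + str(worker_card + 5)
print(prefix)  # ASCEND_RT_VISIBLE_DEVICES=5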
35 changes: 35 additions & 0 deletions autotest/config-ascend.yaml
@@ -0,0 +1,35 @@
model_path: /mnt/deeplink/group01/deeplink-test/weight
resource_path: /nvme/qa_test_models/resource
dst_path: /nvme/qa_test_models/autotest_model
log_path: /test/log
benchmark_path: /nvme/qa_test_models/benchmark-reports
dataset_path: /nvme/qa_test_models/datasets/ShareGPT_V3_unfiltered_cleaned_split.json
env_tag: a100
[Review comment from a Collaborator, on `env_tag: a100`]: What's this tag used for?

tp_config:
  Qwen2.5-32B-Instruct: 2

pytorch_chat_model:
- /Qwen3-0.6B

pytorch_vl_model:
- /Qwen3-0.6B

pytorch_base_model:
- /Qwen3-0.6B

pytorch_quatization:
  awq:
    - meta-llama/Meta-Llama-3-8B-Instruct
  w8a8:
    - meta-llama/Meta-Llama-3-8B-Instruct
  no_kvint4:
    - /Qwen3-0.6B
  no_kvint8:
    - /Qwen3-0.6B
[Review comment from a Collaborator]: On the Ascend platform, LMDeploy doesn't support kv quantization, so should we configure it?

longtext_model:
- /Qwen3-0.6B

benchmark_model:
- /Qwen3-0.6B
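For orientation, a minimal sketch of how a harness might resolve tensor parallelism from this file; the helper and the default of 1 are assumptions for illustration, not lmdeploy's actual utilities:

import yaml

# Load the device-specific config selected by the conftest change below.
with open('autotest/config-ascend.yaml') as f:
    config = yaml.safe_load(f)

def get_tp_num(config: dict, model: str) -> int:
    # Illustrative lookup: models listed in tp_config get their tp value,
    # everything else runs on a single card.
    return config.get('tp_config', {}).get(model, 1)

print(get_tp_num(config, 'Qwen2.5-32B-Instruct'))  # 2
print(get_tp_num(config, '/Qwen3-0.6B'))           # 1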
25 changes: 24 additions & 1 deletion autotest/conftest.py
@@ -10,7 +10,17 @@

 @pytest.fixture(scope='session')
 def config():
-    config_path = os.path.join(config_file)
+    # Use device-specific config file if DEVICE environment variable is set
+    device = os.environ.get('DEVICE', '')
+    if device:
+        device_config_path = f'autotest/config-{device}.yaml'
+        if os.path.exists(device_config_path):
+            config_path = device_config_path
+        else:
+            config_path = config_file
+    else:
+        config_path = config_file

     with open(config_path) as f:
         env_config = yaml.load(f.read(), Loader=yaml.SafeLoader)
     return env_config
@@ -34,8 +44,21 @@ def common_case_config():

 def pytest_addoption(parser):
     parser.addoption('--run_id', action='store', default='', help='github run_id')
+    parser.addoption('--device', action='store', default='', help='device config suffix')
+
+
+def pytest_configure(config):
+    # Set DEVICE environment variable before test execution
+    device = config.getoption('--device')
+    if device:
+        os.environ['DEVICE'] = device


 @pytest.fixture(scope='session')
 def run_id(request):
     return request.config.getoption('--run_id')
+
+
+@pytest.fixture(scope='session')
+def device(request):
+    return request.config.getoption('--device')
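Taken together, the --device option, pytest_configure, and the config fixture give each run a device-specific config file. A sketch of the resulting selection order, assuming the default config path is autotest/config.yaml (that name is illustrative):

import os

# `pytest ... --device ascend` makes pytest_configure export DEVICE=ascend;
# the config fixture then prefers autotest/config-ascend.yaml and falls
# back to the default config file when no device-specific file exists.
os.environ['DEVICE'] = 'ascend'

config_file = 'autotest/config.yaml'  # assumed default path
device = os.environ.get('DEVICE', '')
candidate = f'autotest/config-{device}.yaml' if device else ''
config_path = candidate if candidate and os.path.exists(candidate) else config_file
print(config_path)  # autotest/config-ascend.yaml when that file exists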