v0.7.3

lvhan028 released this 14 Apr 10:04


231a323

What's Changed

🚀 Features

Add Qwen3 and Qwen3MoE by @lzhangzz in #3305
[Feature] support qwen3 and qwen3-moe for pytorch engine by @CUHKSZzxy in #3315
[ascend] support deepseekv2 by @yao-fengchen in #3206
support ascend w8a8 graph_mode by @yao-fengchen in #3267
support Llama4 by @grimoire in #3408

💥 Improvements

Add spaces_between_special_tokens to /v1/interactive and make compatible with empty text by @AllentDan in #3283
add env var to control timeout by @CUHKSZzxy in #3291
optimize mla, remove load v by @grimoire in #3334
refactor dlinfer rope by @yao-fengchen in #3326
enable qwenvl2.5 graph mode on ascend by @jinminxi104 in #3367
Optimize ascend moe by @yao-fengchen in #3364
find port by @grimoire in #3429

🐞 Bug fixes

fix activation grid oversize by @grimoire in #3282
Set ensure_ascii=False for tool calling by @AllentDan in #3295
add v check by @grimoire in #3307
Fix Qwen3MoE config parsing by @lzhangzz in #3336
Fix finish reasons by @AllentDan in #3338
remove think_end_token_id in streaming content by @AllentDan in #3327
Fix the finish_reason by @AllentDan in #3350
support List[dict] prompt input without do_preprocess by @irexyc in #3385
fix tensor dispatch in dynamo by @wanfengcxz in #3417
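
The ensure_ascii=False fix (#3295) is easiest to see with a small example. The sketch below is not taken from the LMDeploy source; it only illustrates, with Python's standard json module and a made-up arguments dict, how the flag changes the way non-ASCII tool-call arguments are serialized.

```python
import json

# Hypothetical tool-call arguments containing non-ASCII text
# (illustrative only, not taken from LMDeploy).
arguments = {"city": "北京", "unit": "celsius"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are
# written as \uXXXX escape sequences.
escaped = json.dumps(arguments)

# With ensure_ascii=False the characters are emitted as-is, so the
# serialized arguments stay human-readable.
readable = json.dumps(arguments, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same dict, so the change affects only how the serialized text reads, not the data it carries.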

📚 Documentation

update ascend doc by @yao-fengchen in #3420

🌐 Other

bump version to v0.7.2.post1 by @lvhan028 in #3298
Optimize internvit by @caikun-pjlab in #3316
bump version to v0.7.3 by @lvhan028 in #3416

New Contributors

@wanfengcxz made their first contribution in #3417
@caikun-pjlab made their first contribution in #3316

Full Changelog: v0.7.2...v0.7.3

Contributors

grimoire, lvhan028, and 8 other contributors
