@@ -21,6 +21,10 @@ To export an LLM to an ONNX model, use [llm-export](https://github.com/wangzhaode/llm
2121| Qwen-7B-Chat | [![Download][download-qwen-7b-chat-onnx]][release-qwen-7b-chat-onnx] | [![Download][download-qwen-7b-chat-mnn]][release-qwen-7b-chat-mnn] |
2222| Baichuan2-7B-Chat | [![Download][download-baichuan2-7b-chat-onnx]][release-baichuan2-7b-chat-onnx] | [![Download][download-baichuan2-7b-chat-mnn]][release-baichuan2-7b-chat-mnn] |
2323| Llama-2-7b-chat | [![Download][download-llama2-7b-chat-onnx]][release-llama2-7b-chat-onnx] | [![Download][download-llama2-7b-chat-mnn]][release-llama2-7b-chat-mnn] |
24+ | Qwen-1_8B-Chat | [![Download][download-qwen-1.8b-onnx]][release-qwen-1.8b-onnx] | [![Download][download-qwen-1.8b-mnn]][release-qwen-1.8b-mnn] |
25+
26+ Other versions:
27+ - Qwen-1_8B-Chat-int8: [![Download][download-qwen-1.8b-mnn-int8]][release-qwen-1.8b-mnn-int8]
2428
2529[download-chatglm-6b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/chatglm-6b-onnx/total
2630[download-chatglm2-6b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/chatglm2-6b-onnx/total
@@ -29,30 +33,38 @@ To export an LLM to an ONNX model, use [llm-export](https://github.com/wangzhaode/llm
2933[download-qwen-7b-chat-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen-7b-chat-onnx/total
3034[download-baichuan2-7b-chat-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/baichuan2-7b-chat-onnx/total
3135[download-llama2-7b-chat-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/llama2-7b-chat-onnx/total
36+ [download-qwen-1.8b-onnx]: https://img.shields.io/github/downloads/wangzhaode/llm-export/qwen-1.8b-onnx/total
3237[release-chatglm-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm-6b-onnx
3338[release-chatglm2-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm2-6b-onnx
3439[release-chatglm3-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/chatglm3-6b-onnx
3540[release-codegeex2-6b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/codegeex2-6b-onnx
3641[release-qwen-7b-chat-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen-7b-chat-onnx
3742[release-baichuan2-7b-chat-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/baichuan2-7b-chat-onnx
3843[release-llama2-7b-chat-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/llama2-7b-chat-onnx
44+ [release-qwen-1.8b-onnx]: https://github.com/wangzhaode/llm-export/releases/tag/qwen-1.8b-onnx
3945[download-chatglm-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm-6b-mnn/total
4046[download-chatglm2-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm2-6b-mnn/total
4147[download-chatglm3-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/chatglm3-6b-mnn/total
4248[download-codegeex2-6b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/codegeex2-6b-mnn/total
4349[download-qwen-7b-chat-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-7b-chat-mnn/total
4450[download-baichuan2-7b-chat-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/baichuan2-7b-chat-mnn/total
4551[download-llama2-7b-chat-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/llama2-7b-chat-mnn/total
52+ [download-qwen-1.8b-mnn]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-mnn/total
53+ [download-qwen-1.8b-mnn-int8]: https://img.shields.io/github/downloads/wangzhaode/mnn-llm/qwen-1.8b-mnn-int8/total
4654[release-chatglm-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm-6b-mnn
4755[release-chatglm2-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm2-6b-mnn
4856[release-chatglm3-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/chatglm3-6b-mnn
4957[release-codegeex2-6b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/codegeex2-6b-mnn
5058[release-qwen-7b-chat-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-7b-chat-mnn
5159[release-baichuan2-7b-chat-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/baichuan2-7b-chat-mnn
5260[release-llama2-7b-chat-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/llama2-7b-chat-mnn
61+ [release-qwen-1.8b-mnn]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-mnn
62+ [release-qwen-1.8b-mnn-int8]: https://github.com/wangzhaode/mnn-llm/releases/tag/qwen-1.8b-mnn-int8
5363
5464### Speed
5565
66+ #### CPU speed with 4 threads: `prefill / decode` in `tok/s`
67+
5668| model | android(f16/32)| macos (f32) | linux (f32) | windows (f32) |
5769| :-----------------:| :--------------:| :-------------:| :--------------:| :--------------:|
5870| qwen-1.8b-int4 | 100.21 / 22.22 | 84.85 / 19.93 | 151.00 / 35.89 | 117.30 / 33.40 |
@@ -64,19 +76,16 @@ To export an LLM to an ONNX model, use [llm-export](https://github.com/wangzhaode/llm
6476| baichuan2-7b-int4 | 13.87 / 6.08 | 17.21 / 6.10 | 30.11 / 10.87 | 26.31 / 9.84 |
6577| llama-2-7b-int4 | 17.98 / 5.17 | 19.72 / 5.06 | 34.47 / 9.29 | 28.66 / 8.90 |
6678
67- - android
68- - Test device: XiaoMi12
69- - Processor: Snapdragon 8gen1
70- - Memory: 8 GB
71- - macos
72- - Test device: MacBook Pro 2019
73- - Processor: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
74- - Memory: 16 GB
75- - linux(wsl)/windows
76- - Test device: PC
77- - Processor: Intel(R) Core(TM) i7-13700K @ 3.40 GHz
78- - Memory: 32 GB
79- - CPU speed with 4 threads: prefill / decode `tok/s`
79+ The test systems and devices are as follows:
80+
81+ | os | device | CPU | Memory |
82+ | :--:| :-------:| :----:| :--------:|
83+ | android | XiaoMi12 | Snapdragon 8gen1 | 8 GB |
84+ | macos | MacBook Pro 2019 | Intel(R) Core(TM) i7-9750H CPU | 16 GB |
85+ | linux | PC | Intel(R) Core(TM) i7-13700K | 32 GB |
86+ | windows | PC | Intel(R) Core(TM) i7-13700K | 32 GB |
87+
88+
8089
8190
8291### Download int4 models