Commit d6e6d29

XiaotongJiang authored and gitbook-bot committed
GITBOOK-20: No subject
1 parent e6bbf86 commit d6e6d29

1 file changed: +17 -19 lines changed

sglang-cookbook/gpt-oss/usage-guide.md

Lines changed: 17 additions & 19 deletions
@@ -4,13 +4,13 @@

 {% stepper %}
 {% step %}
-### Install SGLang
+#### Install SGLang

-Following [the instruction](https://app.gitbook.com/o/TvLfyTxdRQeudJH7e5QW/s/FFtIWT8LEMaYiYzz0p8P/~/changes/11/sglang-cookbook/installation/nvidia-h-series-a-series-and-rtx-gpus)
+Following [the instruction](../installation/nvidia-h-series-a-series-and-rtx-gpus.md)
 {% endstep %}

 {% step %}
-### Serve the model
+#### Serve the model

 {% code overflow="wrap" %}
 ```bash
@@ -28,7 +28,7 @@ python3 -m sglang.launch_server --model-path openai/gpt-oss-120b --mem-fraction-
 {% endstep %}

 {% step %}
-### Benchmark
+#### Benchmark

 SGLang version (0.5.1)

@@ -50,13 +50,13 @@ SGLang version (0.5.1)

 {% stepper %}
 {% step %}
-### Install SGLang
+#### Install SGLang

-Following [the instruction](https://app.gitbook.com/o/TvLfyTxdRQeudJH7e5QW/s/FFtIWT8LEMaYiYzz0p8P/~/changes/11/sglang-cookbook/installation/nvidia-h-series-a-series-and-rtx-gpus)
+Following [the instruction](../installation/nvidia-h-series-a-series-and-rtx-gpus.md)
 {% endstep %}

 {% step %}
-### Serve the model
+#### Serve the model

 {% code overflow="wrap" %}
 ```bash
@@ -67,23 +67,23 @@ python3 -m sglang.launch_server --model-path openai/gpt-oss-120b --tp 2
 {% endstep %}

 {% step %}
-### Benchmark
+#### Benchmark

-<table><thead><tr><th width="209.78515625">BS/Input/Output Length</th><th width="109.6328125">TTFT(s)</th><th width="101.75390625">ITL(ms)</th><th>Input Throughput</th><th>Output Throughput</th></tr></thead><tbody><tr><td colspan="5" style="text-align: center;">Benchmark results will be added here</td></tr></tbody></table>
+<table><thead><tr><th width="209.78515625">BS/Input/Output Length</th><th width="109.6328125">TTFT(s)</th><th width="101.75390625">ITL(ms)</th><th>Input Throughput</th><th>Output Throughput</th></tr></thead><tbody><tr><td>Benchmark results will be added here</td><td></td><td></td><td></td><td></td></tr></tbody></table>
 {% endstep %}
 {% endstepper %}

 ### <mark style="background-color:green;">Serving with 1 x B200</mark>

 {% stepper %}
 {% step %}
-### Install SGLang
+#### Install SGLang

 Following [the instruction](../installation/nvidia-blackwell-gpus.md)
 {% endstep %}

 {% step %}
-### Serve the model
+#### Serve the model

 {% code overflow="wrap" %}
 ```bash
@@ -101,6 +101,12 @@ python3 -m sglang.launch_server --model-path openai/gpt-oss-120b
 {% endstep %}

 {% step %}
+#### Benchmark
+
+<table><thead><tr><th width="209.78515625">BS/Input/Output Length</th><th width="109.6328125">TTFT(s)</th><th width="101.75390625">ITL(ms)</th><th>Input Throughput</th><th>Output Throughput</th></tr></thead><tbody><tr><td>Benchmark results will be added here</td><td></td><td></td><td></td><td></td></tr></tbody></table>
+{% endstep %}
+{% endstepper %}
+
 ### With Speculative Decoding

 {% code overflow="wrap" %}
@@ -121,14 +127,6 @@ python3 -m sglang.launch_server --model openai/gpt-oss-120b --speculative-algo E
 python3 -m sglang.launch_server --model openai/gpt-oss-120b --speculative-algo EAGLE3 --speculative-draft lmsys/EAGLE3-gpt-oss-120b-bf16 --speculative-num-steps 5 --speculative-eagle-topk 4 --speculative-num-draft-tokens 8 --attention-backend triton --tp 4
 ```
 {% endcode %}
-{% endstep %}
-
-{% step %}
-### Benchmark
-
-<table><thead><tr><th width="209.78515625">BS/Input/Output Length</th><th width="109.6328125">TTFT(s)</th><th width="101.75390625">ITL(ms)</th><th>Input Throughput</th><th>Output Throughput</th></tr></thead><tbody><tr><td colspan="5" style="text-align: center;">Benchmark results will be added here</td></tr></tbody></table>
-{% endstep %}
-{% endstepper %}

 ### Responses API & Built-in Tools

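Note on the benchmark table columns: the placeholder tables in this diff list TTFT(s), ITL(ms), Input Throughput, and Output Throughput, but the commit does not say how they are measured. The sketch below shows one common way to derive them from per-token arrival times. `RequestTrace` and the helper functions are hypothetical names for illustration, not SGLang types, and throughput is assumed here to be total tokens divided by wall-clock time across the batch, which may differ from the definition the guide's authors use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RequestTrace:
    """Timing of one request (hypothetical structure, not an SGLang type)."""
    start: float              # time the request was sent, in seconds
    token_times: List[float]  # arrival time of each output token, in seconds
    input_tokens: int         # number of prompt tokens


def ttft_seconds(trace: RequestTrace) -> float:
    """Time to first token: delay between sending and the first output token."""
    return trace.token_times[0] - trace.start


def mean_itl_ms(trace: RequestTrace) -> float:
    """Mean inter-token latency: average gap between consecutive tokens, in ms."""
    gaps = [b - a for a, b in zip(trace.token_times, trace.token_times[1:])]
    return 1000.0 * sum(gaps) / len(gaps) if gaps else 0.0


def throughputs(traces: List[RequestTrace]) -> Tuple[float, float]:
    """Return (input tokens/s, output tokens/s) over the batch wall-clock time.

    Assumption: both are measured against the full duration of the batch.
    """
    begin = min(t.start for t in traces)
    end = max(t.token_times[-1] for t in traces)
    wall = end - begin
    total_in = sum(t.input_tokens for t in traces)
    total_out = sum(len(t.token_times) for t in traces)
    return total_in / wall, total_out / wall


if __name__ == "__main__":
    # Synthetic timings for illustration only; not measured results.
    demo = RequestTrace(start=0.0, token_times=[0.5, 1.0, 1.5, 2.0], input_tokens=100)
    print(ttft_seconds(demo), mean_itl_ms(demo), throughputs([demo]))
```

The demo trace uses made-up timestamps; real rows for the table would come from timings recorded against the servers launched by the commands in the guide.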