
Commit b9e6223

Authored by jeremyfowers, amd-pworfolk, and danielholanda

Release v5.0.3: Lemonade installer and examples, repo reorg, and lots more (#275)
- Docs, test, and examples have been moved into `turnkey` (CNNs and Transformers) vs. `lemonade` (LLMs) directories (@jeremyfowers).
  - For example: `docs/lemonade/getting_started.md` instead of `docs/lemonade_getting_started.md`.
- Track the memory utilization of any `lemonade` or `turnkey` command and plot it on a graph by setting the `--memory` option (@amd-pworfolk).
- Add examples and demo applications for the high-level LEAP APIs in `examples/lemonade` (@jeremyfowers).
- Add LEAP support for all OGA backends (@jeremyfowers).
- Extend the `llm-prompt` tool to make it more useful for model and framework validation (@amd-pworfolk).
- Updates and fixes to lemonade test code in `llm_api.py` (@jeremyfowers).
- Fix `not_enough_tokens` bug on `oga-bench` (@danielholanda).

Co-authored-by: amd-pworfolk <[email protected]>
Co-authored-by: Daniel Holanda <[email protected]>
1 parent 30eb2eb commit b9e6223


78 files changed: +1555 additions, −561 deletions

.github/workflows/test_lemonade.yml

Lines changed: 10 additions & 2 deletions
```diff
@@ -45,6 +45,7 @@ jobs:
         shell: bash -el {0}
         run: |
           pylint src/lemonade --rcfile .pylintrc --disable E0401
+          pylint examples --rcfile .pylintrc --disable E0401,E0611 --jobs=1
       - name: Test HF+CPU server
         if: runner.os == 'Windows'
         timeout-minutes: 10
@@ -55,7 +56,14 @@ jobs:
       - name: Run lemonade tests
         shell: bash -el {0}
         run: |
-          lemonade -i facebook/opt-125m huggingface-load llm-prompt -p "hi" --max-new-tokens 10
-          python test/llm_api.py
+          # Test CLI
+          lemonade -m -i facebook/opt-125m huggingface-load llm-prompt -p "hi" --max-new-tokens 10
 
+          # Test low-level APIs
+          python test/lemonade/llm_api.py
+
+          # Test high-level LEAP APIs
+          python examples/lemonade/leap_basic.py
+          python examples/lemonade/leap_streaming.py
```

.github/workflows/test_lemonade_oga_cpu.yml

Lines changed: 8 additions & 1 deletion
```diff
@@ -55,6 +55,13 @@ jobs:
         env:
           HF_TOKEN: "${{ secrets.HUGGINGFACE_ACCESS_TOKEN }}" # Required by OGA model_builder in OGA 0.4.0 but not future versions
         run: |
+          # Test CLI
           lemonade -i TinyPixel/small-llama2 oga-load --device cpu --dtype int4 llm-prompt -p "tell me a story" --max-new-tokens 5
-          python test/oga_cpu_api.py
+
+          # Test low-level APIs
+          python test/lemonade/oga_cpu_api.py
+
+          # Test high-level LEAP APIs
+          python examples/lemonade/leap_oga_cpu.py
+          python examples/lemonade/leap_oga_cpu_streaming.py
```

.github/workflows/test_turnkey.yml

Lines changed: 16 additions & 47 deletions
```diff
@@ -8,6 +8,11 @@ on:
     branches: ["main", "canary", "refresh"]
   pull_request:
     branches: ["main", "canary", "refresh"]
+    paths:
+      - src/turnkeyml/**
+      - test/turnkey/**
+      - examples/turnkey/**
+      - .github/workflows/test_turnkey.yml
 
 permissions:
   contents: read
@@ -50,68 +55,32 @@ jobs:
         shell: bash -el {0}
         run: |
           # Unit tests
-          python test/unit.py
+          python test/turnkey/unit.py
 
           # turnkey examples
           # Note: we clear the default cache location prior to each block of example runs
           rm -rf ~/.cache/turnkey
-          python examples/api/onnx_opset.py --onnx-opset 15
-          python examples/api/loading_a_build.py
+          python examples/turnkey/api/onnx_opset.py --onnx-opset 15
+          python examples/turnkey/api/loading_a_build.py
 
           rm -rf ~/.cache/turnkey
-          turnkey -i examples/cli/scripts/hello_world.py discover export-pytorch benchmark
+          turnkey -i examples/turnkey/cli/scripts/hello_world.py discover export-pytorch benchmark
           rm -rf ~/.cache/turnkey
-          turnkey -i examples/cli/scripts/multiple_invocations.py discover export-pytorch benchmark
+          turnkey -i examples/turnkey/cli/scripts/multiple_invocations.py discover export-pytorch benchmark
           rm -rf ~/.cache/turnkey
-          turnkey -i examples/cli/scripts/max_depth.py discover --max-depth 1 export-pytorch benchmark
+          turnkey -i examples/turnkey/cli/scripts/max_depth.py discover --max-depth 1 export-pytorch benchmark
           rm -rf ~/.cache/turnkey
-          turnkey -i examples/cli/scripts/two_models.py discover export-pytorch benchmark
+          turnkey -i examples/turnkey/cli/scripts/two_models.py discover export-pytorch benchmark
           rm -rf ~/.cache/turnkey
-          turnkey -i examples/cli/onnx/hello_world.onnx load-onnx benchmark
+          turnkey -i examples/turnkey/cli/onnx/hello_world.onnx load-onnx benchmark
 
           # E2E tests
-          cd test/
+          cd test/turnkey
           python cli.py
           python analysis.py
       - name: Test example plugins
         shell: bash -el {0}
         run: |
           rm -rf ~/.cache/turnkey
-          pip install -e examples/cli/plugins/example_tool
-          turnkey -i examples/cli/scripts/hello_world.py discover export-pytorch example-plugin-tool benchmark
-      # - name: Install and Start Slurm
-      #   if: runner.os != 'Windows'
-      #   shell: bash -el {0}
-      #   run: |
-      #     sudo apt update -y
-      #     sudo apt install slurm-wlm -y
-      #     cp test/helpers/slurm.conf test/helpers/slurm_modified.conf
-      #     sed -i "s/YOUR_HOSTNAME_HERE/$HOSTNAME/" test/helpers/slurm_modified.conf
-      #     sudo mv test/helpers/slurm_modified.conf /etc/slurm/slurm.conf
-      #     sudo service slurmd start
-      #     sudo service slurmctld start
-      #     sudo service munge start
-      # - name: Test turnkey on Slurm
-      #   if: runner.os != 'Windows'
-      #   shell: bash -el {0}
-      #   run: |
-      #     # Create conda environment for Slurm using srun (sbatch + wait)
-      #     export SKIP_REQUIREMENTS_INSTALL="True"
-      #     export TORCH_CPU="True"
-      #     srun src/turnkeyml/cli/setup_venv.sh
-
-      #     # Run tests on Slurm
-      #     export TURNKEY_SLURM_USE_DEFAULT_MEMORY="True"
-      #     turnkey -i models/selftest/linear.py --use-slurm --cache-dir local_cache discover export-pytorch
-      #     bash test/helpers/check_slurm_output.sh slurm-2.out
-
-      # Below tests are commented out as the GitHub runner runs out of space installing the requirements
-      # - name: Check installation of requirements.txt and their compatibility with turnkey
-      #   shell: bash -el {0}
-      #   run: |
-      #     conda create --name test-requirements python=3.8
-      #     conda activate test-requirements
-      #     pip install -r models/requirements.txt
-      #     python -m pip check
-      #     python -c "import torch_geometric"
-      #     conda deactivate
+          pip install -e examples/turnkey/cli/plugins/example_tool
+          turnkey -i examples/turnkey/cli/scripts/hello_world.py discover export-pytorch example-plugin-tool benchmark
```
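The `paths` filter added to the `pull_request` trigger above means this workflow only runs when turnkey-related files change. As a rough sketch of how such a trigger decision behaves, the hypothetical helper below checks a list of changed files against the watched patterns. Note the simplification: GitHub's matcher uses gitignore-style `**` semantics, while Python's `fnmatch` lets `*` match across `/`, which happens to give the same answer for these patterns.

```python
from fnmatch import fnmatch

# Patterns mirroring the workflow's `paths` filter (with * standing in for **).
WATCHED_PATTERNS = [
    "src/turnkeyml/*",
    "test/turnkey/*",
    "examples/turnkey/*",
    ".github/workflows/test_turnkey.yml",
]

def should_run_ci(changed_files):
    """Return True if any changed file matches a watched pattern."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in WATCHED_PATTERNS
    )

# A change under src/turnkeyml/ triggers the workflow; a docs-only change does not.
print(should_run_ci(["src/turnkeyml/state.py"]))
print(should_run_ci(["docs/lemonade/getting_started.md"]))
```

This mirrors the intent of the filter: turnkey CI is skipped for lemonade-only or docs-only pull requests, which keeps the two tool suites' test loads independent.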

README.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -7,10 +7,10 @@
 
 We are on a mission to make it easy to use the most important tools in the ONNX ecosystem. TurnkeyML accomplishes this by providing no-code CLIs and low-code APIs for both general ONNX workflows with `turnkey` as well as LLMs with `lemonade`.
 
-| [**Lemonade**](https://github.com/onnx/turnkeyml/tree/main/src/turnkeyml/llm) | [**Turnkey**](https://github.com/onnx/turnkeyml/blob/main/docs/classic_getting_started.md) |
+| [**Lemonade**](https://github.com/onnx/turnkeyml/tree/main/src/turnkeyml/llm) | [**Turnkey**](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/getting_started.md) |
 |:----------------------------------------------: |:-----------------------------------------------------------------: |
-| Serve and benchmark LLMs on CPU, GPU, and NPU. <br/> [Click here to get started with `lemonade`.](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade_getting_started.md) | Export and optimize ONNX models for CNNs and Transformers. <br/> [Click here to get started with `turnkey`.](https://github.com/onnx/turnkeyml/blob/main/docs/classic_getting_started.md) |
-| <img src="img/llm_demo.png"/> | <img src="img/classic_demo.png"/> |
+| Serve and benchmark LLMs on CPU, GPU, and NPU. <br/> [Click here to get started with `lemonade`.](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/getting_started.md) | Export and optimize ONNX models for CNNs and Transformers. <br/> [Click here to get started with `turnkey`.](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/getting_started.md) |
+| <img src="https://github.com/onnx/turnkeyml/blob/main/img/llm_demo.png?raw=true"/> | <img src="https://github.com/onnx/turnkeyml/blob/main/img/classic_demo.png?raw=true"/> |
 
 
 ## How It Works
```

docs/code.md

Lines changed: 5 additions & 5 deletions
```diff
@@ -5,9 +5,9 @@
 The TurnkeyML source code has a few major top-level directories:
 - `docs`: documentation for the entire project.
 - `examples`: example scripts for use with the TurnkeyML tools.
-  - `examples/cli`: tutorial series starting in `examples/cli/readme.md` to help learn the `turnkey` CLI.
-  - `examples/cli/scripts`: example scripts that can be fed as input into the `turnkey` CLI. These scripts each have a docstring that recommends one or more `turnkey` CLI commands to try out.
-  - `examples/api`: examples scripts that invoke `Tools` via APIs.
+  - `examples/turnkey/cli`: tutorial series starting in `examples/turnkey/cli/readme.md` to help learn the `turnkey` CLI.
+  - `examples/turnkey/cli/scripts`: example scripts that can be fed as input into the `turnkey` CLI. These scripts each have a docstring that recommends one or more `turnkey` CLI commands to try out.
+  - `examples/turnkey/api`: examples scripts that invoke `Tools` via APIs.
 - `models`: the corpora of models that makes up the TurnkeyML models (see [the models readme](https://github.com/onnx/turnkeyml/blob/main/models/readme.md)).
   - Each subdirectory under `models` represents a corpus of models pulled from somewhere on the internet. For example, `models/torch_hub` is a corpus of models from [Torch Hub](https://github.com/pytorch/hub).
 - `src/turnkeyml`: source code for the TurnkeyML package.
@@ -20,8 +20,8 @@ The TurnkeyML source code has a few major top-level directories:
   - `src/turnkeyml/state.py`: implements the `State` class.
   - `src/turnkeyml/files_api.py`: implements the `evaluate_files()` API, which is the top-level API called by the CLI.
 - `test`: tests for the TurnkeyML tools.
-  - `test/analysis.py`: tests focusing on the `discover` `Tool`.
-  - `test/cli.py`: tests focusing on top-level CLI features.
+  - `test/turnkey/analysis.py`: tests focusing on the `discover` `Tool`.
+  - `test/turnkey/cli.py`: tests focusing on top-level CLI features.
 
 ## Tool Classes
```

docs/contribute.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -88,7 +88,7 @@ We require the following naming scheme:
 
 ### Example
 
-See the [example_tool](https://github.com/onnx/turnkeyml/tree/main/examples/cli/plugins/example_tool) plugin for an example.
+See the [example_tool](https://github.com/onnx/turnkeyml/tree/main/examples/turnkey/cli/plugins/example_tool) plugin for an example.
 
 The `__init__.py` file with its `implements` dictionary looks like:
```

docs/lemonade_getting_started.md renamed to docs/lemonade/getting_started.md

Lines changed: 13 additions & 3 deletions
```diff
@@ -64,6 +64,16 @@ That command will run a few warmup iterations, then a few generation iterations
 
 The prompt size, number of output tokens, and number iterations are all parameters. Learn more by running `lemonade huggingface-bench -h`.
 
+## Memory Usage
+
+The peak memory used by the lemonade build is captured in the build output. To capture more granular
+memory usage information, use the `--memory` flag. For example:
+
+`lemonade -i facebook/opt-125m --memory huggingface-load huggingface-bench`
+
+In this case a `memory_usage.png` file will be generated and stored in the build folder. This file
+contains a figure plotting the memory usage over the build time. Learn more by running `lemonade -h`.
+
 ## Serving
 
 You can launch a WebSocket server for your LLM with:
@@ -111,9 +121,9 @@ You can also try Phi-3-Mini-128k-Instruct with the following commands:
 
 `lemonade -i microsoft/Phi-3-mini-4k-instruct oga-load --device igpu --dtype int4 serve`
 
-You can learn more about the CPU and iGPU support in our [OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/ort_genai_igpu.md).
+You can learn more about the CPU and iGPU support in our [OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_igpu.md).
 
-> Note: early access to AMD's RyzenAI NPU is also available. See the [RyzenAI NPU OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/ort_genai_npu.md) for more information.
+> Note: early access to AMD's RyzenAI NPU is also available. See the [RyzenAI NPU OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_npu.md) for more information.
 
 ## Install RyzenAI NPU for PyTorch
 
@@ -131,7 +141,7 @@ If you decide to contribute, please:
 
 - do so via a pull request.
 - write your code in keeping with the same style as the rest of this repo's code.
-- add a test under `test/llm_api.py` that provides coverage of your new feature.
+- add a test under `test/lemonade/llm_api.py` that provides coverage of your new feature.
 
 The best way to contribute is to add new tools to cover more devices and usage scenarios.
```
3 files renamed without changes.
