Commit 32393df

Add oumi[quantization] optional dependency (#1902)
1 parent: 48131d7

6 files changed: +28 −14 lines

configs/examples/quantization/README.md
Lines changed: 7 additions & 3 deletions

@@ -4,6 +4,8 @@
 
 This directory contains example configurations for model quantization using Oumi's AWQ and BitsAndBytes quantization methods.
 
+> **NOTE**: Quantization requires a GPU to run.
+
 ## Configuration Files
 
 - **`awq_quantization_config.yaml`** - AWQ 4-bit quantization with calibration

@@ -15,7 +17,7 @@ This directory contains example configurations for model quantization using Oumi
 # Simplest command-line usage
 oumi quantize --method awq_q4_0 --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --output quantized_model
 
-# Using configuration file (requires GPU)
+# Using configuration file
 oumi quantize --config configs/examples/quantization/awq_quantization_config.yaml
 ```

@@ -40,10 +42,12 @@ oumi quantize --config configs/examples/quantization/awq_quantization_config.yam
 ## Requirements
 
 ```bash
-# For AWQ quantization
+pip install oumi[quantization]
+
+# Alternatively, for AWQ quantization only
 pip install autoawq
 
-# For BitsAndBytes quantization
+# Alternatively, for BitsAndBytes quantization only
 pip install bitsandbytes
 ```
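
As a usage note alongside this hunk: the two CLI invocations above can also be scripted. A minimal sketch in Python, assuming the `oumi` CLI is on PATH and a GPU is available per the new NOTE (the flags are copied verbatim from the README):

```python
import subprocess

# Run the README's "simplest command-line usage" example as a smoke test.
# Assumes the `oumi` CLI is installed and a CUDA GPU is present.
result = subprocess.run(
    [
        "oumi", "quantize",
        "--method", "awq_q4_0",
        "--model", "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        "--output", "quantized_model",
    ],
    capture_output=True,
    text=True,
)
print("exit code:", result.returncode)
```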

docs/user_guides/quantization.md
Lines changed: 6 additions & 2 deletions

@@ -4,6 +4,8 @@
 
 This guide covers the `oumi quantize` command for reducing model size while maintaining performance.
 
+> **NOTE**: Quantization requires a GPU to run.
+
 ## Quick Start
 
 ```bash

@@ -93,10 +95,12 @@ Currently supported output formats:
 ## Installation
 
 ```bash
-# For AWQ quantization
+pip install oumi[quantization]
+
+# Alternatively, for AWQ quantization only
 pip install autoawq
 
-# For BitsAndBytes quantization
+# Alternatively, for BitsAndBytes quantization only
 pip install bitsandbytes
 ```
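
After either install path above, it is worth confirming that the optional backends actually import. A minimal probe, with the caveat that the import names are assumptions (autoawq is conventionally imported as `awq`):

```python
import importlib.util

# Distribution name -> assumed import name for each optional backend.
BACKENDS = {"autoawq": "awq", "bitsandbytes": "bitsandbytes"}

for dist, module in BACKENDS.items():
    found = importlib.util.find_spec(module) is not None
    print(f"{dist}: {'installed' if found else 'missing'}")
```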

notebooks/Oumi - A Tour.ipynb
Lines changed: 1 addition & 1 deletion

@@ -493,7 +493,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.8"
+   "version": "3.11.13"
   }
  },
  "nbformat": 4,

notebooks/Oumi - Quantization Tutorial.ipynb
Lines changed: 11 additions & 7 deletions

@@ -31,13 +31,17 @@
    "\n",
    "⚠️ **DEVELOPMENT STATUS**: The quantization feature is currently under active development. Some features may change in future releases.\n",
    "\n",
-   "First, let's install Oumi with GPU support and the required quantization libraries:\n",
-   "\n",
-   "```bash\n",
-   "pip install oumi[gpu]\n",
-   "pip install autoawq\n",
-   "pip install triton==3.0.0 # Required for AWQ inference compatibility\n",
-   "```"
+   "First, let's install Oumi with GPU support and the required quantization libraries:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "metadata": {},
+  "outputs": [],
+  "source": [
+   "%pip install oumi[gpu,quantization]\n",
+   "%pip install triton==3.0.0 # Required for AWQ inference compatibility"
   ]
  },
 {
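
This hunk replaces a markdown install block with an executable code cell; the `%pip` magic matters in notebooks because it installs into the running kernel's environment, unlike a copy-pasted shell `pip`. For anyone making a similar edit programmatically, a sketch using `nbformat` (the path and cell placement here are illustrative, not part of the commit, which instead splits an existing markdown cell):

```python
import nbformat

# Load the notebook, append an install cell, and write it back.
path = "notebooks/Oumi - Quantization Tutorial.ipynb"  # illustrative path
nb = nbformat.read(path, as_version=4)
cell = nbformat.v4.new_code_cell(
    "%pip install oumi[gpu,quantization]\n"
    "%pip install triton==3.0.0 # Required for AWQ inference compatibility"
)
nb.cells.append(cell)
nbformat.write(nb, path)
```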

pyproject.toml
Lines changed: 2 additions & 0 deletions

@@ -154,6 +154,8 @@ evaluation = [
     "sentencepiece>=0.1.98",
 ]
 
+quantization = ["autoawq>=0.2.0,<0.3", "bitsandbytes>=0.45.0,<0.46"]
+
 bitnet = ["onebitllms>=0.0.3"]
 
 cambrian = [
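
The new extra pins both backends to narrow ranges. A minimal sketch (assuming the third-party `packaging` library) for checking an environment against exactly these pins:

```python
from importlib.metadata import PackageNotFoundError, version
from packaging.specifiers import SpecifierSet

# The version ranges pinned by the new `quantization` extra.
PINS = {
    "autoawq": SpecifierSet(">=0.2.0,<0.3"),
    "bitsandbytes": SpecifierSet(">=0.45.0,<0.46"),
}

for name, spec in PINS.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed")
        continue
    # SpecifierSet membership accepts plain version strings.
    verdict = "ok" if installed in spec else f"outside {spec}"
    print(f"{name} {installed}: {verdict}")
```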

src/oumi/quantize/awq_quantizer.py
Lines changed: 1 addition & 1 deletion

@@ -60,7 +60,7 @@ def raise_if_requirements_not_met(self):
         if self._awq is None:
             raise RuntimeError(
                 "AWQ quantization requires autoawq library.\n"
-                "Install with: `pip install autoawq`\n"
+                "Install with: `pip install oumi[quantization]`\n"
             )
 
         if not torch.cuda.is_available():
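
For context, the message being edited comes from a lazy-import guard: the quantizer stores the optional module (or None) and raises before doing any work if it is absent. A self-contained sketch of that pattern, where everything except the error text is an assumption (the real method also rejects machines where `torch.cuda.is_available()` is False):

```python
import importlib
import importlib.util


class AwqGuardSketch:
    """Illustrative stand-in for the quantizer's requirements check."""

    def __init__(self) -> None:
        # Resolve the optional dependency lazily; autoawq's import
        # name is assumed to be `awq`. Missing -> None, not an error.
        self._awq = (
            importlib.import_module("awq")
            if importlib.util.find_spec("awq") is not None
            else None
        )

    def raise_if_requirements_not_met(self) -> None:
        if self._awq is None:
            raise RuntimeError(
                "AWQ quantization requires autoawq library.\n"
                "Install with: `pip install oumi[quantization]`\n"
            )
```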
