
Commit adbf0ba

Add backends: ONNX & OpenVINO + ONNX optimization, quantization (#2712)
* Add OpenVINO support
* Fix OpenVINO test on Windows
* Expand OpenVINO support (remote models); add ONNX backend; tests. Also update push_to_hub
* Move OV test to test_backends
* Update push_to_hub test monkeypatching
* Remove some dead code
* Skip multi-process tests for now (incompatible with OpenVINO imports; still works independently)
* Move export_optimized_onnx_model to backend.py
* Update __init__ to address the export_optimized_onnx_model move
* Remove dot in commit message
* Add PR description for export_optimized_onnx_model
* OpenVINO will override export=False; update tests
* Add dynamic quantization exporting; docs; benchmarks, etc.
* Require transformers 4.41.0 for eval_strategy, etc.
* Restrict optimum-intel rather than optimum
* Use subfolder rather than relying only on file_name (relies on the upcoming optimum and optimum-intel versions; this is expected to fail until then)
* Add link to OVBaseModel.from_pretrained
* Add tips pointing to the new efficiency docs
* Another pointer to the new efficiency docs
* Expand the benchmark details
* Update min. requirements to optimum 1.23.0 & optimum-intel 1.20.0

---------

Co-authored-by: Tom Aarsen <[email protected]>
1 parent 130bdea · commit adbf0ba
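
This commit adds ONNX and OpenVINO inference backends plus helpers for ONNX graph optimization and dynamic quantization. The snippet below is a minimal sketch of how the new backends are selected, based on the commit description and the documentation changes further down; the `backend` keyword, the model name, and the example sentence are illustrative rather than lines taken from this diff.

```python
from sentence_transformers import SentenceTransformer

# Sketch only: backend selection via a `backend` keyword is assumed from the
# commit description; "all-MiniLM-L6-v2" is just an example model.
onnx_model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")     # needs the "onnx" extra
embeddings = onnx_model.encode(["Backends can speed up CPU inference."])

# The OpenVINO backend is assumed to be selected the same way.
ov_model = SentenceTransformer("all-MiniLM-L6-v2", backend="openvino")   # needs the "openvino" extra
```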

File tree: 23 files changed, +1327 lines, -62 lines


.github/workflows/tests.yml

Lines changed: 1 addition & 1 deletion

@@ -61,7 +61,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          python -m pip install '.[dev,train]'
+          python -m pip install '.[train, onnx, openvino, dev]'

       - name: Run unit tests
         run: |

docs/_static/css/custom.css

Lines changed: 1 addition & 0 deletions

@@ -41,6 +41,7 @@ dl.class > dt {
     border-color: rgb(55 65 81);
     background-color: #e3e3e3;
     color: #404040; /* Override the colors imposed by <a href> */
+    max-width: 18rem;
 }

 .components > .box:nth-child(1) > .header {

docs/conf.py

Lines changed: 2 additions & 0 deletions

@@ -43,6 +43,7 @@
     "sphinx.ext.intersphinx",
     "sphinx.ext.linkcode",
     "sphinx_inline_tabs",
+    "sphinxcontrib.mermaid",
 ]

 # Add any paths that contain templates here, relative to this directory.
@@ -68,6 +69,7 @@
     "datasets": ("https://huggingface.co/docs/datasets/main/en/", None),
     "transformers": ("https://huggingface.co/docs/transformers/main/en/", None),
     "huggingface_hub": ("https://huggingface.co/docs/huggingface_hub/main/en/", None),
+    "optimum": ("https://huggingface.co/docs/optimum/main/en/", None),
     "torch": ("https://pytorch.org/docs/stable/", None),
 }

(Two image files added: 63.2 KB and 57.4 KB; previews not shown in this view.)

docs/installation.md

Lines changed: 60 additions & 2 deletions

@@ -1,10 +1,14 @@
 # Installation

-We recommend **Python 3.8+**, **[PyTorch 1.11.0+](https://pytorch.org/get-started/locally/)**, and **[transformers v4.34.0+](https://github.com/huggingface/transformers)**. There are three options to install Sentence Transformers:
+We recommend **Python 3.8+**, **[PyTorch 1.11.0+](https://pytorch.org/get-started/locally/)**, and **[transformers v4.41.0+](https://github.com/huggingface/transformers)**. There are 5 extra options to install Sentence Transformers:
 * **Default:** This allows for loading, saving, and inference (i.e., getting embeddings) of models.
-* **Default and Training**: All of the above plus training.
+* **ONNX:** This allows for loading, saving, inference, optimizing, and quantizing of models using the ONNX backend.
+* **OpenVINO:** This allows for loading, saving, and inference of models using the OpenVINO backend.
+* **Default and Training**: Like **Default**, plus training.
 * **Development**: All of the above plus some dependencies for developing Sentence Transformers, see [Editable Install](#editable-install).

+Note that you can mix and match the various extras, e.g. ``pip install -U "sentence-transformers[train, onnx-gpu]"``
+
 ## Install with pip

 ```eval_rst
@@ -15,6 +19,24 @@ We recommend **Python 3.8+**, **[PyTorch 1.11.0+](https://pytorch.org/get-starte

       pip install -U sentence-transformers

+.. tab:: ONNX
+
+   For GPU and CPU:
+   ::
+
+      pip install -U "sentence-transformers[onnx-gpu]"
+
+   For CPU only:
+   ::
+
+      pip install -U "sentence-transformers[onnx]"
+
+.. tab:: OpenVINO
+
+   ::
+
+      pip install -U "sentence-transformers[openvino]"
+
 .. tab:: Default and Training

    ::
@@ -47,6 +69,24 @@ We recommend **Python 3.8+**, **[PyTorch 1.11.0+](https://pytorch.org/get-starte

       conda install -c conda-forge sentence-transformers

+.. tab:: ONNX
+
+   For GPU and CPU:
+   ::
+
+      pip install -U "sentence-transformers[onnx-gpu]"
+
+   For CPU only:
+   ::
+
+      pip install -U "sentence-transformers[onnx]"
+
+.. tab:: OpenVINO
+
+   ::
+
+      pip install -U "sentence-transformers[openvino]"
+
 .. tab:: Default and Training

    ::
@@ -81,6 +121,24 @@ You can install ``sentence-transformers`` directly from source to take advantage

       pip install git+https://github.com/UKPLab/sentence-transformers.git

+.. tab:: ONNX
+
+   For GPU and CPU:
+   ::
+
+      pip install -U "sentence-transformers[onnx-gpu] @ git+https://github.com/UKPLab/sentence-transformers.git"
+
+   For CPU only:
+   ::
+
+      pip install -U "sentence-transformers[onnx] @ git+https://github.com/UKPLab/sentence-transformers.git"
+
+.. tab:: OpenVINO
+
+   ::
+
+      pip install -U "sentence-transformers[openvino] @ git+https://github.com/UKPLab/sentence-transformers.git"
+
 .. tab:: Default and Training

    ::
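
After installing one of the extras above, a quick way to confirm the optional backend dependencies are present is to probe for their modules. This is a hedged sketch: it assumes the ``onnx`` extra pulls in ``onnxruntime`` and the ``openvino`` extra pulls in ``openvino``, which is not spelled out in this diff.

```python
import importlib.util

# Assumed module names: the "onnx" extra is expected to install onnxruntime,
# and the "openvino" extra to install openvino.
for extra, module in [("onnx", "onnxruntime"), ("openvino", "openvino")]:
    available = importlib.util.find_spec(module) is not None
    print(f"[{extra}] {module} importable: {available}")
```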

docs/package_reference/util.md

Lines changed: 6 additions & 0 deletions

@@ -7,6 +7,12 @@
    :members: paraphrase_mining, semantic_search, community_detection, http_get, truncate_embeddings, normalize_embeddings, is_training_available, mine_hard_negatives
 ```

+## Model Optimization
+```eval_rst
+.. automodule:: sentence_transformers.backend
+   :members: export_optimized_onnx_model, export_dynamic_quantized_onnx_model
+```
+
 ## Similarity Metrics

 ```eval_rst
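
The two helpers documented above live in ``sentence_transformers.backend``. Below is a minimal sketch of how they might be invoked, assuming string presets for the optimization level and quantization target and a positional output path; the exact argument names and accepted values are defined in the API reference generated from this module.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.backend import (
    export_dynamic_quantized_onnx_model,
    export_optimized_onnx_model,
)

# Both helpers are assumed to operate on an ONNX-backed model.
model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")

# Assumed usage: "O3" as an Optimum optimization preset, saved to a local directory.
export_optimized_onnx_model(model, "O3", "local-minilm-optimized")

# Assumed usage: "avx512_vnni" as a dynamic quantization preset.
export_dynamic_quantized_onnx_model(model, "avx512_vnni", "local-minilm-quantized")
```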

docs/quickstart.rst

Lines changed: 6 additions & 1 deletion

@@ -23,6 +23,7 @@ Once you have `installed <installation.html>`_ Sentence Transformers, you can ea

 - :meth:`SentenceTransformer.similarity_pairwise <sentence_transformers.SentenceTransformer.similarity_pairwise>`
 - `SentenceTransformer > Usage <./sentence_transformer/usage/usage.html>`_
+- `SentenceTransformer > Usage > Speeding up Inference <./sentence_transformer/usage/efficiency.html>`_
 - `SentenceTransformer > Pretrained Models <./sentence_transformer/pretrained_models.html>`_
 - `SentenceTransformer > Training Overview <./sentence_transformer/training_overview.html>`_
 - `SentenceTransformer > Dataset Overview <./sentence_transformer/dataset_overview.html>`_
@@ -55,10 +56,14 @@ Once you have `installed <installation.html>`_ Sentence Transformers, you can ea
     # [0.6660, 1.0000, 0.1411],
     # [0.1046, 0.1411, 1.0000]])

-With ``SentenceTransformer("all-MiniLM-L6-v2")`` we pick which `Sentence Transformer model <https://huggingface.co/models?library=sentence-transformers>`_ we load. In this example, we load `all-MiniLM-L6-v2 <https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2>`_, which is a MiniLM model finetuned on a large dataset of over 1 billion training pairs. Using `SentenceTransformer.similarity() <./package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.similarity>`_, we compute the similarity between all pairs of sentences. As expected, the similarity between the first two sentences (0.6660) is higher than the similarity between the first and the third sentence (0.1046) or the second and the third sentence (0.1411).
+With ``SentenceTransformer("all-MiniLM-L6-v2")`` we pick which `Sentence Transformer model <https://huggingface.co/models?library=sentence-transformers>`_ we load. In this example, we load `all-MiniLM-L6-v2 <https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2>`_, which is a MiniLM model finetuned on a large dataset of over 1 billion training pairs. Using :meth:`SentenceTransformer.similarity() <sentence_transformers.SentenceTransformer.similarity>`, we compute the similarity between all pairs of sentences. As expected, the similarity between the first two sentences (0.6660) is higher than the similarity between the first and the third sentence (0.1046) or the second and the third sentence (0.1411).

 Finetuning Sentence Transformer models is easy and requires only a few lines of code. For more information, see the `Training Overview <./sentence_transformer/training_overview.html>`_ section.

+.. tip::
+
+   Read `Sentence Transformer > Usage > Speeding up Inference <sentence_transformer/usage/efficiency.html>`_ for tips on how to speed up inference of models by up to 2x-3x.
+
 Cross Encoder
 -------------

docs/requirements.txt

Lines changed: 1 addition & 0 deletions

@@ -6,4 +6,5 @@ sphinx_markdown_tables==0.0.17
 recommonmark==0.7.1
 sphinx-copybutton==0.5.2
 sphinx_inline_tabs==2023.4.21
+sphinxcontrib-mermaid==0.8.1
 -e ..

docs/sentence_transformer/pretrained_models.md

Lines changed: 4 additions & 0 deletions

@@ -31,6 +31,10 @@ similarities = model.similarity(embeddings, embeddings)

 - **Model sizes**: it is recommended to filter away the large models that might not be feasible without excessive hardware.
 - **Experimentation is key**: models that perform well on the leaderboard do not necessarily do well on your tasks, it is **crucial** to experiment with various promising models.
+
+.. tip::
+
+   Read `Sentence Transformer > Usage > Speeding up Inference <./usage/efficiency.html>`_ for tips on how to speed up inference of models by up to 2x-3x.
 ```

 ## Original Models
