Commit c9b87ac

Merge branch 'master' into v4.0-release

2 parents: 363c6ed + 8d73d4f

9 files changed: +202 −152 lines

.github/workflows/tests.yml

Lines changed: 3 additions & 0 deletions
@@ -11,6 +11,9 @@ on:
       - v*-release
   workflow_dispatch:
 
+env:
+  TRANSFORMERS_IS_CI: 1
+
 jobs:
 
   test_sampling:
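
The new workflow-level env block sets TRANSFORMERS_IS_CI=1 for every job. A hedged sketch of what the tests can do with it (illustrative only; the actual consumers live in the test suite, not in this diff):

import os

# Illustrative: test helpers can branch on the variable set by the workflow
# to enable CI-only behavior such as stricter checks or extra logging.
if os.environ.get("TRANSFORMERS_IS_CI"):
    print("Running under CI")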

docs/.htaccess

Lines changed: 2 additions & 4 deletions
@@ -36,15 +36,13 @@ Redirect 301 /docs/pretrained-models/msmarco.html /docs/pretrained-models/msmarc
 Redirect 301 /docs/examples/training/sts/README.html /examples/sentence_transformer/training/sts/README.html
 
 # Moved example pages for v4.0
-Redirect 301 /examples/applications/README.html /examples/cross_encoder/applications/README.html
-Redirect 301 /examples/training/ms_marco/cross_encoder_README.html /examples/cross_encoder/training/ms_marco/cross_encoder_README.html
-Redirect 301 /examples/training/README.html /examples/cross_encoder/training/README.html
+Redirect 301 /examples/training/ms_marco/cross_encoder_README.html /examples/cross_encoder/training/ms_marco/README.html
+Redirect 301 /examples/applications/cross-encoder/README.html /examples/cross_encoder/applications/README.html
 Redirect 301 /examples/applications/clustering/README.html /examples/sentence_transformer/applications/clustering/README.html
 Redirect 301 /examples/applications/embedding-quantization/README.html /examples/sentence_transformer/applications/embedding-quantization/README.html
 Redirect 301 /examples/applications/image-search/README.html /examples/sentence_transformer/applications/image-search/README.html
 Redirect 301 /examples/applications/parallel-sentence-mining/README.html /examples/sentence_transformer/applications/parallel-sentence-mining/README.html
 Redirect 301 /examples/applications/paraphrase-mining/README.html /examples/sentence_transformer/applications/paraphrase-mining/README.html
-Redirect 301 /examples/applications/README.html /examples/sentence_transformer/applications/README.html
 Redirect 301 /examples/applications/retrieve_rerank/README.html /examples/sentence_transformer/applications/retrieve_rerank/README.html
 Redirect 301 /examples/applications/semantic-search/README.html /examples/sentence_transformer/applications/semantic-search/README.html
 Redirect 301 /examples/applications/text-summarization/README.html /examples/sentence_transformer/applications/text-summarization/README.html

docs/sentence_transformer/usage/efficiency.rst

Lines changed: 11 additions & 1 deletion
@@ -99,7 +99,11 @@ To convert a model to ONNX format, you can use the following code:
     sentences = ["This is an example sentence", "Each sentence is converted"]
     embeddings = model.encode(sentences)
 
-If the model path or repository already contains a model in ONNX format, Sentence Transformers will automatically use it. Otherwise, it will convert the model to ONNX the format.
+If the model path or repository already contains a model in ONNX format, Sentence Transformers will automatically use it. Otherwise, it will convert the model to the ONNX format.
+
+.. note::
+
+   If you wish to use the ONNX model outside of Sentence Transformers, you'll need to perform pooling and/or normalization yourself. The ONNX export only converts the Transformer component, which outputs token embeddings, not sentence embeddings. To get sentence embeddings, you'll need to apply the appropriate pooling strategy (like mean pooling) and any normalization that the original model uses.
 
 All keyword arguments passed via ``model_kwargs`` will be passed on to :meth:`ORTModel.from_pretrained <optimum.onnxruntime.ORTModel.from_pretrained>`. Some notable arguments include:

@@ -291,6 +295,12 @@ To convert a model to OpenVINO format, you can use the following code:
     sentences = ["This is an example sentence", "Each sentence is converted"]
     embeddings = model.encode(sentences)
 
+If the model path or repository already contains a model in OpenVINO format, Sentence Transformers will automatically use it. Otherwise, it will convert the model to the OpenVINO format.
+
+.. note::
+
+   If you wish to use the OpenVINO model outside of Sentence Transformers, you'll need to perform pooling and/or normalization yourself. The OpenVINO export only converts the Transformer component, which outputs token embeddings, not sentence embeddings. To get sentence embeddings, you'll need to apply the appropriate pooling strategy (like mean pooling) and any normalization that the original model uses.
+
 .. raw:: html
 
     All keyword arguments passed via <code>model_kwargs</code> will be passed on to <a href="https://huggingface.co/docs/optimum/intel/openvino/reference#optimum.intel.openvino.modeling_base.OVBaseModel.from_pretrained"><code style="color: #404040; font-weight: 700;">OVBaseModel.from_pretrained()</code></a>. Some notable arguments include:
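
Both added notes describe the same manual step: the exported graph stops at token embeddings, so pooling and normalization must be reapplied by hand. A minimal sketch of that step for an ONNX export, assuming a mean-pooling model such as sentence-transformers/all-MiniLM-L6-v2 (the "model.onnx" path and model name are assumptions for illustration, not part of this diff):

import numpy as np
import onnxruntime
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
session = onnxruntime.InferenceSession("model.onnx")  # assumed export path

inputs = tokenizer(["This is an example sentence"], padding=True, return_tensors="np")
# The exported Transformer returns token embeddings: (batch, seq_len, hidden).
# Some exports omit token_type_ids; drop any key the session does not expect.
token_embeddings = session.run(None, dict(inputs))[0]

# Mean pooling: average the token embeddings, ignoring padding positions
mask = inputs["attention_mask"][..., None].astype(np.float32)
sentence_embeddings = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1).clip(min=1e-9)

# L2 normalization, as used by models trained for cosine similarity
sentence_embeddings /= np.linalg.norm(sentence_embeddings, axis=1, keepdims=True)

The identical pooling applies to an OpenVINO export; only the runtime call changes.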

sentence_transformers/SentenceTransformer.py

Lines changed: 128 additions & 0 deletions
@@ -391,6 +391,134 @@ def get_backend(self) -> Literal["torch", "onnx", "openvino"]:
         """
         return self.backend
 
+    # Return a single tensor because we're passing a single sentence.
+    @overload
+    def encode(
+        self,
+        sentences: str,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: Literal["sentence_embedding", "token_embeddings"] = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: Literal[False] = ...,
+        convert_to_tensor: bool = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> Tensor: ...
+
+    # Return a single array, because convert_to_numpy is True
+    # and "sentence_embeddings" is passed
+    @overload
+    def encode(
+        self,
+        sentences: str | list[str] | np.ndarray,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: Literal["sentence_embedding"] = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: Literal[True] = ...,
+        convert_to_tensor: Literal[False] = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> np.ndarray: ...
+
+    # Return a single tensor, because convert_to_tensor is True
+    # and "sentence_embeddings" is passed
+    @overload
+    def encode(
+        self,
+        sentences: str | list[str] | np.ndarray,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: Literal["sentence_embedding"] = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: bool = ...,
+        convert_to_tensor: Literal[True] = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> Tensor: ...
+
+    # Return a list of tensors. Value of convert_ doesn't matter.
+    @overload
+    def encode(
+        self,
+        sentences: list[str] | np.ndarray,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: Literal["sentence_embedding", "token_embeddings"] = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: bool = ...,
+        convert_to_tensor: bool = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> list[Tensor]: ...
+
+    # Return a list of dict of features, ignore the conversion args.
+    @overload
+    def encode(
+        self,
+        sentences: list[str] | np.ndarray,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: None = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: bool = ...,
+        convert_to_tensor: bool = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> list[dict[str, Tensor]]: ...
+
+    # Return a dict of features, ignore the conversion args.
+    @overload
+    def encode(
+        self,
+        sentences: str,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: None = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: bool = ...,
+        convert_to_tensor: bool = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> dict[str, Tensor]: ...
+
+    # If "token_embeddings" is True, then the output is a single tensor.
+    @overload
+    def encode(
+        self,
+        sentences: str,
+        prompt_name: str | None = ...,
+        prompt: str | None = ...,
+        batch_size: int = ...,
+        show_progress_bar: bool | None = ...,
+        output_value: Literal["token_embeddings"] = ...,
+        precision: Literal["float32", "int8", "uint8", "binary", "ubinary"] = ...,
+        convert_to_numpy: bool = ...,
+        convert_to_tensor: bool = ...,
+        device: str | None = ...,
+        normalize_embeddings: bool = ...,
+        **kwargs,
+    ) -> Tensor: ...
+
     def encode(
         self,
         sentences: str | list[str] | np.ndarray,
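
These overloads replace the deleted SentenceTransformer.pyi stub (next file) so that type checkers can infer encode's return type directly from the call site. A small illustration of how the stubs resolve (the model name is an example, not taken from this diff; runtime behavior is unchanged):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model

# str input with the default convert_to_numpy=True resolves to np.ndarray
embedding = model.encode("This is an example sentence")

# list input with convert_to_tensor=True resolves to a single stacked Tensor
embeddings = model.encode(["First sentence", "Second sentence"], convert_to_tensor=True)

# output_value=None resolves to the raw feature dict instead of embeddings
features = model.encode("This is an example sentence", output_value=None)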

sentence_transformers/SentenceTransformer.pyi

Lines changed: 0 additions & 136 deletions
This file was deleted.

sentence_transformers/cross_encoder/CrossEncoder.py

Lines changed: 39 additions & 5 deletions
@@ -167,11 +167,8 @@ def __init__(
             token=token,
             **model_kwargs,
         )
-        if "model_max_length" not in tokenizer_kwargs:
-            if max_length is not None:
-                tokenizer_kwargs["model_max_length"] = max_length
-            elif hasattr(self.config, "max_position_embeddings"):
-                tokenizer_kwargs["model_max_length"] = self.config.max_position_embeddings
+        if "model_max_length" not in tokenizer_kwargs and max_length is not None:
+            tokenizer_kwargs["model_max_length"] = max_length
 
         self.tokenizer = AutoTokenizer.from_pretrained(
             model_name_or_path,

@@ -182,6 +179,8 @@ def __init__(
             token=token,
             **tokenizer_kwargs,
         )
+        if "model_max_length" not in tokenizer_kwargs and hasattr(self.config, "max_position_embeddings"):
+            self.tokenizer.model_max_length = min(self.tokenizer.model_max_length, self.config.max_position_embeddings)
 
         # Check if a readme exists
         model_card_path = load_file_path(

@@ -272,6 +271,10 @@ def num_labels(self) -> int:
     def max_length(self) -> int:
         return self.tokenizer.model_max_length
 
+    @max_length.setter
+    def max_length(self, value: int) -> None:
+        self.tokenizer.model_max_length = value
+
     @property
     @deprecated(
         "The `default_activation_function` property was renamed and is now deprecated. "
@@ -583,6 +586,37 @@ def push_to_hub(
         create_pr: bool = False,
         tags: list[str] | None = None,
     ) -> str:
+        """
+        Upload the CrossEncoder model to the Hugging Face Hub.
+
+        Example:
+            ::
+
+                from sentence_transformers import CrossEncoder
+
+                model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
+                model.push_to_hub("username/my-crossencoder-model")
+                # => "https://huggingface.co/username/my-crossencoder-model"
+
+        Args:
+            repo_id (str): The name of the repository on the Hugging Face Hub, e.g. "username/repo_name",
+                "organization/repo_name" or just "repo_name".
+            token (str, optional): The authentication token to use for the Hugging Face Hub API.
+                If not provided, will use the token stored via the Hugging Face CLI.
+            private (bool, optional): Whether to create a private repository. If not specified,
+                the repository will be public.
+            safe_serialization (bool, optional): Whether or not to convert the model weights in safetensors
+                format for safer serialization. Defaults to True.
+            commit_message (str, optional): The commit message to use for the push. Defaults to "Add new CrossEncoder model".
+            exist_ok (bool, optional): If True, do not raise an error if the repository already exists.
+                Ignored if ``create_pr=True``. Defaults to False.
+            revision (str, optional): The git branch to commit to. Defaults to the head of the 'main' branch.
+            create_pr (bool, optional): Whether to create a Pull Request with the upload or directly commit. Defaults to False.
+            tags (list[str], optional): A list of tags to add to the model card. Defaults to None.
+
+        Returns:
+            str: URL of the commit or pull request (if create_pr=True)
+        """
         api = HfApi(token=token)
         repo_url = api.create_repo(
             repo_id=repo_id,
