Commit fdf7c38

new pretrained cross encoders

1 parent 0eb3826 commit fdf7c38
File tree

1 file changed: +23 −4 lines changed

docs/pretrained_cross-encoders.md

Lines changed: 23 additions & 4 deletions
The models can be used like this:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name', max_length=512)
scores = model.predict([('Query1', 'Paragraph1'), ('Query2', 'Paragraph2')])

# For example:
scores = model.predict([('How many people live in Berlin?', 'Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.'),
                        ('What is the size of New York?', 'New York City is famous for the Metropolitan Museum of Art.')])
```

This returns a score between 0 and 1 indicating how relevant the paragraph is for the given query.

For details on the usage, see [Applications - Information Retrieval](../examples/applications/information-retrieval/README.md)
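These relevance scores make cross-encoders a natural fit for re-ranking: pair the query with each candidate passage, score all pairs, and sort by score. A minimal sketch of that step; the `rerank` helper and the `fake_predict` stub are illustrative (in practice you would pass `model.predict` from the snippet above):

```python
def rerank(query, passages, predict_fn, top_k=None):
    """Sort passages by cross-encoder relevance score, highest first."""
    pairs = [(query, passage) for passage in passages]
    scores = predict_fn(pairs)  # one relevance score per (query, passage) pair
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k] if top_k else ranked

# Stub scorer for illustration only; a real run would use model.predict.
def fake_predict(pairs):
    return [0.9 if 'Berlin' in passage else 0.1 for _, passage in pairs]

ranked = rerank('How many people live in Berlin?',
                ['New York City is famous for the Metropolitan Museum of Art.',
                 'Berlin had a population of 3,520,031 registered inhabitants.'],
                fake_predict)
print(ranked[0][0])  # the Berlin passage ranks first
```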
### MS MARCO
[MS MARCO Passage Retrieval](https://github.com/microsoft/MSMARCO-Passage-Ranking) is a large dataset with real user queries from the Bing search engine and annotated relevant text passages.
- **cross-encoder/ms-marco-TinyBERT-L-2** - MRR@10 on MS Marco Dev Set: 30.15
- **cross-encoder/ms-marco-TinyBERT-L-4** - MRR@10 on MS Marco Dev Set: 34.50
- **cross-encoder/ms-marco-TinyBERT-L-6** - MRR@10 on MS Marco Dev Set: 36.13
- **cross-encoder/ms-marco-electra-base** - MRR@10 on MS Marco Dev Set: 36.41
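The MRR@10 metric quoted above is the mean reciprocal rank: for each query, take 1/rank of the first relevant passage among the top 10 results (0 if none appears), then average over all queries. A hypothetical sketch of the computation, not the official MS MARCO evaluation script:

```python
def mrr_at_10(ranked_relevance):
    """ranked_relevance: for each query, a list of 0/1 relevance flags in ranked order."""
    total = 0.0
    for flags in ranked_relevance:
        for rank, relevant in enumerate(flags[:10], start=1):
            if relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_relevance)

# First query: relevant passage at rank 2; second query: at rank 1
print(mrr_at_10([[0, 1, 0], [1, 0, 0]]))  # (1/2 + 1) / 2 = 0.75
```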
### SQuAD (QNLI)
QNLI is based on the [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) and was introduced by the [GLUE Benchmark](https://arxiv.org/abs/1804.07461). Given a passage from Wikipedia, annotators created questions that are answerable by that passage.

- **cross-encoder/qnli-distilroberta-base** - Accuracy on QNLI dev set: 90.96
- **cross-encoder/qnli-electra-base** - Accuracy on QNLI dev set: 93.21

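These QNLI models likewise emit one score per (question, passage) pair; a simple way to turn scores into answerable/not-answerable decisions is a fixed cutoff. The 0.5 threshold below is an assumption and should be tuned on dev data:

```python
def is_answerable(scores, threshold=0.5):
    """Map per-pair QNLI scores to boolean 'passage answers the question' decisions."""
    # threshold is an assumed default, not a value from the model card
    return [score > threshold for score in scores]

print(is_answerable([0.93, 0.07]))  # [True, False]
```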
## NLI
Given two sentences, are they contradicting each other, does one entail the other, or are they neutral? The following models were trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets.
- **cross-encoder/nli-distilroberta-base** - Accuracy on MNLI mismatched set: 83.98
- **cross-encoder/nli-roberta-base** - Accuracy on MNLI mismatched set: 87.47
- **cross-encoder/nli-deberta-base** - Accuracy on MNLI mismatched set: 88.08

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')
scores = model.predict([('A man is eating pizza', 'A man eats something'),
                        ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])

# Convert scores to labels
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
```
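The `argmax` above picks the most likely class directly from the raw logits; if probability-like values are useful, a row-wise softmax gives them. A sketch using only numpy, assuming `scores` has shape `(n_pairs, 3)` as in the block above (the logit values here are made up for illustration):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

label_mapping = ['contradiction', 'entailment', 'neutral']
logits = np.array([[-3.2, 4.1, -0.5]])  # illustrative logits for one sentence pair
probs = softmax(logits)
print(label_mapping[int(probs[0].argmax())])  # entailment
```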
