diff --git a/‎.gitignore‎
Lines changed: 1 addition & 0 deletions b/‎.gitignore‎
diff --git a/‎docs/package_reference/cross_encoder.md‎
Lines changed: 1 addition & 0 deletions b/‎docs/package_reference/cross_encoder.md‎
diff --git a/‎docs/package_reference/datasets.md‎
Lines changed: 0 additions & 5 deletions b/‎docs/package_reference/datasets.md‎
diff --git a/‎docs/package_reference/models.md‎
Lines changed: 1 addition & 0 deletions b/‎docs/package_reference/models.md‎
diff --git a/‎docs/pretrained_cross-encoders.md‎
Lines changed: 50 additions & 14 deletions b/‎docs/pretrained_cross-encoders.md‎
diff --git a/‎docs/training/overview.md‎
Lines changed: 7 additions & 13 deletions b/‎docs/training/overview.md‎
diff --git a/‎examples/applications/computing-embeddings/computing_embeddings.py‎
Lines changed: 1 addition & 1 deletion b/‎examples/applications/computing-embeddings/computing_embeddings.py‎
diff --git a/‎examples/applications/cross-encoder/cross-encoder_reranking.py‎
Lines changed: 1 addition & 1 deletion b/‎examples/applications/cross-encoder/cross-encoder_reranking.py‎
diff --git a/‎examples/applications/cross-encoder/cross-encoder_usage.py‎
Lines changed: 1 addition & 1 deletion b/‎examples/applications/cross-encoder/cross-encoder_usage.py‎
diff --git a/‎examples/applications/information-retrieval/README.md‎
Lines changed: 4 additions & 4 deletions b/‎examples/applications/information-retrieval/README.md‎
@@ -1,4 +1,5 @@
 .idea
+.vscode
 *.pyc
 *.gz
 *.tsv
 
@@ -10,6 +10,7 @@ For an introduction to Cross-Encoders, see [Cross-Encoders](../usage/cross-encod
 CrossEncoder have their own evaluation classes, that are in `sentence_transformers.cross_encoder.evaluation`.
 
 ```eval_rst
+.. autoclass:: sentence_transformers.cross_encoder.evaluation.CEBinaryAccuracyEvaluator
 .. autoclass:: sentence_transformers.cross_encoder.evaluation.CEBinaryClassificationEvaluator
 .. autoclass:: sentence_transformers.cross_encoder.evaluation.CECorrelationEvaluator
 .. autoclass:: sentence_transformers.cross_encoder.evaluation.CESoftmaxAccuracyEvaluator
 
@@ -2,11 +2,6 @@
 `sentence_transformers.datasets` contains classes to organize your training input examples.
 
 
-## SentencesDataset
-`SentencesDataset` is the main class to store training classes for training. For details, see [training overview](../training/overview.md). 
-```eval_rst
-.. autoclass:: sentence_transformers.datasets.SentencesDataset
-```
 
 ## ParallelSentencesDataset
 `ParallelSentencesDataset` is used for multilingual training. For details, see [multilingual training](../../examples/training/multilingual/README.md).
 
@@ -10,6 +10,7 @@
 
 ## Further Classes
 ```eval_rst
+.. autoclass:: sentence_transformers.models.Asym
 .. autoclass:: sentence_transformers.models.BoW
 .. autoclass:: sentence_transformers.models.CNN
 .. autoclass:: sentence_transformers.models.LSTM
 
@@ -7,41 +7,77 @@ This page lists available **pretrained Cross-Encoders**. Cross-Encoders require
 
 ## STSbenchmark
 The following models can be used like this:
-```
+```python
 from sentence_transformers import CrossEncoder
 model = CrossEncoder('model_name')
 scores = model.predict([('Sent A1', 'Sent B1'), ('Sent A2', 'Sent B2')])
 ```
 
 They return a score  0...1 indicating the semantic similarity of the given sentence pair.
-- **sentence-transformers/ce-distilroberta-base-stsb** - STSbenchmark test performance: 87.92
-- **sentence-transformers/ce-roberta-base-stsb** - STSbenchmark test performance: 90.17
-- **sentence-transformers/ce-roberta-large-stsb** - STSbenchmark test performance: 91.47 
+- **cross-encoder/stsb-TinyBERT-L-4** - STSbenchmark test performance: 85.50
+- **cross-encoder/stsb-distilroberta-base** - STSbenchmark test performance: 87.92
+- **cross-encoder/stsb-roberta-base** - STSbenchmark test performance: 90.17
+- **cross-encoder/stsb-roberta-large** - STSbenchmark test performance: 91.47 
 
 ## Quora Duplicate Questions
 These models have been trained on the [Quora duplicate questions dataset](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs). They can be used like the STSb models and give a score 0...1 indicating the probability that two questions are duplicate questions.
 
-- **sentence-transformers/ce-distilroberta-base-quora** - Average Precision dev set: 87.48
-- **sentence-transformers/ce-roberta-base-quora** - Average Precision dev set: 87.80
-- **sentence-transformers/ce-roberta-large-quora** - Average Precision dev set: 87.91
+- **cross-encoder/quora-distilroberta-base** - Average Precision dev set: 87.48
+- **cross-encoder/quora-roberta-base** - Average Precision dev set: 87.80
+- **cross-encoder/quora-roberta-large** - Average Precision dev set: 87.91
 
+Note: These models don't work for question similarity. The questions *How to learn Java* and *How to learn Python* will get a low score, as they are not duplicates. For question similarity, the respective bi-encoder trained on the Quora dataset yields much more meaningful results.
 
 ## Information Retrieval
 
 The following models are trained for Information Retrieval: Given a query (like key-words or a question), and a paragraph, can the query be answered by the paragraph? The models have been trained on MS Marco, a large dataset with real-user queries from Bing search engine.
 
 The models can be used like this:
-```
+```python
 from sentence_transformers import CrossEncoder
 model = CrossEncoder('model_name', max_length=512)
-scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2')])
+scores = model.predict([('Query1', 'Paragraph1'), ('Query2', 'Paragraph2')])
+
+#For Example
+scores = model.predict([('How many people live in Berlin?', 'Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.'), 
+                        ('What is the size of New York?', 'New York City is famous for the Metropolitan Museum of Art.')])
 ```
 
 This returns a score 0...1 indicating if the paragraph is relevant for a given query.
 
-- **sentence-transformers/ce-ms-marco-TinyBERT-L-2** - MRR@10 on MS Marco Dev Set: 30.15
-- **sentence-transformers/ce-ms-marco-TinyBERT-L-4** -  MRR@10 on MS Marco Dev Set: 34.50
-- **sentence-transformers/ce-ms-marco-TinyBERT-L-6** - MRR@10 on MS Marco Dev Set: 36.13
-- **sentence-transformers/ce-ms-marco-electra-base** - MRR@10 on MS Marco Dev Set: 36.41
 
-For details on the usage, see [Applications - Information Retrieval](../examples/applications/information-retrieval/README.md)
+For details on the usage, see [Applications - Information Retrieval](../examples/applications/information-retrieval/README.md)
+
+
+### MS MARCO
+[MS MARCO Passage Retrieval](https://github.com/microsoft/MSMARCO-Passage-Ranking) is a large dataset with real user queries from Bing search engine with annotated relevant text passages.
+- **cross-encoder/ms-marco-TinyBERT-L-2** - MRR@10 on MS Marco Dev Set: 30.15
+- **cross-encoder/ms-marco-TinyBERT-L-4** - MRR@10 on MS Marco Dev Set: 34.50
+- **cross-encoder/ms-marco-TinyBERT-L-6** - MRR@10 on MS Marco Dev Set: 36.13
+- **cross-encoder/ms-marco-electra-base** - MRR@10 on MS Marco Dev Set: 36.41
+
+### SQuAD (QNLI)
+
+QNLI is based on the [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) and was introduced by the [GLUE Benchmark](https://arxiv.org/abs/1804.07461). Given a passage from Wikipedia, annotators created questions that are answerable by that passage.
+
+- **cross-encoder/qnli-distilroberta-base** - Accuracy on QNLI dev set: 90.96
+- **cross-encoder/qnli-electra-base** - Accuracy on QNLI dev set: 93.21
+
+
+
+## NLI
+Given two sentences, do they contradict each other, does one entail the other, or are they neutral? The following models were trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets.
+- **cross-encoder/nli-distilroberta-base** - Accuracy on MNLI mismatched set: 83.98
+- **cross-encoder/nli-roberta-base** - Accuracy on MNLI mismatched set: 87.47
+- **cross-encoder/nli-deberta-base** - Accuracy on MNLI mismatched set: 88.08
+
+```python
+from sentence_transformers import CrossEncoder
+model = CrossEncoder('model_name')
+scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])
+
+#Convert scores to labels
+label_mapping = ['contradiction', 'entailment', 'neutral']
+labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
+```
+
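The label conversion in the NLI snippet above picks, for each sentence pair, the index of the highest of the three class scores. A minimal sketch of that step in plain Python, assuming hypothetical score rows (the values are made up for illustration and are not model output) and the label order given in the snippet:

```python
# Label order as given in the documentation snippet above.
label_mapping = ['contradiction', 'entailment', 'neutral']

# Hypothetical scores, one row per sentence pair (illustrative values only).
scores = [
    [-2.1, 4.3, 0.2],
    [3.7, -1.5, 0.9],
]


def argmax(row):
    """Return the index of the largest value in a sequence."""
    return max(range(len(row)), key=lambda i: row[i])


# Map each row to the label with the highest score.
labels = [label_mapping[argmax(row)] for row in scores]
print(labels)
```

With real model output, `scores` would be the array returned by `model.predict`, and `scores.argmax(axis=1)` does the same job in one call.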
@@ -57,19 +57,16 @@ For all available building blocks see [» Models Package Reference](../package_r
  To represent our training data, we use the `InputExample` class to store training examples. As parameters, it accepts texts, which is a list of strings representing our pairs (or triplets). Further, we can also pass a label (either float or int). The following shows a simple example, where we pass text pairs to `InputExample` together with a label indicating the semantic similarity.
 
  ```python
-from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample
+from sentence_transformers import SentenceTransformer, InputExample
 from torch.utils.data import DataLoader
 
 model = SentenceTransformer('distilbert-base-nli-mean-tokens')
 train_examples = [InputExample(texts=['My first sentence', 'My second sentence'], label=0.8),
     InputExample(texts=['Another pair', 'Unrelated sentence'], label=0.3)]
-train_dataset = SentencesDataset(train_examples, model)
-train_dataloader = DataLoader(train_dataset, shuffle=True, batch_size=16)
+train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
  ```
 
-To prepare the examples for training, we provide a custom `SentencesDataset`, which is a [custom PyTorch dataset](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html). It accepts as parameters the list with `InputExamples` and the `SentenceTransformer` model.
-
-We can wrap `SentencesDataset` with the standard PyTorch `DataLoader`, which produces for example batches and allows us to shuffle the data for training.
+We wrap our `train_examples` with the standard PyTorch `DataLoader`, which shuffles our data and produces batches of the given size.
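Conceptually, the `DataLoader` step shuffles the list of examples and yields batches of at most `batch_size` items. A rough plain-Python sketch of that behaviour, assuming a hypothetical helper `iter_batches` and plain tuples standing in for `InputExample` objects (this is not PyTorch's implementation, and the third example pair is invented for illustration):

```python
import random


def iter_batches(examples, batch_size, shuffle=True, seed=None):
    """Yield lists of at most `batch_size` examples, optionally shuffled."""
    indices = list(range(len(examples)))
    if shuffle:
        # A dedicated Random instance keeps the shuffle reproducible per seed.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [examples[i] for i in indices[start:start + batch_size]]


# (texts, label) tuples standing in for InputExample objects.
train_examples = [
    (['My first sentence', 'My second sentence'], 0.8),
    (['Another pair', 'Unrelated sentence'], 0.3),
    (['A third pair', 'Its counterpart'], 0.5),
]

batches = list(iter_batches(train_examples, batch_size=2, seed=0))
print([len(batch) for batch in batches])
```

The last batch is smaller when the number of examples is not a multiple of the batch size.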
 
 
 
@@ -92,7 +89,7 @@ For each sentence pair, we pass sentence A and sentence B through our network wh
 
 A minimal example with `CosineSimilarityLoss` is the following:
 ```python
-from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses
+from sentence_transformers import SentenceTransformer, InputExample, losses
 from torch.utils.data import DataLoader
 
 #Define the model. Either from scratch or by loading a pre-trained model
@@ -103,8 +100,7 @@ train_examples = [InputExample(texts=['My first sentence', 'My second sentence']
     InputExample(texts=['Another pair', 'Unrelated sentence'], label=0.3)]
 
 #Define your train dataset, the dataloader and the train loss
-train_dataset = SentencesDataset(train_examples, model)
-train_dataloader = DataLoader(train_dataset, shuffle=True, batch_size=16)
+train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
 train_loss = losses.CosineSimilarityLoss(model)
 
 #Tune the model
@@ -142,7 +138,7 @@ model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_st
 
 
 ### Continue Training on Other Data
-[training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training_transformers/training_stsbenchmark_continue_training.py) shows an example where training on a fine-tuned model is continued. In that example, we use a sentence transformer model that was first fine-tuned on the NLI dataset and then continue training on the training data from the STS benchmark.
+[training_stsbenchmark_continue_training.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/sts/training_stsbenchmark_continue_training.py) shows an example where training on a fine-tuned model is continued. In that example, we use a sentence transformer model that was first fine-tuned on the NLI dataset and then continue training on the training data from the STS benchmark.
 
 First, we load a pre-trained model from the server:
 ```python
@@ -152,9 +148,7 @@ model = SentenceTransformer('bert-base-nli-mean-tokens')
 
 The next steps are as before. We specify training and dev data:
 ```python
-sts_reader = STSBenchmarkDataReader('datasets/stsbenchmark', normalize_scores=True)
-train_data = SentencesDataset(sts_reader.get_examples('sts-train.csv'), model)
-train_dataloader = DataLoader(train_data, shuffle=True, batch_size=train_batch_size)
+train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=train_batch_size)
 train_loss = losses.CosineSimilarityLoss(model=model)
 
 evaluator = EmbeddingSimilarityEvaluator.from_input_examples(sts_reader.get_examples('sts-dev.csv'))
 
@@ -19,7 +19,7 @@
 
 
 # Load pre-trained Sentence Transformer Model (averaged GloVe word embeddings). It will be downloaded automatically
-model = SentenceTransformer('paraphrase-distilroberta-base-v1')
+model = SentenceTransformer('average_word_embeddings_glove.6B.300d')
 
 # Embed a list of sentences
 sentences = ['This framework generates embeddings for each input sentence',
 
@@ -22,7 +22,7 @@
 
 # To refine the results, we use a CrossEncoder. A CrossEncoder gets both inputs (input_question, retrieved_question)
 # and outputs a score 0...1 indicating the similarity.
-cross_encoder_model = CrossEncoder('sentence-transformers/ce-roberta-base-stsb')
+cross_encoder_model = CrossEncoder('cross-encoder/stsb-roberta-base')
 
 # Dataset we want to use
 url = "http://qim.fs.quoracdn.net/quora_duplicate_questions.tsv"
 
@@ -7,7 +7,7 @@
 import numpy as np
 
 # Pre-trained cross encoder
-model = CrossEncoder('sentence-transformers/ce-distilroberta-base-stsb')
+model = CrossEncoder('cross-encoder/stsb-distilroberta-base')
 
 # We want to compute the similarity between the query sentence
 query = 'A man is eating pasta.'
 
@@ -78,10 +78,10 @@ In the following table, we provide various pre-trained Cross-Encoders together w
 
 | Model-Name        | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev)  | Docs / Sec (BertTokenizerFast) | Docs / Sec |
 | ------------- |:-------------| -----| --- | --- |
-| sentence-transformers/ce-ms-marco-TinyBERT-L-2  | 67.43 | 30.15  | 9000 | 780
-| sentence-transformers/ce-ms-marco-TinyBERT-L-4  | 68.09 | 34.50  | 2900 | 760
-| sentence-transformers/ce-ms-marco-TinyBERT-L-6 |  69.57 | 36.13  | 680 | 660
-| sentence-transformers/ce-ms-marco-electra-base | 71.99 | 36.41 | 340 | 340
+| cross-encoder/ms-marco-TinyBERT-L-2  | 67.43 | 30.15  | 9000 | 780
+| cross-encoder/ms-marco-TinyBERT-L-4  | 68.09 | 34.50  | 2900 | 760
+| cross-encoder/ms-marco-TinyBERT-L-6 |  69.57 | 36.13  | 680 | 660
+| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 | 340
 | *Other models* | | | |
 | nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 | 760
 | nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 | 340|
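The MRR@10 column is a mean reciprocal rank with a cutoff of 10. For each query, take 1/rank of the first relevant passage among the top 10 results (0 if none appears), then average over all queries. A minimal sketch, assuming hypothetical ranked lists and relevance sets; the function and variable names are illustrative and not part of sentence-transformers. The table values appear to be this quantity on a 0-100 scale.

```python
def mrr_at_k(ranked_ids_per_query, relevant_ids_per_query, k=10):
    """Mean reciprocal rank of the first relevant hit within the top k."""
    total = 0.0
    for ranked_ids, relevant_ids in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, doc_id in enumerate(ranked_ids[:k], start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break  # only the first relevant passage counts
    return total / len(ranked_ids_per_query)


# Hypothetical rankings for three queries (passage ids ordered by score).
ranked = [['d3', 'd1', 'd7'], ['d2', 'd9', 'd4'], ['d5', 'd6', 'd8']]
# Hypothetical relevance judgments; the third query has no relevant hit.
relevant = [{'d1'}, {'d2'}, {'d0'}]

score = mrr_at_k(ranked, relevant)
print(round(score, 4))  # (1/2 + 1/1 + 0) / 3 = 0.5
```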