Hi,
Thank you for the excellent paper and for providing the code! I have been trying to reproduce the results from Figure 8 of the paper using LLaMA-7B and LLaMA-13B on the TriviaQA dataset, which I downloaded using the command in the README.
However, I get the following values:
7B:
0 docs: 50.8, 1 doc: 54.1, 2 docs: 55.9, 3 docs: 56.4
13B:
0 docs: 57.8, 1 doc: 58.8, 2 docs: 59.8, 3 docs: 60.4
The numbers for 1-3 documents are close to those in the paper, but there is a ~3% gap for 0 documents. Can you please provide any insight into what might explain this discrepancy?
