FacerAin
Contributor

@FacerAin FacerAin commented Sep 14, 2025

Hi! 😁

I'm using ColPali models to evaluate ViDoRe benchmarks and encountered an AttributeError. The issue occurs because ColPaliEngineWrapper.similarity() tries to use self.processor_kwargs, which is not defined in the class. I've temporarily resolved this by removing the **self.processor_kwargs reference.

Traceback (most recent call last):
  File "/KoVidore-benchmark/kovidore/evaluate.py", line 495, in run_benchmark
    results = evaluation.run(model, output_folder=f"results/{model_name}", encode_kwargs = {'batch_size': batch_size})
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/evaluation/MTEB.py", line 672, in run
    raise e
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/evaluation/MTEB.py", line 625, in run
    results, tick, tock = self._run_eval(
                          ^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/evaluation/MTEB.py", line 307, in _run_eval
    results = task.evaluate(
              ^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 329, in evaluate
    scores[hf_subset] = self._evaluate_subset(
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 338, in _evaluate_subset
    results = retriever(corpus, queries)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 317, in __call__
    return self.retriever.search(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 231, in search
    cos_scores = score_function(query_embeddings, sub_corpus_embeddings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/KoVidore-benchmark/.venv/lib/python3.12/site-packages/mteb/models/colpali_models.py", line 134, in similarity
    return self.processor.score(a, b, **self.processor_kwargs)
                                        ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ColPaliWrapper' object has no attribute 'processor_kwargs'

This issue likely stems from this discussion.
The ColPali Engine's score_multi_vector method accepts optional arguments such as batch_size and device. However, should ColPaliEngineWrapper allow these parameters to be passed through to the similarity scoring method? I believe they are conceptually different from the batch size and device used for embedding generation, so I'm not sure where these arguments should be received and passed through. Any ideas or suggestions would be welcome!
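
To make the question concrete, here is a minimal sketch of one possible arrangement, where scoring options are set once on the wrapper and kept apart from the encode-time batch size. Everything in it is hypothetical: `score_stub`, `TwoStageSketch`, and the `score_kwargs` parameter are illustrative stand-ins, not the mteb or ColPali Engine API.

```python
from typing import Optional


def score_stub(a: list, b: list, batch_size: int = 128, device: Optional[str] = None) -> dict:
    """Hypothetical stand-in for a scoring call; reports the settings it was given."""
    return {"pairs": len(a) * len(b), "batch_size": batch_size, "device": device}


class TwoStageSketch:
    """Hypothetical wrapper (not the mteb class): encode and scoring settings live apart."""

    def __init__(self, score_kwargs: Optional[dict] = None):
        # Scoring options are fixed at construction time, because the evaluator
        # in the traceback calls the score function with only two positional arguments.
        self.score_kwargs: dict = dict(score_kwargs or {})

    def encode(self, items: list, batch_size: int = 32) -> list:
        # batch_size here only controls how inputs are chunked for embedding.
        return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]

    def similarity(self, a: list, b: list) -> dict:
        # Scoring uses its own options, independent of the encode batch size.
        return score_stub(a, b, **self.score_kwargs)
```

With this layout, the encode batch size passed per call never leaks into scoring, and a wrapper built without `score_kwargs` simply falls back to the scorer's own defaults.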

Thanks!!

@Samoed
Member

Samoed commented Sep 15, 2025

Hm, I'm not sure either. I don't think we need to pass additional parameters, because scoring should be fast, but maybe I'm wrong.

@KennethEnevoldsen
Contributor

KennethEnevoldsen commented Sep 18, 2025

Sorry for the slow reply here, I had a few things on my plate.

Hmm, it seems like we should just define self.processor_kwargs as {"device": self.device} when it is not specified. I don't see anything notable you could pass here that would be problematic (but do let me know if I am missing something).
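
A rough sketch of that defaulting behaviour, under stated assumptions: `StubProcessor` and `WrapperSketch` are hypothetical stand-ins written for illustration, and the constructor signature is assumed rather than taken from the mteb source. Only the `self.processor.score(a, b, **self.processor_kwargs)` call mirrors the line in the traceback.

```python
from typing import Any, Optional


class StubProcessor:
    """Hypothetical stand-in for the ColPali processor; echoes the kwargs it receives."""

    def score(self, a: list, b: list, **kwargs: Any) -> dict:
        return {"n_queries": len(a), "n_docs": len(b), "kwargs": kwargs}


class WrapperSketch:
    """Hypothetical wrapper, not the actual mteb ColPaliEngineWrapper."""

    def __init__(self, device: str = "cpu", processor_kwargs: Optional[dict] = None):
        self.device = device
        self.processor = StubProcessor()
        # Fall back to the wrapper's device when no scoring kwargs are given,
        # so similarity() never reads an undefined attribute.
        self.processor_kwargs = (
            processor_kwargs if processor_kwargs is not None else {"device": self.device}
        )

    def similarity(self, a: list, b: list) -> dict:
        # Same call shape as the line that raised the AttributeError.
        return self.processor.score(a, b, **self.processor_kwargs)
```

Because the attribute is always assigned in `__init__`, the call in `similarity()` works whether or not the caller supplied any scoring options.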
