Skip to content

Conversation

Samoed
Copy link
Member

@Samoed Samoed commented Sep 13, 2025

Ref #2714

  • Aligned task._evaluate_subset with each other
  • Aligned Evaluator.__call__, but classification and regression task return 2 items instead of 1

@Samoed Samoed added the v2 Issues and PRs related to `v2` branch label Sep 13, 2025
# Conflicts:
#	mteb/abstasks/AbsTask.py
#	mteb/models/model_implementations/listconranker.py
#	pyproject.toml
@Samoed Samoed changed the title [v2] Add start mypy [v2] Add start typecheking Sep 13, 2025
@Samoed Samoed changed the title [v2] Add start typecheking [v2] Start type checking Sep 13, 2025
@Samoed Samoed requested a review from Copilot September 14, 2025 19:51
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces initial type checking setup for the MTEB project using mypy as referenced in issue #2714. The changes focus on establishing a basic type checking configuration and addressing fundamental typing issues across the codebase.

  • Adds mypy configuration and type stub dependencies for comprehensive type checking
  • Updates function signatures to include proper type annotations and return types
  • Refactors descriptive statistics type hierarchy to support better type safety

Reviewed Changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pyproject.toml Adds mypy dependency group and configuration with module overrides
mteb/types/statistics.py Refactors statistics type hierarchy with new base class
mteb/create_dataloaders.py Updates function signatures and removes **kwargs for explicit parameters
mteb/abstasks/*.py Updates abstract task classes to use proper type annotations
mteb/_evaluators/*.py Adds return type annotations to evaluator methods
mteb/models/model_implementations/listconranker.py Removes optional parameter type
mteb/types/_encoder_io.py Adds type ignore comment for multiple inheritance

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

# Conflicts:
#	mteb/_evaluators/clustering_evaluator.py
#	mteb/abstasks/AbsTaskAnyClustering.py
#	mteb/abstasks/AbsTaskMultilabelClassification.py
encode_kwargs: dict[str, Any],
test_cache: np.ndarray | None = None,
) -> tuple[dict[str, float], Any]:
) -> tuple[dict[str, float], np.ndarray | None]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a docstring would be good here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

{include-group = "lint"},
{include-group = "test"},
]
typing = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us add a makefile command as well

[tool.mypy]
plugins = ['pydantic.mypy']

[[tool.mypy.overrides]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some comments to these overrides - not quite sure what they do

task_metadata: TaskMetadata,
input_column: str | None = None,
**dataloader_kwargs: dict[str, Any],
batch_size: int = 32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not dataloader_kwargs?

Copy link
Member Author

@Samoed Samoed Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need them, because only batch_size is helpful for us, I think https://docs.pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about num_workers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we require this, because we don't do something tricky, but I will test this

# Conflicts:
#	mteb/abstasks/AbsTaskRetrieval.py
#	mteb/abstasks/AbsTaskTextRegression.py
#	mteb/abstasks/aggregated_task.py
@Samoed Samoed merged commit 01a86e9 into v2.0.0 Sep 19, 2025
9 checks passed
@Samoed Samoed deleted the add_start_mypy branch September 19, 2025 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v2 Issues and PRs related to `v2` branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants