Merged
38 commits
f092b02
updata typing for multi risk and check inputs
allglc Sep 29, 2025
1b0275f
Apply suggestion from @Copilot
Valentin-Laurent Oct 6, 2025
8fa0a75
FIX - error from copilot PR + error msg
Valentin-Laurent Oct 6, 2025
82fed0a
WIP - pair programming
Valentin-Laurent Oct 6, 2025
0f46fea
test_check_risks_targets_same_len working
allglc Oct 7, 2025
0280ba1
fix format
allglc Oct 7, 2025
3fd973f
update test with correct function name and doc
allglc Oct 7, 2025
cb272b1
__init__ and _set_best_predict_param_choice handle multi risk
allglc Oct 7, 2025
6eefd86
update and rename _get_risks_and_effective_sample_sizes_per_param for…
allglc Oct 8, 2025
4092996
update _set_best_predict_param for multi risk
allglc Oct 8, 2025
ed0e02a
fix alpha
allglc Oct 8, 2025
ec8d406
ltt multi risk first iteration: use max p value
allglc Oct 8, 2025
95de147
self._risk is now always a list
allglc Oct 9, 2025
6ab8751
dirty fix for seemingly working version
allglc Oct 9, 2025
1aaf124
lists of len 1 = mono risk
allglc Oct 10, 2025
c7fd94e
add and update tests
allglc Oct 10, 2025
0d17ab7
slightly clarify testing code
allglc Oct 10, 2025
776ea81
fix linting, type-check
allglc Oct 10, 2025
534007e
code clarified - now need to merge binary case and original case
allglc Oct 10, 2025
de6c325
test failed -> modify test! updated to new behavior for binary: r_hat…
allglc Oct 10, 2025
fe5ee26
simplify ltt for binary
allglc Oct 13, 2025
f91b534
keep returning lists of lists for ltt
allglc Oct 13, 2025
c388ed4
small improvements
allglc Oct 16, 2025
189940d
add typng and a unit test
allglc Oct 16, 2025
92e1a5c
update ltt_procedure and its calls and tests for multi risk
allglc Oct 17, 2025
b52bfa6
DOC - Document tests philosophy, update and improve CONTRIBUTING.rst …
Valentin-Laurent Oct 13, 2025
f57cc8d
DOC: improve risk-control related doc and docstrings
Valentin-Laurent Jun 19, 2025
263fc04
DOC: improve risk-control related doc and docstrings
Valentin-Laurent Jul 8, 2025
e3ad648
DOC: improve risk-control related doc and docstrings
Valentin-Laurent Jul 8, 2025
194c4c6
FIX typo
Valentin-Laurent Sep 3, 2025
00ac012
DOC - Fix typos and typing, remove useless extensive docstrings from …
Valentin-Laurent Oct 14, 2025
ff9aaf4
Apply suggestion from @Copilot
Valentin-Laurent Oct 14, 2025
8c98979
Apply suggestion from @Copilot
Valentin-Laurent Oct 14, 2025
9f73bd5
ltt now fails when bad shape of inputs for multi risk
allglc Oct 20, 2025
ba4072e
add unit tests ltt multi risk
allglc Oct 20, 2025
0f53586
docstring ltt
allglc Oct 21, 2025
9078d06
linting
allglc Oct 21, 2025
ef5bd5d
ensure compatibility with python<3.10
allglc Oct 21, 2025
74 changes: 35 additions & 39 deletions CONTRIBUTING.rst
@@ -5,7 +5,11 @@ Contribution guidelines
 What to work on?
 ----------------

-You are welcome to propose and contribute new ideas.
+Issues tagged "Good first issue" are perfect for open-source beginners.
+
+For the more experienced, issues tagged "Contributors welcome" are recommended if you want to help.
+
+You are also welcome to propose and contribute to new ideas.
 We encourage you to `open an issue <https://github.com/scikit-learn-contrib/MAPIE/issues>`_ so that we can align on the work to be done.
 It is generally a good idea to have a quick discussion before opening a pull request that is potentially out-of-scope.

@@ -43,73 +47,65 @@ Finally, install ``mapie`` in development mode:

     $ pip install -e .

+Implementing your change
+------------------------------------------

-Documenting your change
------------------------
-
-If you're adding a public class or function, then you'll need to add a docstring with a doctest. We follow the `numpy docstring convention <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html>`_, so please do too.
-Any estimator should follow the `scikit-learn API <https://scikit-learn.org/stable/developers/develop.html>`_, so please follow these guidelines.
-
-In order to build the documentation locally, you first need to create a different virtual environment than the one used for development, and then install some dependencies using ``pip`` with the following commands:
+The linter must pass:

 .. code-block:: sh

-    $ pip install -r requirements.doc.txt
-    $ pip install -e .
+    $ make lint

-Finally, once dependencies are installed, you can build the documentation locally by running:
+The typing must pass.

 .. code-block:: sh

-    $ make clean-doc
-    $ make doc
+    $ make type-check


-Updating changelog
-------------------
-
-You can make your contribution visible by:
+Testing your change
+---------------------

-1. Adding your name to the Contributors section of `AUTHORS.rst <https://github.com/scikit-learn-contrib/MAPIE/blob/master/AUTHORS.rst>`_
-2. If your change is user-facing (bug fix, feature, ...), adding a line to describe it in `HISTORY.rst <https://github.com/scikit-learn-contrib/MAPIE/blob/master/HISTORY.rst>`_
+See `the tests README.md <https://github.com/scikit-learn-contrib/MAPIE/blob/master/mapie/tests/README.md>`_ for guidance.
+
+The tests absolutely have to pass.

-Testing
--------
+.. code-block:: sh

-Linting
-^^^^^^^
+    $ make tests

-These tests absolutely have to pass.
+The coverage should absolutely be 100%.

 .. code-block:: sh

-    $ make lint
+    $ make coverage

+Documenting your change
+-----------------------

-Static typing
-^^^^^^^^^^^^^
+If you're adding a public class or function, then you'll need to add a docstring with a doctest. We follow the `numpy docstring convention <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html>`_, so please do too.
+Any estimator should follow the `scikit-learn API <https://scikit-learn.org/stable/developers/develop.html>`_, so please follow these guidelines.

-These tests absolutely have to pass.
+In order to build the documentation locally, you first need to create a different virtual environment than the one used for development, and then install some dependencies using ``pip`` with the following commands:

 .. code-block:: sh

-    $ make type-check


-Unit tests
-^^^^^^^^^^
+    $ pip install -r requirements.doc.txt
+    $ pip install -e .

-These tests absolutely have to pass.
+Finally, once dependencies are installed, you can build the documentation locally by running:

 .. code-block:: sh

-    $ make tests
+    $ make clean-doc
+    $ make doc

-Coverage
-^^^^^^^^
-
-The coverage should absolutely be 100%.
+Updating changelog
+------------------

-.. code-block:: sh
+You can make your contribution visible by:

-    $ make coverage
+1. Adding your name to the Contributors section of `AUTHORS.rst <https://github.com/scikit-learn-contrib/MAPIE/blob/master/AUTHORS.rst>`_
+2. If your change is user-facing (bug fix, feature, ...), adding a line to describe it in `HISTORY.rst <https://github.com/scikit-learn-contrib/MAPIE/blob/master/HISTORY.rst>`_
2 changes: 1 addition & 1 deletion doc/theoretical_description_risk_control.rst
@@ -138,7 +138,7 @@ Let's first give the settings and the notations of the method:
 - Let :math:`R` be the risk associated to a set-valued predictor:

 .. math::
-    R(\mathcal{T}_{\hat{\lambda}}) = \mathbb{E}[L(Y, \mathcal{T}_{\lambda}(X))]
+    R(\mathcal{T}_{\lambda}) = \mathbb{E}[L(Y, \mathcal{T}_{\lambda}(X))]

 The goal of the method is to compute an Upper Confidence Bound (UCB) :math:`\hat{R}^+(\lambda)` of :math:`R(\lambda)` and then to find
 :math:`\hat{\lambda}` as follows:
41 changes: 29 additions & 12 deletions mapie/control_risk/ltt.py
@@ -1,5 +1,5 @@
 import warnings
-from typing import Any, List, Tuple, Union
+from typing import Any, List, Tuple

 import numpy as np

@@ -12,7 +12,7 @@ def ltt_procedure(
     r_hat: NDArray,
     alpha_np: NDArray,
     delta: float,
-    n_obs: Union[int, NDArray],
+    n_obs: NDArray,
     binary: bool = False,
 ) -> List[List[Any]]:
     """
@@ -24,28 +24,36 @@ def ltt_procedure(
     - Apply a family wise error rate algorithm, here Bonferonni correction
     - Return the index lambdas that give you the control at alpha level

+    Note that in the case of multi-risk, the arrays r_hat, alpha_np, and n_obs
+    should have the same length for the first dimension which corresponds
+    to the number of risks. In the case of a single risk, the length should be 1.
+
     Parameters
     ----------
-    r_hat: NDArray of shape (n_lambdas, ).
+    r_hat: NDArray of shape (n_risks, n_lambdas).
         Empirical risk with respect to the lambdas.
         Here lambdas are thresholds that impact decision-making,
         therefore empirical risk.

-    alpha_np: NDArray of shape (n_alpha, ).
+    alpha_np: NDArray of shape (n_risks, n_alpha).
         Contains the different alphas control level.
         The empirical risk should be less than alpha with
         probability 1-delta.
+        Note: MAPIE 1.2 does not support multiple risks and multiple alphas
+        simultaneously.
+        For PrecisionRecallController, the shape should be (1, n_alpha).
+        For BinaryClassificationController, the shape should be (n_risks, 1).

     delta: float.
         Probability of not controlling empirical risk.
         Correspond to proportion of failure we don't
         want to exceed.

-    n_obs: Union[int, NDArray]
+    n_obs: NDArray of shape (n_risks, n_lambdas).
         Correspond to the number of observations used to compute the risk.
         In the case of a conditional loss, n_obs must be the
         number of effective observations used to compute the empirical risk
-        for each lambda, hence of shape (n_lambdas, ).
+        for each lambda.

     binary: bool, default=False
         Must be True if the loss associated to the risk is binary.
@@ -62,11 +70,19 @@ def ltt_procedure(
     M. I., & Lei, L. (2021). Learn then test:
     "Calibrating predictive algorithms to achieve risk control".
     """
-    p_values = compute_hoeffding_bentkus_p_value(r_hat, n_obs, alpha_np, binary)
+    if not (r_hat.shape[0] == n_obs.shape[0] == alpha_np.shape[0]):
+        raise ValueError(
+            "r_hat, n_obs, and alpha_np must have the same length."
+        )
+    p_values = np.array([
+        compute_hoeffding_bentkus_p_value(r_hat_i, n_obs_i, alpha_np_i, binary)
+        for r_hat_i, n_obs_i, alpha_np_i in zip(r_hat, n_obs, alpha_np)
+    ])
+    p_values = p_values.max(axis=0)  # take max over risks (no effect if mono risk)
     N = len(p_values)
     valid_index = []
-    for i in range(len(alpha_np)):
-        l_index = np.where(p_values[:, i] <= delta/N)[0].tolist()
+    for i in range(alpha_np.shape[1]):
+        l_index = np.nonzero(p_values[:, i] <= delta/N)[0].tolist()
         valid_index.append(l_index)
     return valid_index
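
The multi-risk logic in this hunk (one p-value table per risk, a maximum over risks, then a Bonferroni threshold of `delta / N`) can be tried in isolation with the sketch below. It is a minimal illustration, not MAPIE's code: `hoeffding_p_value` is a hypothetical stand-in that uses a plain Hoeffding bound instead of `compute_hoeffding_bentkus_p_value`, and the name `ltt_multi_risk_sketch` and the sample numbers are made up for the example.

```python
import numpy as np


def hoeffding_p_value(r_hat, n_obs, alpha):
    """Stand-in p-value (plain Hoeffding bound), NOT MAPIE's Hoeffding-Bentkus.

    r_hat and n_obs have shape (n_lambdas,), alpha has shape (n_alpha,).
    Returns an array of shape (n_lambdas, n_alpha).
    """
    # Positive part of (alpha - r_hat), broadcast to (n_lambdas, n_alpha).
    gap = np.clip(alpha[np.newaxis, :] - r_hat[:, np.newaxis], 0.0, None)
    return np.exp(-2.0 * n_obs[:, np.newaxis] * gap**2)


def ltt_multi_risk_sketch(r_hat, alpha_np, delta, n_obs):
    """For each alpha column, list the lambda indices that pass every risk test."""
    if not (r_hat.shape[0] == n_obs.shape[0] == alpha_np.shape[0]):
        raise ValueError("r_hat, n_obs, and alpha_np must have the same length.")
    # One p-value table per risk: shape (n_risks, n_lambdas, n_alpha).
    p_values = np.array([
        hoeffding_p_value(r_hat_i, n_obs_i, alpha_i)
        for r_hat_i, n_obs_i, alpha_i in zip(r_hat, n_obs, alpha_np)
    ])
    # Keep the largest p-value over risks: shape (n_lambdas, n_alpha).
    p_values = p_values.max(axis=0)
    n_lambdas = len(p_values)
    # Bonferroni correction: compare each p-value to delta / n_lambdas.
    return [
        np.nonzero(p_values[:, i] <= delta / n_lambdas)[0].tolist()
        for i in range(alpha_np.shape[1])
    ]


# Two risks, three candidate lambdas, one alpha per risk (illustrative values).
r_hat = np.array([[0.05, 0.20, 0.50], [0.30, 0.10, 0.05]])
n_obs = np.full((2, 3), 1000)
alpha_np = np.array([[0.3], [0.3]])
valid_index = ltt_multi_risk_sketch(r_hat, alpha_np, delta=0.1, n_obs=n_obs)
print(valid_index)
```

With these inputs only the middle lambda has a small p-value for both risks, so the script prints `[[1]]`. Taking the maximum over risks means a lambda is accepted only when every individual risk test rejects at the corrected level.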

@@ -95,7 +111,7 @@ def find_lambda_control_star(
         the empirical risk is less than alpha.

     valid_index: List[List[Any]].
-        Contain the valid index that satisfy fwer control
+        Contain the valid index that satisfy FWER control
         for each alpha (length aren't the same for each alpha).

     lambdas: NDArray of shape (n_lambda, )
@@ -104,7 +120,7 @@ def find_lambda_control_star(
     Returns
     -------
     l_lambda_star: ArrayLike of shape (n_alpha, ).
-        The lambda that give the highest precision
+        The lambda that gives the minimum precision
         for a given alpha.

     r_star: ArrayLike of shape (n_alpha, ).
@@ -113,7 +129,8 @@ def find_lambda_control_star(
     if [] in valid_index:
         warnings.warn(
             """
-            Warning: At least one sequence is empty!
+            Warning: the risk couldn't be controlled for at least one value of alpha.
+            The corresponding lambdas have been set to 1.
             """
         )
     l_lambda_star = []  # type: List[Any]
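
The rest of `find_lambda_control_star` is not shown in this diff. The sketch below only illustrates the fallback that the new warning describes: when no lambda is valid for some alpha, lambda is set to 1. Everything else is assumed for the example, in particular the function name `pick_lambda_sketch`, the rule of picking the valid lambda with the lowest empirical risk, and the fallback risk value of 1.

```python
import warnings

import numpy as np


def pick_lambda_sketch(r_hat, valid_index, lambdas):
    """Pick one lambda per alpha from the valid indices (illustrative only)."""
    if [] in valid_index:
        warnings.warn(
            "The risk couldn't be controlled for at least one value of alpha. "
            "The corresponding lambdas have been set to 1."
        )
    lambda_star, r_star = [], []
    for indices in valid_index:
        if not indices:
            # Fallback described by the warning: lambda is set to 1.
            lambda_star.append(1)
            r_star.append(1)  # assumed placeholder, not confirmed by the diff
        else:
            # Assumed selection rule: lowest empirical risk among valid lambdas.
            best = indices[int(np.argmin(r_hat[indices]))]
            lambda_star.append(lambdas[best])
            r_star.append(r_hat[best])
    return lambda_star, r_star


r_hat = np.array([0.4, 0.1, 0.2])
lambdas = np.array([0.1, 0.5, 0.9])
# The first alpha has two valid lambdas, the second has none.
print(pick_lambda_sketch(r_hat, [[1, 2], []], lambdas))
```

The call emits the warning once and falls back to 1 for the second alpha, which is why callers should treat a returned lambda of 1 as "risk not controlled" rather than as a tuned threshold.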