Commit ddee0d5

Updating CHANGELOG, changing PCovC fit() note
1 parent 78e7051 commit ddee0d5

6 files changed: +30 -29 lines changed


CHANGELOG

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@ The rules for CHANGELOG file:
 
 0.3.0 (XXXX/XX/XX)
 ------------------
+- Add ``_BasePCov`` class (#248)
+- Add ``PCovC`` class that inherits shared functionality from ``_BasePCov`` (#248)
+- Add ``PCovC`` testing suite and examples (#248)
+- Modify ``PCovR`` to inherit shared functionality from ``_BasePCov`` (#248)
 - Update to sklearn >= 1.6.0 and scipy >= 1.15.0 (#239)
 - Fixed moved function import from scipy and bump scipy dependency to 1.15.0 (#236)
 - Fix rendering issues for `SparseKDE` and `QuickShift` (#236)

docs/src/bibliography.rst

Lines changed: 6 additions & 0 deletions
@@ -45,3 +45,9 @@ References
     Michele Ceriotti, "Improving Sample and Feature Selection with Principal Covariates
     Regression" 2021 Mach. Learn.: Sci. Technol. 2 035038.
     https://iopscience.iop.org/article/10.1088/2632-2153/abfe7c.
+
+.. [Jorgensen2025]
+    Christian Jorgensen, Arthur Y. Lin, and Rose K. Cersonsky,
+    "Interpretable Visualizations of Data Spaces for Classification Problems"
+    2025 arXiv. 2503.05861
+    https://doi.org/10.48550/arXiv.2503.05861.

docs/src/references/decomposition.rst

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-Principal Covariates Regression (PCovR) and Classification (PCovC)
-==================================================================
+Hybrid Mapping Techniques (PCovR and PCovC)
+===========================================
 
 .. _PCovR-api:
 

src/skmatter/decomposition/__init__.py

Lines changed: 3 additions & 6 deletions
@@ -25,19 +25,16 @@
 original PCovR method, proposed in [Helfrecht2020]_.
 """
 
-from ._pcov import _BasePCov, pcovr_covariance, pcovr_kernel
+from ._pcov import _BasePCov
 
 from ._pcovr import PCovR
-from ._kernel_pcovr import KernelPCovR
-
 from ._pcovc import PCovC
 
+from ._kernel_pcovr import KernelPCovR
 
 __all__ = [
     "_BasePCov",
-    "pcovr_covariance",
-    "pcovr_kernel",
     "PCovR",
-    "KernelPCovR",
     "PCovC",
+    "KernelPCovR",
 ]

src/skmatter/decomposition/_pcovc.py

Lines changed: 13 additions & 19 deletions
@@ -20,7 +20,8 @@
 
 
 class PCovC(LinearClassifierMixin, _BasePCov):
-    r"""Principal Covariates Classification determines a latent-space projection :math:`\mathbf{T}`
+    r"""Principal Covariates Classification, as described in [Jorgensen2025]_,
+    determines a latent-space projection :math:`\mathbf{T}`
     which minimizes a combined loss in supervised and unsupervised tasks.
 
     This projection is determined by the eigendecomposition of a modified gram
@@ -219,8 +220,16 @@ def __init__(
         self.classifier = classifier
 
     def fit(self, X, Y, W=None):
-        r"""Fit the model with X and Y. Depending on the dimensions of X,
-        calls either `_fit_feature_space` or `_fit_sample_space`.
+        r"""Fit the model with X and Y. Note that W is taken from the
+        coefficients of a linear classifier fit between X and Y to compute
+        Z:
+
+        .. math::
+            \mathbf{Z} = \mathbf{X} \mathbf{W}
+
+        We then call either `_fit_feature_space` or `_fit_sample_space`,
+        using Z as our approximation of Y. Finally, we refit a classifier on
+        T and Y to obtain :math:`\mathbf{P}_{TZ}`.
 
         Parameters
         ----------
@@ -237,24 +246,9 @@ def fit(self, X, Y, W=None):
             Training data, where n_samples is the number of samples.
 
         W : numpy.ndarray, shape (n_features, n_properties)
-            Classification weights, optional when classifier=`precomputed`. If
+            Classification weights, optional when classifier= `precomputed`. If
             not passed, it is assumed that the weights will be taken from a
             linear classifier fit between :math:`\mathbf{X}` and :math:`\mathbf{Y}`
-
-        Notes
-        -----
-        Note the relationship between :math:`\mathbf{X}`, :math:`\mathbf{Y}`,
-        :math:`\mathbf{Z}`, and :math:`\mathbf{W}`. The classification weights
-        :math:`\mathbf{W}`, obtained through a linear classifier fit between
-        :math:`\mathbf{X}` and :math:`\mathbf{Y}`, are used to compute:
-
-        .. math::
-            \mathbf{Z} = \mathbf{X} \mathbf{W}
-
-        Next, :math:`\mathbf{Z}` is used in either `_fit_feature_space` or
-        `_fit_sample_space` as our approximation of :math:`\mathbf{Y}`.
-        Finally, we refit a classifier on :math:`\mathbf{T}` and :math:`\mathbf{Y}`
-        to obtain :math:`\mathbf{P}_{XZ}` and :math:`\mathbf{P}_{TZ}`
         """
         X, Y = validate_data(self, X, Y, y_numeric=False)
         check_classification_targets(Y)
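
The `fit()` note in this file describes computing Z = XW from the coefficients of a linear classifier fit between X and Y. A minimal sketch of that one step, assuming scikit-learn's `LogisticRegression` as the linear classifier (the diff does not name one) and leaving the intercept out of Z. The toy data and variable names are illustrative and are not taken from skmatter:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (hypothetical): 60 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
Y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 4))
X[:, 0] += Y          # shift one feature by class so the classes are learnable
X -= X.mean(axis=0)   # center the features

# Fit a linear classifier between X and Y (assumed choice of classifier).
clf = LogisticRegression().fit(X, Y)

# coef_ has shape (n_classes, n_features); transpose to (n_features, n_classes).
W = clf.coef_.T

# Z = XW: continuous class evidence used as the approximation of Y.
Z = X @ W
```

In this sketch `Z + clf.intercept_` equals `clf.decision_function(X)`, so Z is the classifier's decision values up to the intercept.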

src/skmatter/decomposition/_pcovr.py

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@
 
 
 class PCovR(RegressorMixin, MultiOutputMixin, _BasePCov):
-    r"""Principal Covariates Regression, as described in [deJong1992]_
+    r"""Principal Covariates Regression, as described in [deJong1992]_,
     determines a latent-space projection :math:`\mathbf{T}` which
     minimizes a combined loss in supervised and unsupervised tasks.
 
@@ -225,7 +225,7 @@ def fit(self, X, Y, W=None):
             regressed form of the properties, :math:`{\mathbf{\hat{Y}}}`.
 
         W : numpy.ndarray, shape (n_features, n_properties)
-            Regression weights, optional when regressor=`precomputed`. If not
+            Regression weights, optional when regressor= `precomputed`. If not
             passed, it is assumed that `W = np.linalg.lstsq(X, Y, self.tol)[0]`
         """
         X, Y = validate_data(self, X, Y, y_numeric=True, multi_output=True)
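
The default stated for `W` in the PCovR docstring can be tried in isolation. A minimal sketch with NumPy, in which `tol` is a hypothetical stand-in for `self.tol` (passed as `rcond`, the third argument of `np.linalg.lstsq`) and the data are synthetic:

```python
import numpy as np

# Synthetic data (hypothetical): 50 samples, 4 features, 2 properties.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
true_W = rng.normal(size=(4, 2))
Y = X @ true_W + 0.01 * rng.normal(size=(50, 2))

tol = 1e-12  # stand-in for self.tol; cutoff for small singular values

# Least-squares regression weights, shape (n_features, n_properties).
W = np.linalg.lstsq(X, Y, rcond=tol)[0]

# Regressed form of the properties.
Y_hat = X @ W
```

The weights have one row per feature and one column per property, matching the documented shape `(n_features, n_properties)`.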
