
MLflow for Darts implementation #3022

Draft
jakubchlapek wants to merge 56 commits into unit8co:master from jakubchlapek:feat/mlflow-base

Conversation

@jakubchlapek
Collaborator

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Addresses #2092.

Summary

Provides a custom MLflow flavor for Darts, implemented on Darts' side. It supports autologging, logging, saving, and loading of models.
This PR focuses on the base MLflow integration, leaving model serving to be discussed in the future.

An example quickstart for the integration is included; however, consider all of this a draft :)
The full example code is in the .ipynb, but here is a quick reproducible snippet as well:

import mlflow
import tempfile
import os
from darts.metrics.metrics import smape
from darts.utils.mlflow import load_model, autolog
from darts.models import NBEATSModel, LinearRegressionModel
from darts.datasets import AirPassengersDataset
from torchmetrics import MeanAbsoluteError

# temp file setup
tmpdir = tempfile.mkdtemp()
mlflow_db = os.path.join(tmpdir, "mlflow.db")
mlflow.set_tracking_uri(f"sqlite:///{mlflow_db}")
mlflow.set_experiment("darts-forecasting")

train, val = AirPassengersDataset().load().astype("float32").split_before(0.7)

# autologging - patches .fit() on all ForecastingModel subclasses.
# for PyTorch-based models, inject_per_epoch_callbacks injects a Lightning callback
# that automatically logs train/val loss and/or user-specified torch metrics at the end of each epoch.
autolog(
    log_models=True,
    log_params=True,
    log_training_metrics=True,
    log_validation_metrics=True,   # requires val_series in .fit()
    inject_per_epoch_callbacks=True, 
    extra_metrics=[smape],         # optional extra darts metric functions
)

with mlflow.start_run(run_name="nbeats") as run:
    model = NBEATSModel(
        input_chunk_length=24, 
        output_chunk_length=12,
        torch_metrics=MeanAbsoluteError())
    # val_series is forwarded to Lightning's val_dataloaders;
    # autolog captures per-epoch val metrics via the injected callback
    model.fit(train, val_series=val, epochs=10)
    run_id = run.info.run_id


# regression/sklearn models work identically
with mlflow.start_run(run_name="linreg"):
    model = LinearRegressionModel(lags=12)
    model.fit(train)  # logs params + in-sample metrics

# load back from MLflow
loaded = load_model(f"runs:/{run_id}/model")
preds = loaded.predict(12, series=train) # need to specify series as we save with clean=True in save_model

# import shutil
# shutil.rmtree(tmpdir)
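The autologging behaviour described above (patching `.fit()` on all `ForecastingModel` subclasses) can be sketched in plain Python. All names here (`Base`, `Child`, `LOGGED`, `get_all_subclasses`, `autolog`) are illustrative stand-ins for the idea, not the PR's actual API:

```python
# Hypothetical sketch of a fit()-patching autolog, assuming only that models
# share a common base class. LOGGED stands in for mlflow.log_params.
import functools

LOGGED = []  # stand-in for mlflow.log_params


def get_all_subclasses(cls):
    """Recursively collect every subclass of cls."""
    out = []
    for sub in cls.__subclasses__():
        out.append(sub)
        out.extend(get_all_subclasses(sub))
    return out


class Base:
    def fit(self, series):
        return self


class Child(Base):
    def __init__(self, lags=3):
        self.lags = lags


def autolog(base=Base):
    # Patch fit() only on classes that define it themselves, so an
    # inherited fit() is not wrapped (and logged) more than once.
    for cls in [base, *get_all_subclasses(base)]:
        if "fit" not in vars(cls):
            continue
        original = cls.fit

        @functools.wraps(original)
        def patched(self, *args, _orig=original, **kwargs):
            LOGGED.append(dict(vars(self)))  # record constructor params first
            return _orig(self, *args, **kwargs)

        cls.fit = patched


autolog()
fitted = Child(lags=5).fit([1, 2, 3])
```

The real implementation additionally injects Lightning callbacks for torch models; this sketch only shows the subclass walk and the wrap-then-delegate pattern.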

@review-notebook-app

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@jakubchlapek
Collaborator Author

Hey @daidahao, adding this draft PR in the meantime so you and @dennisbader can have a look at what I currently have regarding the integration. There are still some decisions I am not too thrilled about, and decisions to be made about the overall direction, but I'm happy to talk more about it during the meeting. Thanks for being so active in the library; it's really nice to be working together :)

@daidahao
Contributor

Hi @jakubchlapek

Sorry to hear that. Thank you for all your work and I can attest that you absolutely built a solid foundation for MLflow in Darts!

@daidahao
Contributor

daidahao commented Mar 2, 2026

@jakubchlapek @dennisbader @mizeller

I've reviewed and provided some comments here for the code (except for the unit tests and notebook). Overall, I agree with most of the design choices by @jakubchlapek; there are only a few deviations from mine (e.g., using MLflow's existing APIs, Darts' raise_log(), logging fit() params) which I believe could benefit the current implementation.

I wonder if it would be easier if @jakubchlapek could allow me to edit the code directly to address my comments here? That way, @mizeller can then continue the work here with the best of both worlds, and focus on unit tests or extension, etc. Let me know what you think.

Once again, thank you for the great work @jakubchlapek, and I am glad that our implementations align so well.

@mizeller
Contributor

mizeller commented Mar 3, 2026

@daidahao Perfect timing - I wrapped my head around @jakubchlapek's implementation & had a short meeting w/ @dennisbader last Friday about what needs to be done.

I did fork off of @jakubchlapek's branch here.

You can add your changes there and we can merge into @jakubchlapek's branch when we're done (or wait for him to give you access to his branch). I've just added you as a collaborator there. Whatever you prefer :)

I'll try to squeeze this PR in this week and, if that's okay, ping you here if something's unclear!

@jakubchlapek
Collaborator Author

Hey @daidahao, @mizeller. I have added you both as collaborators now, so you should be able to edit the code directly here.

daidahao added 3 commits March 5, 2026 09:34 (Co-authored-by: Zhihao Dai <zhihao.dai@eng.ox.ac.uk>)

pyproject.toml (Outdated)

]
notorch = [
    "catboost>=1.0.6,<=1.2.9",
    "statsforecast>=1.4",
    "xgboost>=2.1.4",
]
mlflow = ["mlflow>=2.0"]
Contributor

I would like a discussion on the new option here. My understanding is that users who need the Darts-MLflow integration probably have MLflow installed and set up properly already. For users who have not, MLflow itself has options for Databricks, which some users might find useful. Could we instead direct users to MLflow's official installation guide?

Contributor

Also, @MichaelVerdegaal raised a suggestion for mlflow>=3.0. I have not used MLflow 2.x before but I think the minimum version should be deliberated as well.

does not apply the `with_managed_run` wrapper to the specified
`patch_function`.
"""
# Enable/disable mlflow.pytorch.autolog for per-epoch metrics on torch models.
Contributor

Sorry, I don't understand why the decorator would short-circuit here if we call mlflow.pytorch.autolog() with disable=True. Looking at the XGBoost flavour, it seems they are able to call mlflow.sklearn._autolog() within mlflow.xgboost.autolog(). Is it because mlflow.sklearn._autolog() is not wrapped but mlflow.pytorch.autolog() is?
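For readers following along, the short-circuit in question can be illustrated in plain Python. This is not MLflow's actual code; `autologging_integration`, `CALLS`, and `pytorch_autolog` here are simplified stand-ins for the wrapping behaviour being discussed:

```python
# Illustrative sketch: a decorator like MLflow's autologging_integration can
# return early when disable=True, so the decorated flavour's patching body
# never runs. Names are hypothetical stand-ins, not MLflow internals.
import functools

CALLS = []  # records which flavours actually ran their patching logic


def autologging_integration(name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, disable=False, **kwargs):
            if disable:
                return None  # short-circuit: skip the body entirely
            CALLS.append(name)
            return fn(*args, disable=disable, **kwargs)
        return wrapper
    return decorator


@autologging_integration("pytorch")
def pytorch_autolog(disable=False):
    pass  # would install Lightning patches here


pytorch_autolog(disable=True)   # body skipped by the wrapper
pytorch_autolog(disable=False)  # body runs, patches installed
```

An unwrapped helper (analogous to a private _autolog) would not have this early-return guard, which would explain the asymmetry raised above.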



daidahao added 7 commits March 5, 2026 09:47 (Co-authored-by: Zhihao Dai <zhihao.dai@eng.ox.ac.uk>)
daidahao and others added 13 commits March 5, 2026 14:52 (Co-authored-by: Zhihao Dai <zhihao.dai@eng.ox.ac.uk>)
@daidahao
Contributor

daidahao commented Mar 7, 2026

@mizeller @jakubchlapek @dennisbader

Greetings! I've addressed most of the comments here, except for a few discussion points.

I've left a TODO note on post-fitting metrics which, IMHO, are hard to implement at this point due to how MLflow manages active runs in the autolog context. In short, we would need to keep a mapping between MLflow run ids, fitted models, model predictions, and metrics to ensure the metrics are logged under the right run id (see mlflow.sklearn).

Sincere apologies for suggesting post-fitting metrics in the first place! I didn't realise the complexity involved.

My suggestion is to skip post-fitting metrics for now or settle for compromises such as non-terminated active runs (at the risk of cross-logging).
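To make the bookkeeping problem concrete, here is a minimal sketch of the kind of mapping post-fitting metrics would require. All names (`FITTED_UNDER`, `record_fit`, `log_post_fit_metric`) are hypothetical; the dict `metric_store` stands in for MLflow's tracking client:

```python
# Hypothetical sketch: autolog must remember which MLflow run each model was
# fitted under, so metrics computed later (after predict) land on the right
# run id rather than whatever run happens to be active at that time.
FITTED_UNDER = {}  # model id -> run id, populated inside the patched fit()


def record_fit(model, run_id):
    FITTED_UNDER[id(model)] = run_id


def log_post_fit_metric(model, name, value, metric_store):
    run_id = FITTED_UNDER.get(id(model))
    if run_id is None:
        raise ValueError("model was not fitted under an autolog run")
    # stand-in for MlflowClient().log_metric(run_id, name, value)
    metric_store.setdefault(run_id, {})[name] = value


store = {}
model_a, model_b = object(), object()
record_fit(model_a, "run-1")
record_fit(model_b, "run-2")
log_post_fit_metric(model_a, "smape", 4.2, store)
log_post_fit_metric(model_b, "smape", 3.1, store)
```

Even this toy version shows the cost: every patched fit() must register its model, and the mapping must stay alive until prediction time, which is the complexity referred to above.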

Other than that, I am truly proud of what we have achieved here and will hand this over to @mizeller for follow-ups and more great work.
