
[ENH] add list_metrics and compute_metric MCP tools (#79)#137

Open
rupeshca007 wants to merge 2 commits into sktime:main from rupeshca007:feat/list-metrics-compute-metric

Conversation

@rupeshca007
Contributor

Summary

Closes #79
Adds two new MCP tools that expose sktime's forecasting performance metrics to LLM agents:

  • list_metrics - returns a catalogue of all supported metrics (name, description, lower_is_better, scale_dependent, requires_y_train).
  • compute_metric - evaluates any metric from the catalogue on explicit y_true / y_pred list inputs. Scale-normalised metrics (MASE, RMSSE) additionally accept a y_train argument.
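The catalogue shape implied by the field list above can be sketched as follows. This is a hypothetical illustration of one entry, not the actual contents of src/sktime_mcp/tools/metrics.py; the field names come from the PR text, and validate_entry is an invented helper for the example.

```python
# One hypothetical catalogue entry, using the five fields named in the PR:
# name, description, lower_is_better, scale_dependent, requires_y_train.
MASE_ENTRY = {
    "name": "MASE",
    "description": "Mean Absolute Scaled Error",
    "lower_is_better": True,
    "scale_dependent": False,  # normalised by y_train, so scale-free
    "requires_y_train": True,
}


def validate_entry(entry: dict) -> bool:
    """Check that a catalogue entry carries all documented fields."""
    required = {
        "name", "description", "lower_is_better",
        "scale_dependent", "requires_y_train",
    }
    return required <= entry.keys()
```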

Supported Metrics

| Name | Description | Scale-dependent | Requires y_train |
| --- | --- | --- | --- |
| MAE | Mean Absolute Error | yes | |
| MSE | Mean Squared Error | yes | |
| RMSE | Root Mean Squared Error | yes | |
| MAPE | Mean Absolute Percentage Error | | |
| SMAPE | Symmetric MAPE | | |
| MASE | Mean Absolute Scaled Error | | yes |
| RMSSE | Root Mean Squared Scaled Error | | yes |
| MedAE | Median Absolute Error | yes | |
| MedSE | Median Squared Error | yes | |
| MedRMSE | Median Root Mean Squared Error | yes | |
| MedAPE | Median Absolute Percentage Error | | |
| MedSAPE | Symmetric Median APE | | |
| GMAE | Geometric Mean Absolute Error | yes | |
| GMSE | Geometric Mean Squared Error | yes | |
| GRMSE | Geometric Root Mean Squared Error | yes | |
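For the two scale-normalised metrics, y_train supplies the denominator. A minimal pure-Python sketch of the MASE formula (the PR presumably delegates to sktime's own metric implementations, which also handle seasonal periodicity and multioutput data):

```python
def mase(y_true, y_pred, y_train, sp=1):
    """Mean Absolute Scaled Error (sketch): forecast MAE divided by the
    in-sample MAE of a naive forecast that repeats the value sp steps
    back. Values below 1.0 beat the naive baseline."""
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    naive_mae = sum(
        abs(y_train[i] - y_train[i - sp]) for i in range(sp, len(y_train))
    ) / (len(y_train) - sp)
    return mae / naive_mae
```

For example, `mase([6, 7], [6, 8], y_train=[1, 2, 3, 4, 5])` returns 0.5: the forecast MAE is 0.5 while the naive in-sample MAE is 1.0, which is why these metrics cannot be computed without y_train.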

Files Changed

| File | Purpose |
| --- | --- |
| src/sktime_mcp/tools/metrics.py | New file - tool implementations |
| src/sktime_mcp/tools/__init__.py | Export new tools |
| src/sktime_mcp/server.py | Register tool schemas + dispatch |
| tests/test_metrics.py | 22 tests (8 pass locally, 14 skipped when sktime not installed; all 22 run on CI) |

How LLMs Can Use This

# Step 1: discover what metrics exist
list_metrics()

# Step 2: evaluate a forecast
compute_metric(metric="RMSE", y_true=[100,110,120], y_pred=[98,112,118])
# -> {"success": true, "metric": "RMSE", "value": 2.0, "lower_is_better": true}

# Step 3: MASE needs training data for normalisation
compute_metric(metric="MASE", y_true=[...], y_pred=[...], y_train=[...])
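A hedged sketch of the name-to-callable dispatch that compute_metric suggests. The registry below uses illustrative pure-Python implementations for two of the metrics, not sktime's actual metric classes, and the error shape is an assumption, not the PR's confirmed behaviour:

```python
import math

# Hypothetical metric registry: name -> callable(y_true, y_pred).
METRICS = {
    "MAE": lambda yt, yp: sum(abs(t - p) for t, p in zip(yt, yp)) / len(yt),
    "RMSE": lambda yt, yp: math.sqrt(
        sum((t - p) ** 2 for t, p in zip(yt, yp)) / len(yt)
    ),
}


def compute_metric(metric, y_true, y_pred):
    """Resolve a metric name and evaluate it, returning a result dict."""
    if metric not in METRICS:
        # Unknown names yield a structured error instead of raising.
        return {"success": False, "error": f"unknown metric: {metric}"}
    return {
        "success": True,
        "metric": metric,
        "value": METRICS[metric](y_true, y_pred),
    }
```

On the example above, `compute_metric("RMSE", [100, 110, 120], [98, 112, 118])` gives a value of 2.0, since the errors are 2, -2, 2.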

Checklist

- [x] New tool file follows existing patterns (tools/describe_estimator.py, tools/evaluate.py)
- [x] Registered in list_tools() with full JSON schema (including required fields)
- [x] Dispatch added to call_tool()
- [x] Tests written for both happy-path and error cases
- [x] pytest.mark.skipif used (not importorskip) so list_metrics tests always run without sktime installed

@rupeshca007 rupeshca007 changed the title feat: add list_metrics and compute_metric MCP tools [ENH] add list_metrics and compute_metric MCP tools (#79) Apr 11, 2026


Development

Successfully merging this pull request may close these issues.

[ENH] Full Forecasting Evaluation Framework: flexible CV, metric discovery, and train/test split tools
