[ENH] Add a dataset statistical analysis tool (e.g., analyze_data) #195

@prashu0705

Is your feature request related to a problem? Please describe.
Currently, sktime-mcp has excellent tools for data loading (data_tools), model instantiation, and evaluation (evaluate.py), but it lacks a tool for an agent to natively query the statistical characteristics of a loaded time series (such as stationarity, trend, and seasonality). In agentic workflows, an LLM often needs to analyze a time series before deciding which transformations or forecasting estimators to build.

Describe the solution you'd like
I propose adding an analyze_data.py (or describe_data.py) tool. Given a data_handle, this tool would return a statistical summary including:

  • Series length and frequency
  • Stationarity (via the augmented Dickey-Fuller test, e.g., statsmodels adfuller)
  • Presence of trend (e.g., using linear regression slope)
  • Presence and strength of seasonality (e.g., using ACF)

Describe alternatives you've considered
Without such a tool, agents currently have to either guess data properties blindly or ask the user to compute and provide these statistics manually.

Additional context
I am currently working on a custom agentic forecaster for the sktime ESoC 2026 track, and I have already written the logic to compute these statistics locally using standard sktime/statsmodels checks. I would love to contribute this as a new MCP tool via a Pull Request.
