@Jitmisra

What does this PR implement/fix?

This PR adds three new CLI subcommands under openml models to improve the user experience of the model catalogue:

  1. openml models list - List flows (models) with optional filtering (tag, uploader, pagination, output format)
  2. openml models info <flow_id> - Display detailed information about a specific flow
  3. openml models search <query> - Search flows by name with case-insensitive matching
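The three subcommands above could be wired up with nested argparse subparsers. This is a minimal sketch, not the PR's actual code: `build_parser` is a hypothetical helper, and the `--uploader` and `--offset` flag names are assumptions derived from the "uploader" and "pagination" filters mentioned in the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser for `openml models list|info|search`."""
    parser = argparse.ArgumentParser(prog="openml")
    top = parser.add_subparsers(dest="command", required=True)

    models = top.add_parser("models", help="Browse and search flows (models)")
    actions = models.add_subparsers(dest="action", required=True)

    # `openml models list`: optional filters, pagination, and output format.
    list_cmd = actions.add_parser("list", help="List flows")
    list_cmd.add_argument("--tag", default=None)
    list_cmd.add_argument("--uploader", default=None)  # assumed flag name
    list_cmd.add_argument("--offset", type=int, default=None)  # assumed flag name
    list_cmd.add_argument("--size", type=int, default=None)
    list_cmd.add_argument("--format", default=None)
    list_cmd.add_argument("--verbose", action="store_true")

    # `openml models info <flow_id>`: one positional integer ID.
    info_cmd = actions.add_parser("info", help="Show details of one flow")
    info_cmd.add_argument("flow_id", type=int)

    # `openml models search <query>`: one positional search string.
    search_cmd = actions.add_parser("search", help="Search flows by name")
    search_cmd.add_argument("query")

    return parser


args = build_parser().parse_args(["models", "list", "--tag", "sklearn", "--size", "10"])
print(args.command, args.action, args.tag, args.size)
```

Setting `required=True` on both subparser levels makes argparse print a usage error when the entity or action is missing, rather than leaving it to be handled manually.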

Why is this change necessary? What is the problem it solves?

Currently, users must write Python code to browse or search OpenML flows, even for simple tasks like listing available models or finding a specific model. This creates a barrier to entry and makes the model catalogue less accessible. Adding CLI commands allows users to interact with the model catalogue directly from the command line without writing code.

This directly addresses the ESoC 2025 goal of "Improving user experience of the model catalogue in AIoD and openML".

How can I reproduce the issue this PR is solving and its solution?

Before (requires Python code):
import openml
flows = openml.flows.list_flows(size=10)
for _, row in flows.iterrows():
    print(row['name'])

After (CLI commands):

# List first 10 flows
openml models list --size 10

# Search for RandomForest models
openml models search RandomForest

# Get detailed info about a model
openml models info 12345

# List models with a specific tag
openml models list --tag sklearn --format table --verbose

Implementation Details:

  • Added three new functions in openml/cli.py: models_list(), models_info(), and models_search()
  • Integrated into main CLI parser with proper argument handling
  • Added comprehensive test suite (6 test cases) in tests/test_openml/test_cli.py
  • Uses existing openml.flows.list_flows() and openml.flows.get_flow() functions - no changes to core API
  • Follows existing CLI patterns (similar to configure command)
  • All tests use mocked API calls to avoid requiring server connections
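The case-insensitive search and the mocked-API testing approach described above might look roughly like the following. This is a sketch under stated assumptions: `search_flows` and `fake_list_flows` are hypothetical names, and a list of dicts stands in for the tabular result of `openml.flows.list_flows()` so the example runs without the library or a server.

```python
from unittest import mock


def search_flows(list_flows, query):
    """Return flows whose name contains `query`, ignoring case."""
    needle = query.casefold()
    return [flow for flow in list_flows() if needle in flow["name"].casefold()]


# Hypothetical stand-in for the real listing call; no network access needed.
fake_list_flows = mock.Mock(
    return_value=[
        {"id": 1, "name": "sklearn.ensemble.RandomForestClassifier"},
        {"id": 2, "name": "weka.J48"},
        {"id": 3, "name": "sklearn.ensemble.randomforestregressor"},
    ]
)

matches = search_flows(fake_list_flows, "randomforest")
print([flow["id"] for flow in matches])

# The mock records its calls, so a test can check the API was hit once.
fake_list_flows.assert_called_once()
```

Passing the listing function as a parameter keeps the search logic testable in isolation. Patching the real function with `mock.patch` would achieve the same effect inside a test suite.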

Any other comments?

  • All pre-commit hooks pass (ruff, mypy, formatting)
  • No breaking changes
  • Follows project code style and patterns
  • Ready for review

@codecov-commenter

Codecov Report

❌ Patch coverage is 5.10204% with 93 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.26%. Comparing base (4b1bdf4) to head (593edd0).

Files with missing lines | Patch % | Lines
openml/cli.py | 5.10% | 93 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #1487       +/-   ##
============================================
- Coverage    79.90%   52.26%   -27.65%     
============================================
  Files           36       36               
  Lines         4349     4443       +94     
============================================
- Hits          3475     2322     -1153     
- Misses         874     2121     +1247     
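The percentages in the report can be cross-checked from the hit and line counts in the diff table. The sketch below is illustrative only. The patch total of 98 lines (5 covered plus 93 missing) is an inference from the reported 5.10204% and is not stated in the report itself.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100 * hits / lines, 2)


# Project coverage on the base (develop) and head commits, from the table.
base = coverage_pct(3475, 4349)
head = coverage_pct(2322, 4443)

# Patch coverage: 93 lines are reported missing. Assuming 5 lines were
# covered (an inference, not a reported figure) reproduces 5.10204%.
patch_missing = 93
patch_hits = 5
patch = round(100 * patch_hits / (patch_hits + patch_missing), 5)

print(base, head, patch)
```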

@joaquinvanschoren
Contributor

Thanks! Can you please update it to use the name 'flow' for now? We might introduce a new concept 'model' that is somewhat different from a flow, and we should avoid any confusion here.

Also, would it be possible to generalize this to other entities, e.g. datasets?

So far, we were only using this CLI for managing configurations. It makes sense to add a CLI, but longer term it would be better to have this as a separate repo, so that people don't have to install openml-python and developers don't have to run the full openml-python test suite. It would also keep the openml-python repo more focused.

For now, it would be ok to merge it here (after updating the naming) until it is more complete (with other entities like datasets) and then spin it out as a separate library.

@Jitmisra force-pushed the feature/cli-models-commands branch from 593edd0 to 6b4c645 on November 20, 2025 16:11
@Jitmisra
Author

Jitmisra commented Nov 20, 2025

Thanks for the review! I have updated the CLI to use "flow" terminology everywhere and added a flows namespace. I also added a datasets namespace (list/info/search) so the CLI is extensible to other entities. openml/cli.py now exposes openml flows ... and openml datasets ..., and the parser is organized so more entities can be added later.
I would be happy to work on anything else you have in mind.
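One way a parser can be organized so that more entities are easy to add is a small registry that maps each entity to its supported actions. This is a hypothetical sketch, not the code in this PR: `ENTITIES`, `build_parser`, and the `entity_id` argument name are all assumptions made for illustration.

```python
import argparse

# Hypothetical registry: entity name -> actions it supports.
ENTITIES = {
    "flows": ("list", "info", "search"),
    "datasets": ("list", "info", "search"),
}


def build_parser(entities=ENTITIES) -> argparse.ArgumentParser:
    """Create `openml <entity> <action>` subcommands from the registry."""
    parser = argparse.ArgumentParser(prog="openml")
    top = parser.add_subparsers(dest="entity", required=True)
    for entity, actions in entities.items():
        entity_parser = top.add_parser(entity)
        sub = entity_parser.add_subparsers(dest="action", required=True)
        for action in actions:
            cmd = sub.add_parser(action)
            if action == "info":
                cmd.add_argument("entity_id", type=int)
            elif action == "search":
                cmd.add_argument("query")
            else:
                # Any other action (here: list) takes an optional page size.
                cmd.add_argument("--size", type=int, default=None)
    return parser


args = build_parser().parse_args(["datasets", "search", "iris"])
print(args.entity, args.action, args.query)
```

With this layout, supporting another entity only requires a new registry entry, for example `{**ENTITIES, "tasks": ("list",)}`, while the parsing code stays unchanged.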
