@kindler-king
Metadata

Details

This PR adds a client-side URL length validation step to _send_request() in openml/_api_calls.py.
The goal is to prevent extremely long URLs from being sent to the server and to ensure that
test_too_long_uri behaves as expected.

🧩 What problem does this fix?

Currently, when a very long list of data_ids is provided (e.g., 10,000 IDs):

  1. The constructed URL becomes extremely long.
  2. The client sends the request to the server.
  3. The server returns HTTP 414 (“Request-URI Too Long”) with an HTML error page.
  4. This HTML triggers an XML parse error, which is wrapped into a generic:

🛠️ What does this PR change?

This PR adds a fail-fast check before any network request is made:

MAX_URL_LENGTH = 2000
if len(url) > MAX_URL_LENGTH:
    raise OpenMLServerError("URI too long!")

This ensures:

  1. No request is sent to the server when the URL exceeds the limit.
  2. Client behavior is deterministic regardless of server configuration.
  3. test_too_long_uri passes.
  4. Users get a clear error immediately, and the server is spared oversized requests.

The fallback 414 handling in __parse_server_exception() remains unchanged.
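
The check can be illustrated in isolation with a minimal sketch. It is not the code in openml/_api_calls.py: the OpenMLServerError class here is a local stand-in, the check_url_length helper is a hypothetical name, and the example.org endpoint and URL layout are placeholders used only to produce a long URL.

```python
MAX_URL_LENGTH = 2000  # limit taken from the PR description


class OpenMLServerError(Exception):
    """Local stand-in for the library's exception, for illustration only."""


def check_url_length(url: str, max_length: int = MAX_URL_LENGTH) -> None:
    """Raise before any network I/O if the URL exceeds max_length characters."""
    if len(url) > max_length:
        raise OpenMLServerError("URI too long!")


# Hypothetical endpoint and URL layout, used only to build example URLs.
base = "https://example.org/api/data/list/data_id/"
short_url = base + ",".join(str(i) for i in range(1, 11))
long_url = base + ",".join(str(i) for i in range(1, 10_001))

check_url_length(short_url)  # within the limit, so nothing is raised

try:
    check_url_length(long_url)
except OpenMLServerError as exc:
    rejected = str(exc)  # the long URL is rejected on the client side
else:
    rejected = None
```

Because the comparison runs on the final URL string, the outcome depends only on the client-side limit and not on how a particular server is configured.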

✔ Test Status

  1. test_too_long_uri now passes.
  2. Full test suite passes locally.
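
For reviewers who want to confirm the "no request is made" property, a test along the following lines could help. This is a sketch under assumptions, not the existing test_too_long_uri: send_request is a simplified hypothetical request path that takes an injectable transport callable, and OpenMLServerError is again a local stand-in.

```python
from unittest import mock

MAX_URL_LENGTH = 2000  # limit taken from the PR description


class OpenMLServerError(Exception):
    """Local stand-in for the library's exception, for illustration only."""


def send_request(url, transport):
    """Hypothetical simplified request path: validate first, then send."""
    if len(url) > MAX_URL_LENGTH:
        raise OpenMLServerError("URI too long!")
    return transport(url)


def check_no_request_on_long_url():
    """A too-long URL must raise without ever touching the transport."""
    transport = mock.Mock(return_value="<ok/>")
    url = "https://example.org/api/" + "1," * 10_000  # placeholder URL
    try:
        send_request(url, transport)
    except OpenMLServerError:
        raised = True
    else:
        raised = False
    assert raised
    transport.assert_not_called()


check_no_request_on_long_url()
```

Injecting the transport keeps the check free of real network access, so it behaves the same way in CI as it does locally.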

Development

Successfully merging this pull request may close these issues.

Adding client-side URL length validation to fix failing test_too_long_uri
