Fix resolve chat completion URL #2540

Wauplin · 2024-09-13T14:36:53Z

With the introduction of base_url in #2384 and a refactor in #2410, some bugs were introduced in the logic that resolves the chat completion URL. This PR fixes this for good with a dedicated _resolve_chat_completion_url method. I added a bunch of parametrized tests to check every possible case. It is now possible to:

- pass a model id as InferenceClient(model=...)
- pass a model id as InferenceClient().chat_completion(model=...)
- pass a url as InferenceClient(model=...), InferenceClient(base_url=...) or InferenceClient().chat_completion(model=...)
  - the URL can have a trailing / or not
  - the URL can end with /v1 or not (for OpenAI compatibility)
  - the URL can end with /v1/chat/completions or not
  - the URL can be a local TGI instance or an Inference Endpoint url
- pass a url to InferenceClient and a model_id in chat_completion (already the case before)
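
For readers skimming the thread, the URL-suffix cases listed above can be sketched roughly as follows. This is only an illustration, not the actual _resolve_chat_completion_url implementation: the function name normalize_chat_completion_url is hypothetical, and the sketch assumes the input is already a URL (it does not cover resolving a model id).

```python
def normalize_chat_completion_url(url: str) -> str:
    """Hypothetical helper: map a base URL to its chat completion route.

    Assumption: `url` is an http(s) URL (local TGI instance or Inference
    Endpoint), not a model id. Model-id resolution is out of scope here.
    """
    # A trailing slash should not change the result.
    url = url.rstrip("/")

    # Already the full route: nothing to append.
    if url.endswith("/v1/chat/completions"):
        return url

    # OpenAI-style base URL ending with /v1: append the route only.
    if url.endswith("/v1"):
        return url + "/chat/completions"

    # Bare server URL: append the version prefix and the route.
    return url + "/v1/chat/completions"


if __name__ == "__main__":
    for candidate in (
        "http://localhost:8080",
        "http://localhost:8080/",
        "http://localhost:8080/v1",
        "http://localhost:8080/v1/chat/completions",
    ):
        print(candidate, "->", normalize_chat_completion_url(candidate))
```

Under these assumptions, all four inputs resolve to the same route, which is the behavior the list above describes for URLs.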

With all of these use cases tested and working, I think this is fixed for good :)

Two minor unrelated changes:

- some "type: ignore" comments to fix CI (after "Implemented https://github.com/huggingface/huggingface_hub/issues/2516" #2532)
- a better error in src/huggingface_hub/utils/_http.py (after an oversight in "Refacto error parsing (HfHubHttpError)" #2474)

Failing test is unrelated.

Wauplin · 2024-09-13T14:37:50Z

@MoritzLaurer FYI, I've made a full example for Inference Endpoints => InferenceClient and it works well:

from huggingface_hub import get_inference_endpoint


endpoint = get_inference_endpoint("smollm-360m-instruct-hgl")  # need to start an IE and copy its name
client = endpoint.resume().wait().client


output = client.chat_completion(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ]
)

print(output.choices[0].message.content)

hanouticelina

Looks good to me! thanks @Wauplin! Nice test coverage 😄

Wauplin · 2024-09-16T07:41:19Z

Thanks for the review! Failing tests are unrelated so I'll merge this one :)

Wauplin added 2 commits September 13, 2024 15:33

Add tests for test_resolve_chat_completion_url

67d2d2d

Fix passing chat completion url

2ed7697

Wauplin requested review from LysandreJik, MoritzLaurer and hanouticelina September 13, 2024 14:36

hanouticelina approved these changes Sep 13, 2024

Wauplin merged commit a49ca75 into main Sep 16, 2024
15 of 19 checks passed

Wauplin deleted the 2484-fix-passing-chat-completion-url branch September 16, 2024 07:41

Wauplin mentioned this pull request Sep 16, 2024

Passing a HF endpoint URL to client.chat_completion() doesn't seem to work anymore #2484

Closed

Wauplin mentioned this pull request May 6, 2025

[Inference Providers] fix inference with URL endpoints #3041

Merged