Contributor

@mgiannakopoulos mgiannakopoulos commented Nov 26, 2025

Summary

With this change we:

  • Dynamically resolve authentication and obo tokens per LLM/Embedding request.
  • Allow the hosting_platform and the endpoint_url of the provided InferenceAPI model to be passed in directly or resolved from environment variables.
  • Deprecate the http_client parameter in favor of timeout. It will be removed in the next release.
  • Deprecate the obo_token_getter parameter in favor of obo_token. It will be removed in the next release.
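The per-request resolution in the first bullet can be illustrated with a minimal sketch, assuming a token may be supplied as a static string, as a zero-argument callable, or left unset. The helper name `make_token_getter` and its `env_var` parameter are hypothetical and not part of this PR; the point is only that wrapping every form in a callable lets the token be looked up again on each request rather than once at construction time.

```python
import os
from typing import Callable, Optional, Union

TokenSource = Union[str, Callable[[], Optional[str]], None]


def make_token_getter(
    token: TokenSource,
    env_var: Optional[str] = None,
) -> Callable[[], Optional[str]]:
    """Wrap a token source in a zero-argument getter (hypothetical helper)."""
    if callable(token):
        # Already a getter: the caller invokes it again on every request.
        return token

    def getter() -> Optional[str]:
        # Read the environment at call time, not at construction time,
        # so a rotated token is picked up by the next request.
        if token is None and env_var is not None:
            return os.environ.get(env_var)
        return token

    return getter
```

A client built this way would call the getter immediately before sending each LLM or embedding request, which is presumably what makes the single obo_token parameter (static value or callable) sufficient on its own.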

JIRA Issues

MCDB-86327
MCDB-86334

Member

@nunogoncalves03 nunogoncalves03 left a comment

LGTM, thanks!

Comment on lines +48 to +109
if callable(api_key):
    api_key_getter_fn = api_key
else:
    def api_key_getter_fn() -> Optional[str]:
        if api_key is None:
            return os.environ.get('SINGLESTOREDB_USER_TOKEN')
        return api_key

if obo_token_getter is not None:
    obo_token_getter_fn = obo_token_getter
else:
    if callable(obo_token):
        obo_token_getter_fn = obo_token
    else:
        def obo_token_getter_fn() -> Optional[str]:
            return obo_token

# handle model info
if base_url is None:
    base_url = os.environ.get('SINGLESTOREDB_INFERENCE_API_BASE_URL')
if hosting_platform is None:
    hosting_platform = os.environ.get('SINGLESTOREDB_INFERENCE_API_HOSTING_PLATFORM')
if base_url is None or hosting_platform is None:
    inference_api_manager = (
        manage_workspaces().organizations.current.inference_apis
    )
    info = inference_api_manager.get(model_name=model_name)
else:
    info = InferenceAPIInfo(
        service_id='',
        model_name=model_name,
        name='',
        connection_url=base_url,
        project_id='',
        hosting_platform=hosting_platform,
    )
if base_url is not None:
    info.connection_url = base_url
if hosting_platform is not None:
    info.hosting_platform = hosting_platform

# Extract timeouts from http_client if provided
t = http_client.timeout if http_client is not None else None
connect_timeout = None
read_timeout = None
if t is not None:
    if isinstance(t, httpx.Timeout):
        if t.connect is not None:
            connect_timeout = float(t.connect)
        if t.read is not None:
            read_timeout = float(t.read)
        if connect_timeout is None and read_timeout is not None:
            connect_timeout = read_timeout
        if read_timeout is None and connect_timeout is not None:
            read_timeout = connect_timeout
    elif isinstance(t, (int, float)):
        connect_timeout = float(t)
        read_timeout = float(t)
if timeout is not None:
    connect_timeout = timeout
    read_timeout = timeout
    t = httpx.Timeout(timeout)
Collaborator

This code looks nearly identical to the code in SingleStoreChatFactory. Is there a reason they can't be combined into a single function that is called to do these operations instead of duplicating code?
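One possible shape for such a shared function, shown only for the timeout handling in the snippet above, is sketched here. The name `resolve_timeouts` is hypothetical and this is not the refactoring that was eventually made. To stay self-contained the sketch duck-types the client timeout (any object with `connect` and `read` attributes, as `httpx.Timeout` has) instead of importing httpx, and it converts an explicit `timeout` to float, which the original snippet does not do.

```python
from typing import Any, Optional, Tuple


def resolve_timeouts(
    client_timeout: Any = None,
    timeout: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (connect_timeout, read_timeout) in seconds (hypothetical helper)."""
    connect: Optional[float] = None
    read: Optional[float] = None

    if isinstance(client_timeout, (int, float)):
        # A bare number applies to both phases.
        connect = read = float(client_timeout)
    elif client_timeout is not None:
        # Duck-typed stand-in for httpx.Timeout: read .connect and .read.
        c = getattr(client_timeout, 'connect', None)
        r = getattr(client_timeout, 'read', None)
        connect = float(c) if c is not None else None
        read = float(r) if r is not None else None
        # If only one of the two is set, reuse it for the other.
        if connect is None:
            connect = read
        if read is None:
            read = connect

    # An explicit timeout argument overrides anything taken from the client.
    if timeout is not None:
        connect = read = float(timeout)
    return connect, read
```

Both factories could then call this one function and build whatever timeout object their underlying client expects from the returned pair.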

Contributor Author

I will do this refactoring as part of cleaning up the code along with the deprecated parameters.

@kesmit13 kesmit13 merged commit 8fd5a41 into main Dec 1, 2025
11 checks passed
@kesmit13 kesmit13 deleted the MCDB-86327 branch December 1, 2025 16:51

5 participants