
Can't use vLLM endpoint #285

@brandonbiggs

Description


Hi. I'm trying to use AI Toolkit in VS Code, and I can't figure out how to use my local models. I'm running vLLM locally rather than Ollama. My config is pretty straightforward.

version: v0.1
providers:
  - name: Custom
    models:
      - name: gpt-oss-20b
        chat_completion_url: https://example.com/v1
        headers:
          Authorization: '{"Authorization: Bearer example_token"}'
        api_model_name: gpt-oss-20b

When I call this in the playground, I see the following error:

errorObject = {"innerError":{"status":404,"headers":{},"requestID":null}} 
2025-10-14 09:19:34.973 [error] Unable to call the gpt-oss-20b inference endpoint due to 404.  Please check if the input or configuration is correct. 404 status code (no body) 

The model works and is accessible; I just can't figure out how to hook it up to AI Toolkit. Any ideas would be helpful. Thanks.
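
A 404 from an OpenAI-compatible server such as vLLM usually points to a wrong request path rather than an unreachable server. vLLM serves chat completions at /v1/chat/completions, so one thing worth trying is pointing chat_completion_url at that full route and passing the Authorization header as a plain string. A minimal sketch of that variant, assuming the toolkit sends requests to the configured URL as-is (example.com and example_token are the placeholders from the original config):

version: v0.1
providers:
  - name: Custom
    models:
      - name: gpt-oss-20b
        # Full chat-completions route, not just the /v1 base
        # (assumption: the toolkit does not append the path itself).
        chat_completion_url: https://example.com/v1/chat/completions
        headers:
          # Header value as a plain string rather than a JSON-wrapped blob.
          Authorization: 'Bearer example_token'
        api_model_name: gpt-oss-20b

If the full path still returns 404, it may be worth confirming which route the server actually exposes, since some deployments sit behind a reverse proxy that adds its own prefix.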

Metadata

Labels: needs attention (the issue needs contributor's attention)
