
bug: issue getting started with OpenLLM #1186

@iamobservable

Description


Describe the bug

I found OpenLLM recently while researching abstractions for local model execution, and the repo stood out. I may have initially misunderstood the options for using the library. After reading the README, being able to run an open source model via Docker was enticing. However, after reading the repository description I am left wondering whether it is possible to use the library without some form of BentoML.

Questions:

1. Did I misunderstand the instructions, or do I have a misunderstanding of the tool❓
2. Does OpenLLM require some form of BentoML cloud❓

When following the Getting Started section of the documentation, the execution does not produce the output shown there. Details about the execution can be found in the "To reproduce" section of this issue.

To reproduce

Steps followed:

  1. Clean ~/.openllm configuration folder
  2. Create new folder for project
  3. Initialize folder with uv with python 3.12.8
  4. Add openllm to project
  5. Execute openllm hello
  6. Output describes updating default and nightly, but does not seem to continue as expected

Steps


1. Clean ~/.openllm configuration folder

rm -rf ~/.openllm

2. Create new folder for project

# ~/local-gpt
mkdir openllm-api
cd openllm-api

3. Initialize folder with uv with python 3.12.8

# ~/local-gpt/openllm-api
uv init -p 3.12

Initialized project openllm-api

4. Add openllm to project

# ~/local-gpt/openllm-api
uv add openllm

Using CPython 3.12.8
Creating virtual environment at: .venv
Resolved 98 packages in 1.36s
Prepared 94 packages in 16.38s
Installed 94 packages in 244ms

  • a2wsgi==1.10.8
  • aiohappyeyeballs==2.6.1
  • aiohttp==3.11.18
  • aiosignal==1.3.2
  • aiosqlite==0.21.0
  • annotated-types==0.7.0
  • anyio==4.9.0
  • appdirs==1.4.4
  • asgiref==3.8.1
  • attrs==25.3.0
  • bentoml==1.4.8
  • cattrs==23.1.2
  • certifi==2025.4.26
  • charset-normalizer==3.4.2
  • click==8.1.8
  • click-option-group==0.5.7
  • cloudpickle==3.1.1
  • deprecated==1.2.18
  • distro==1.9.0
  • dulwich==0.22.8
  • filelock==3.18.0
  • frozenlist==1.6.0
  • fs==2.4.16
  • fsspec==2025.3.2
  • h11==0.16.0
  • hf-xet==1.1.0
  • httpcore==1.0.9
  • httpx==0.28.1
  • httpx-ws==0.7.2
  • huggingface-hub==0.30.2
  • idna==3.10
  • importlib-metadata==8.6.1
  • jinja2==3.1.6
  • jiter==0.9.0
  • kantoku==0.18.3
  • markdown-it-py==3.0.0
  • markupsafe==3.0.2
  • mdurl==0.1.2
  • multidict==6.4.3
  • numpy==2.2.5
  • nvidia-ml-py==12.570.86
  • openai==1.73.0
  • openllm==0.6.30
  • opentelemetry-api==1.32.1
  • opentelemetry-instrumentation==0.53b1
  • opentelemetry-instrumentation-aiohttp-client==0.53b1
  • opentelemetry-instrumentation-asgi==0.53b1
  • opentelemetry-sdk==1.32.1
  • opentelemetry-semantic-conventions==0.53b1
  • opentelemetry-util-http==0.53b1
  • packaging==25.0
  • pathspec==0.12.1
  • pip-requirements-parser==32.0.1
  • prometheus-client==0.21.1
  • prompt-toolkit==3.0.51
  • propcache==0.3.1
  • psutil==7.0.0
  • pyaml==25.1.0
  • pydantic==2.11.4
  • pydantic-core==2.33.2
  • pygments==2.19.1
  • pyparsing==3.2.3
  • python-dateutil==2.9.0.post0
  • python-dotenv==1.1.0
  • python-json-logger==3.3.0
  • python-multipart==0.0.20
  • pyyaml==6.0.2
  • pyzmq==26.4.0
  • questionary==2.1.0
  • requests==2.32.3
  • rich==14.0.0
  • schema==0.7.7
  • setuptools==80.3.0
  • shellingham==1.5.4
  • simple-di==0.1.5
  • six==1.17.0
  • sniffio==1.3.1
  • starlette==0.46.2
  • tabulate==0.9.0
  • tomli-w==1.2.0
  • tornado==6.4.2
  • tqdm==4.67.1
  • typer==0.15.3
  • typing-extensions==4.13.2
  • typing-inspection==0.4.0
  • urllib3==2.4.0
  • uv==0.7.2
  • uvicorn==0.34.2
  • watchfiles==1.0.5
  • wcwidth==0.2.13
  • wrapt==1.17.2
  • wsproto==1.2.0
  • yarl==1.20.0
  • zipp==3.21.0

5. Execute openllm hello

# ~/local-gpt/openllm-api
uv run openllm hello

updating repo default
updating repo nightly
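
For reference, this is how I expected to exercise a model once the hello flow completes and something is served (e.g. via the README's `openllm serve` quickstart). This is only a sketch: the port 3000, the endpoint path, and the model tag are assumptions on my part that I could not verify, because the command stalls at the output above.

# Hypothetical follow-up once a model is served; port, path, and model tag are assumptions
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:1b", "messages": [{"role": "user", "content": "Hello"}]}'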

Logs

Output provided above in the "To reproduce" section

Environment

Using a local environment, not cloud. All model execution is currently done directly through ollama, but I am interested in llama.cpp as well. Does OpenLLM require BentoML?

Platform: ollama
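
For comparison, my current workflow queries ollama directly through its OpenAI-compatible endpoint, roughly like this (a sketch; 11434 is ollama's default port, and the model tag is whatever I have pulled locally):

# Current ollama-based flow I was hoping to abstract with OpenLLM
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'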

System information (Optional)

Environment


platform: Windows 11 / WSL2 / Docker Compose (all execution done via WSL2)
memory: 128GB RAM
gpu: RTX 3080
Shell: bash
Package Manager: uv
Python Version: CPython 3.12.8
