This repository was archived by the owner on Sep 6, 2025. It is now read-only.

Commit 93e51b1 (1 parent: 168d2a2)

archive: everything not saved will be lost

File tree

2 files changed (+279, -274 lines)

README.md

Lines changed: 3 additions & 274 deletions
# OpenRouter Runner
This repo is an artifact from the early days of OpenRouter, when the open-source LLM provider ecosystem was much more nascent. Early adopters wanted to chat with niche fine-tunes, so we built this with vLLM to serve them.

OpenRouter Runner is a monolith inference engine, built with [Modal](https://modal.com/). It serves as a robust solution for deploying the many open-source models that are hosted in a fallback capacity on [openrouter.ai](https://openrouter.ai).

No longer in use, but preserved here for historical purposes.

> ✨ If you can make the Runner run faster and cheaper, we'll route to your services!

#### Table of Contents

- [Adding Models To OpenRouter (Video)](#adding-models-to-openrouter)
- [Prerequisites](#prerequisites)
- [Quickstart](#quickstart)
- [Adding New Models](#adding-new-models)
  - [Adding a New Model With Existing Containers](#adding-a-new-model-with-existing-containers)
  - [Adding a New Model With New Container](#adding-a-new-model-requiring-a-new-container)
- [Configuration and Testing](#configuration-and-testing)
- [Deploying](#deploying)
- [Contributions](#contributions)

# Adding Models To OpenRouter

[![Watch the video](https://img.youtube.com/vi/Ob9xx44Gb_o/maxresdefault.jpg)](https://youtu.be/Ob9xx44Gb_o)

# Prerequisites

Before you begin, ensure you have the necessary accounts and tools:

1. **Modal Account**: Set up your environment on [Modal](https://modal.com/), as this will be your primary deployment platform.
2. **Hugging Face Account**: Obtain a token from [Hugging Face](https://huggingface.co/) for accessing models and libraries.
3. **Poetry Installed**: Make sure you have [Poetry](https://python-poetry.org/docs/) installed on your machine.
# Quickstart

This section is for those already familiar with the OpenRouter Runner who want to deploy it quickly. It assumes you have already set up the [prerequisites](#prerequisites) and are ready to start deploying.

1. **Navigate to the modal directory.**

   ```shell
   cd path/to/modal
   ```

2. **Set Up Poetry**

   ```shell
   poetry install
   poetry shell
   modal token new
   ```

   > ℹ️ For IntelliSense, it's recommended to launch VS Code from within the Poetry shell:

   ```shell
   poetry shell
   code .
   ```

3. **Create a dev environment**

   ```shell
   modal environment create dev
   ```

   > ℹ️ If you already have a dev environment, there's no need to create another one. Just switch to it in the next step.

4. **Configure the dev environment**

   ```shell
   modal config set-environment dev
   ```

   > ⚠️ We are using the dev environment right now. Switch to **main** when deploying to production.

5. **Configure secret keys**

   - **Hugging Face Token**:
     Create a Modal secret group with your Hugging Face token. Replace `<your huggingface token>` with the actual token.

     ```shell
     modal secret create huggingface HUGGINGFACE_TOKEN=<your huggingface token>
     ```

   - **Runner API Key**:
     Create a Modal secret group for the runner API key. Replace `<generate a random key>` with a strong, random key you've generated. Be sure to save this key somewhere, as we'll need it later!

     ```shell
     modal secret create ext-api-key RUNNER_API_KEY=<generate a random key>
     ```

   - **Sentry Configuration**:
     Create a Modal secret group for Sentry error tracking. Replace `<optional SENTRY_DSN>` with your DSN from sentry.io, or leave it blank to disable Sentry (e.g. `SENTRY_DSN=`). You can also set an environment by adding `SENTRY_ENVIRONMENT=<environment name>` to the command.

     ```shell
     modal secret create sentry SENTRY_DSN=<optional SENTRY_DSN>
     ```

   - **Datadog Configuration**:
     Create a Modal secret group for Datadog log persistence. Replace `<optional DD_API_KEY>` with your Datadog API key, or leave it blank to disable Datadog (e.g. `DD_API_KEY=`). You can also set an environment by adding `DD_ENV=<environment name>` and a site by adding `DD_SITE=<site name>` to the command.

     ```shell
     modal secret create datadog DD_API_KEY=<optional DD_API_KEY> DD_SITE=<site name>
     ```
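For the `RUNNER_API_KEY` above, any sufficiently long random string will do. One way to generate one is with Python's standard `secrets` module (a convenience sketch, not something the repo requires):

```python
import secrets

# 32 bytes of randomness, URL-safe encoded (roughly 43 characters).
key = secrets.token_urlsafe(32)
print(key)
```

Paste the printed value into the `modal secret create ext-api-key ...` command and keep a copy; you'll want it again for your `.env.dev` file later.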
6. **Download Models**

   ```shell
   modal run runner::download
   ```

7. **Deploy Runner**

   ```shell
   modal deploy runner
   ```

## Adding New Models

With your environment now fully configured, you're ready to deploy the OpenRouter Runner. This section guides you through deploying the Runner, adding new models or containers, and running tests to ensure everything works as expected.

### Adding a New Model with Existing Containers

Adding new models to OpenRouter Runner is straightforward, especially when using models from Hugging Face that are compatible with existing containers. Here's how to do it:

1. **Find and Copy the Model ID**: Browse [Hugging Face](https://huggingface.co/models) for the model you wish to deploy. For example, let's use `"mistralai/Mistral-7B-Instruct-v0.2"`.

2. **Update the Model List**: Open the `runner/containers/__init__.py` file. Add your new model ID to the `DEFAULT_CONTAINER_TYPES` dictionary, using the container definition you want:

   ```python
   DEFAULT_CONTAINER_TYPES = {
       "Intel/neural-chat-7b-v3-1": ContainerType.VllmContainer_7B,
       "mistralai/Mistral-7B-Instruct-v0.2": ContainerType.VllmContainer_7B,
       ...
   }
   ```

3. **Handle Access Permissions**: If you plan to deploy a gated model like `"meta-llama/Llama-2-13b-chat-hf"` and you don't yet have access, visit [its model page](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) for instructions on requesting access. In the meantime, you can comment out this model in the list to proceed with deployment.

4. **Download and Prepare Models**: Use the CLI to execute the `runner::download` function within your application. This command downloads and prepares the required models for your containerized app.

   ```shell
   modal run runner::download
   ```

   This step does not deploy your app, but ensures all necessary models are downloaded and ready for when you do deploy. After running this command, check the specified storage location or the logs to confirm that the models have been downloaded. Note that depending on the size and number of models, this process can take some time.

5. **Start Testing the Models**: Now you can go to the [Configuration and Testing](#configuration-and-testing) section to start testing your models!

### Adding a New Model Requiring a New Container

Sometimes the model you want to deploy requires an environment or configuration that isn't supported by the existing containers. This might be due to special software requirements, different machine types, or other model-specific needs. In these cases, you'll need to create a new container.

1. **Understand the Requirements**: Before creating a new container, make sure you understand your model's specific requirements. These might include special libraries, hardware needs, or environment settings.

2. **Copy a Container File**: Start by copying an existing container file from `runner/containers`. This gives you a template that's already integrated with the system.

   ```shell
   cp runner/containers/existing_container.py runner/containers/new_container.py
   ```

3. **Customize the Container**: Modify the new container file. Change the class name to something unique, and adjust the image, machine type, engine, and any other settings to meet your model's needs. Remember to install any additional libraries or tools your model requires.

4. **Register the Container**: Open `./containers/__init__.py`. Add an import statement for your new container class at the top of the file, then create a new list of model IDs (or update an existing one) to include your model:

   ```python
   from .new_container import NewContainerClass

   new_model_ids = [
       "your-model-id",
       # Add more model IDs as needed.
   ]
   ```

5. **Associate Models**: Add a `ContainerType` for your model in `modal/shared/protocol.py` and define how to build it in `get_container(model_path: Path, container_type: ContainerType)` in `modal/runner/containers/__init__.py`.

6. **Download and Prepare Models**: Use the CLI to execute the `runner::download` function within your application. This command downloads and prepares the required models for your containerized app.

   ```shell
   modal run runner::download
   ```

   This step does not deploy your app, but ensures all necessary models are downloaded and ready for when you do deploy. After running this command, check the specified storage location or the logs to confirm that the models have been downloaded. Note that depending on the size and number of models, this process can take some time.

7. **Start Testing**: With your new container registered, proceed to the [Configuration and Testing](#configuration-and-testing) section to begin testing your model!

> [!NOTE]
> Creating a new container can be complex and requires a good understanding of the model's needs and the system's capabilities. If you encounter difficulties, consult the detailed documentation or seek support from the community or help forums.
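The registration and association steps above can be pictured end to end with a minimal sketch. `ContainerType` and `get_container` are real names from `modal/shared/protocol.py` and `modal/runner/containers/__init__.py`, but the stub classes, enum values, and constructor signature below are illustrative assumptions, not the repo's actual code:

```python
from enum import Enum
from pathlib import Path


# Stand-in container classes; the real ones live in runner/containers/.
class VllmContainer_7B:
    def __init__(self, model_path: Path):
        self.model_path = model_path


class NewContainerClass:
    def __init__(self, model_path: Path):
        self.model_path = model_path


class ContainerType(Enum):
    VllmContainer_7B = "VllmContainer_7B"
    NewContainerClass = "NewContainerClass"


def get_container(model_path: Path, container_type: ContainerType):
    """Map a ContainerType to a concrete container instance (shape assumed)."""
    dispatch = {
        ContainerType.VllmContainer_7B: VllmContainer_7B,
        ContainerType.NewContainerClass: NewContainerClass,
    }
    if container_type not in dispatch:
        raise ValueError(f"Unknown container type: {container_type}")
    return dispatch[container_type](model_path)
```

The point is the shape of the mapping: each model ID resolves to a `ContainerType`, and `get_container` turns that type plus a downloaded model path into a concrete container.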

## Configuration and Testing

Before diving into testing your models and endpoints, it's essential to properly configure your environment and install all necessary dependencies. This section guides you through setting up your environment, running test scripts, and ensuring everything is functioning correctly.

### Setting Up Your Environment

1. **Create a `.env.dev` File**: In the root of your project, create a `.env.dev` file to store your environment variables. This file should include:

   ```plaintext
   API_URL=<MODAL_API_ENDPOINT_THAT_WAS_DEPLOYED>
   RUNNER_API_KEY=<CUSTOM_KEY_YOU_CREATED_EARLIER>
   MODEL=<MODEL_YOU_ADDED_OR_WANT_TO_TEST>
   ```

   - `API_URL`: Your endpoint URL, printed when you downloaded the models. You can also find it on your Modal dashboard.
   - `RUNNER_API_KEY`: The custom key you created earlier.
   - `MODEL`: The identifier of the model you wish to test.

2. **Install Dependencies**:
   If you haven't already, install the following dependencies.

   - If you're working with the TypeScript scripts, you'll need to install the Node.js packages. Use the appropriate package manager for your project:

     ```shell
     npm install
     # or
     pnpm install
     ```
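Before running the test scripts, it can help to confirm that the three `.env.dev` variables are actually set in your shell. This small helper is optional and not part of the repo:

```python
import os

REQUIRED_VARS = ("API_URL", "RUNNER_API_KEY", "MODEL")


def check_env() -> dict:
    """Return the required variables, raising if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED_VARS}
```

Run it after `source .env.dev`; a `RuntimeError` here usually means the file wasn't sourced in the current shell.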

### Running Your App for Testing

1. **Ensure the Runner is Active**: Make sure your OpenRouter Runner is running. From the `openrouter-runner/modal` directory, you can start it with:

   ```shell
   modal serve runner
   ```

   This command keeps your app running and ready for testing.

2. **Open Another Terminal for Testing**: While keeping the runner active, open a new terminal window. Navigate to the `/openrouter-runner` path to be in the correct directory for running scripts.

### Testing a Model

Now that your environment is set up and your app is running, you're ready to start testing models.

1. **Navigate to the Project Root**: Ensure you're in the root directory of your project.

   ```shell
   cd path/to/openrouter-runner
   ```

2. **Load Environment Variables**: Source your `.env.dev` file to load the environment variables.

   ```shell
   source .env.dev
   ```

3. **Choose a Test Script**: In the `scripts` directory, you'll find various scripts for testing different aspects of your models. For a simple test, you might start with `test-simple.ts`.

4. **Run the Test Script**: Execute the script with your model identifier using the command below. Replace `YourModel/Identifier` with the specific model you want to test.

   ```shell
   pnpm x scripts/test-simple.ts YourModel/Identifier
   ```

   > [!NOTE]
   > If you wish to make the results more legible, especially for initial tests, consider setting `stream: false` in your script to turn off streaming.

5. **View the Results**: After running the script, you'll see JSON-formatted output in your terminal. It provides the generated text along with the number of tokens used in the prompt and completion. If you've set `stream: false`, the text will be displayed in its entirety, making it easier to review the model's output.

   **Example Response**:

   ```json
   {
     "text": "Project A119 was a top-secret program run by the United States government... U.S. nuclear and military policies.",
     "prompt_tokens": 23,
     "completion_tokens": 770,
     "done": true
   }
   ```

   *Note: The response has been truncated for brevity.*

6. **Troubleshooting**: If you encounter errors related to Hugging Face models, ensure you've installed `huggingface_hub` and have the correct access permissions for the models you're trying to use.

By following these steps, you should be able to set up your environment, deploy your app, and start testing various models and endpoints. Remember to consult the detailed documentation and seek support from the community if you face any issues.
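The TypeScript test scripts wrap a plain HTTP call to the runner. If you'd rather poke the endpoint from Python, a request can be assembled along these lines; note that the body field names and bearer-token auth are assumptions inferred from the example response above, so check the `scripts/` directory for the exact request shape:

```python
import json
import os


def build_completion_request(prompt: str) -> tuple[str, dict, bytes]:
    """Assemble the URL, headers, and JSON body for a runner call.

    The payload fields ("model", "prompt", "stream") are assumed for
    illustration; consult scripts/test-simple.ts for the real contract.
    """
    url = os.environ["API_URL"]
    headers = {
        "Content-Type": "application/json",
        # The RUNNER_API_KEY created earlier, sent as a bearer token (assumed).
        "Authorization": f"Bearer {os.environ['RUNNER_API_KEY']}",
    }
    body = json.dumps(
        {"model": os.environ["MODEL"], "prompt": prompt, "stream": False}
    ).encode()
    return url, headers, body
```

The returned triple can be fed to any HTTP client (`urllib.request`, `httpx`, and so on); with streaming off, the response should be a single JSON object like the example above.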

## Deploying

Deploying your model is the final step in making your AI capabilities accessible for live use. Here's how to deploy and what to expect:

1. **Deploy to Modal**:
   When you feel confident in your setup and testing, deploy your runner to Modal with the following command:

   ```shell
   modal deploy runner
   ```

   This command deploys your runner to Modal, packaging your configurations and models into a live, accessible application.

2. **View Your Deployment**:
   After deployment, visit your dashboard on [Modal](https://modal.com/). You should see your newly deployed model listed there. The dashboard provides useful information and controls for managing your deployment.

3. **Interact with Your Live Model**:
   With your model deployed, you can now call the endpoints live. Use the API URL from your `.env.dev` file (or from your Modal dashboard) to send requests and receive AI-generated responses. This is where you see the real power of your OpenRouter Runner in action.

4. **Monitor and Troubleshoot**:
   Keep an eye on your application's performance and logs through the Modal dashboard. If you encounter any issues or unexpected behavior, consult the logs for insights and adjust your configuration as necessary.

By following these steps, your OpenRouter Runner will be live and ready to serve!

## Contributions

We'd love to see you add more models to the Runner! If you're interested in contributing, please follow the section on [Adding New Models](#adding-new-models) to start adding more open-source models to OpenRouter! In addition, please adhere to our [code of conduct](./CODE_OF_CONDUCT.md) to maintain a healthy and welcoming community.

Check out the current models list [here](https://openrouter.ai/models), or head over to [our docs](https://openrouter.ai/docs/quickstart) to learn more about our modern features for growing startups and enterprises alike.
