# OpenRouter Runner

This repo is an artifact from the early days of OpenRouter, when the open-source LLM provider ecosystem was much more nascent. Early adopters wanted to chat with niche fine-tunes, so we built this with vLLM to serve them.

No longer in use, but preserved here for historical purposes.

OpenRouter Runner was a monolithic inference engine, built with [Modal](https://modal.com/). It served as a robust solution for deploying many open-source models, hosted in a fallback capacity on [openrouter.ai](https://openrouter.ai).

> ✨ If you can make the Runner run faster and cheaper, we'll route to your services!

#### Table of Contents
- [Adding Models To OpenRouter (Video)](#adding-models-to-openrouter)
- [Prerequisites](#prerequisites)
- [Quickstart](#quickstart)
- [Adding New Models](#adding-new-models)
  - [Adding a New Model With Existing Containers](#adding-a-new-model-with-existing-containers)
  - [Adding a New Model With New Container](#adding-a-new-model-requiring-a-new-container)
- [Configuration and Testing](#configuration-and-testing)
- [Deploying](#deploying)
- [Contributions](#contributions)

# Adding Models To OpenRouter

[Watch the video](https://youtu.be/Ob9xx44Gb_o)

# Prerequisites

Before you begin, ensure you have the necessary accounts and tools:

1. **Modal Account**: Set up your environment on [Modal](https://modal.com/), as this will be your primary deployment platform.
2. **Hugging Face Account**: Obtain a token from [Hugging Face](https://huggingface.co/) for accessing models and libraries.
3. **Poetry Installed**: Make sure you have [Poetry](https://python-poetry.org/docs/) installed on your machine.

# Quickstart

This section is for those already familiar with the OpenRouter Runner who want to deploy it quickly. It assumes you have already set up the [prerequisites](#prerequisites) and are ready to deploy.

1. **Navigate to the modal directory.**

    ```shell
    cd path/to/modal
    ```

2. **Set up Poetry**

    ```shell
    poetry install
    poetry shell
    modal token new
    ```

    > ℹ️ For IntelliSense, it's recommended to launch VS Code from within the Poetry shell:

    ```shell
    poetry shell
    code .
    ```

3. **Create dev environment**

    ```shell
    modal environment create dev
    ```

    > ℹ️ If you already have a dev environment, there's no need to create another one. Just configure it in the next step.

4. **Configure dev environment**

    ```shell
    modal config set-environment dev
    ```

    > ⚠️ We're using the dev environment for now. Switch to **main** when deploying to production.

5. **Configure secret keys** (a sketch after this list shows how deployed functions read these secrets)

    - **Hugging Face Token**:
      Create a Modal secret group with your Hugging Face token. Replace `<your huggingface token>` with the actual token.
      ```shell
      modal secret create huggingface HUGGINGFACE_TOKEN=<your huggingface token>
      ```
    - **Runner API Key**:
      Create a Modal secret group for the runner API key. Replace `<generate a random key>` with a strong, random key you've generated. Be sure to save this key somewhere, as we'll need it later!
      ```shell
      modal secret create ext-api-key RUNNER_API_KEY=<generate a random key>
      ```
    - **Sentry Configuration**:
      Create a Modal secret group for Sentry error tracking. Replace `<optional SENTRY_DSN>` with your DSN from sentry.io, or leave it blank to disable Sentry (e.g. `SENTRY_DSN=`). You can also set an environment by adding `SENTRY_ENVIRONMENT=<environment name>` to the command.
      ```shell
      modal secret create sentry SENTRY_DSN=<optional SENTRY_DSN>
      ```
    - **Datadog Configuration**:
      Create a Modal secret group for Datadog log persistence. Replace `<optional DD_API_KEY>` with your Datadog API key, or leave it blank to disable Datadog (e.g. `DD_API_KEY=`). You can also set an environment by adding `DD_ENV=<environment name>` and a site by adding `DD_SITE=<site name>` to the command.
      ```shell
      modal secret create datadog DD_API_KEY=<optional DD_API_KEY> DD_SITE=<site name>
      ```

6. **Download Models**

    ```shell
    modal run runner::download
    ```

7. **Deploy Runner**

    ```shell
    modal deploy runner
    ```
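
Here's the sketch promised in step 5: a minimal illustration (not the repo's actual code) of how a deployed Modal function reads the secret groups created above. Modal injects each secret's key/value pairs as environment variables; the stub and function names below are illustrative only, using the Stub-era Modal API current when this repo was active.

```python
import os

import modal

stub = modal.Stub("secrets-demo")  # illustrative name, not the runner's stub

@stub.function(secrets=[modal.Secret.from_name("huggingface")])
def check_token():
    # The "huggingface" secret group surfaces here as an environment variable.
    token = os.environ["HUGGINGFACE_TOKEN"]
    print(f"Hugging Face token loaded ({len(token)} chars)")
```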

## Adding New Models

With your environment fully configured, you're ready to extend the OpenRouter Runner. This section guides you through adding new models or containers and running tests to ensure everything works as expected.

### Adding a New Model with Existing Containers

Adding new models to OpenRouter Runner is straightforward, especially when using models from Hugging Face that are compatible with existing containers. Here's how to do it:

1. **Find and Copy the Model ID**: Browse [Hugging Face](https://huggingface.co/models) for the model you wish to deploy. For example, let's use `"mistralai/Mistral-7B-Instruct-v0.2"`.

2. **Update Model List**: Open the `runner/containers/__init__.py` file. Add your new model ID to the `DEFAULT_CONTAINER_TYPES` dictionary, using the container definition you want to use:
    ```python
    DEFAULT_CONTAINER_TYPES = {
        "Intel/neural-chat-7b-v3-1": ContainerType.VllmContainer_7B,
        "mistralai/Mistral-7B-Instruct-v0.2": ContainerType.VllmContainer_7B,
        ...
    }
    ```

3. **Handle Access Permissions**: If you plan to deploy a gated model like `"meta-llama/Llama-2-13b-chat-hf"` and don't yet have access, visit [its model page](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) for instructions on requesting access. In the meantime, you can comment out this model in the list to proceed with deployment.

4. **Download and Prepare Models**: Use the CLI to execute the `runner::download` function within your application (conceptually sketched after this list). This command downloads and prepares the required models for your containerized app.
    ```shell
    modal run runner::download
    ```
    This step does not deploy your app; it ensures all necessary models are downloaded and ready for when you do deploy. After running this command, check the specified storage location or logs to confirm that the models downloaded successfully. Depending on the size and number of models, this process can take some time.

5. **Start testing the Models**: Now you can go to the [Configuration and Testing](#configuration-and-testing) section to start testing your models!
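
For intuition, the download step amounts to fetching each registered model's weights into shared storage. Below is a hypothetical sketch of that idea using `huggingface_hub`, not the repo's actual `download` implementation; the function name and cache path are assumptions:

```python
from huggingface_hub import snapshot_download

def download_models(model_ids: list[str], cache_dir: str = "/models") -> None:
    """Fetch weights for every registered model into shared storage."""
    for model_id in model_ids:
        # Downloads every file in the model repo; already-cached files
        # are reused, so the call is safe to re-run after interruptions.
        snapshot_download(repo_id=model_id, cache_dir=cache_dir)

# e.g. download_models(list(DEFAULT_CONTAINER_TYPES))
```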

### Adding a New Model Requiring a New Container

Sometimes the model you want to deploy requires an environment or configurations that aren't supported by the existing containers. This might be due to special software requirements, different machine types, or other model-specific needs. In these cases, you'll need to create a new container.

1. **Understand the Requirements**: Before creating a new container, make sure you understand the specific requirements of your model. This might include special libraries, hardware needs, or environment settings.

2. **Copy a Container File**: Start by copying an existing container file from `runner/containers`. This gives you a template that's already integrated with the system.
    ```shell
    cp runner/containers/existing_container.py runner/containers/new_container.py
    ```

3. **Customize the Container**: Modify the new container file. Change the class name to something unique, and adjust the image, machine type, engine, and any other settings to meet your model's needs. Remember to install any additional libraries or tools required by your model.

4. **Register the Container**: Open `runner/containers/__init__.py`. Add an import statement for your new container class at the top of the file, then create a new list of model IDs or update an existing one to include your model.
    ```python
    from .new_container import NewContainerClass

    new_model_ids = [
        "your-model-id",
        # Add more model IDs as needed.
    ]
    ```

5. **Associate Models**: Add a `ContainerType` for your model in `modal/shared/protocol.py` and define how to build it in `get_container(model_path: Path, container_type: ContainerType)` in `modal/runner/containers/__init__.py` (see the sketch after this list).

6. **Download and Prepare Models**: Use the CLI to execute the `runner::download` function within your application. This command downloads and prepares the required models for your containerized app.
    ```shell
    modal run runner::download
    ```
    This step does not deploy your app; it ensures all necessary models are downloaded and ready for when you do deploy. After running this command, check the specified storage location or logs to confirm that the models downloaded successfully. Depending on the size and number of models, this process can take some time.

7. **Start Testing**: With your new container deployed, proceed to the [Configuration and Testing](#configuration-and-testing) section to begin testing your model!
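
To make step 5 concrete, here's a hypothetical sketch of the two registration points. The real enum and factory live in `modal/shared/protocol.py` and `modal/runner/containers/__init__.py`; the class and variant names below are stand-ins, not the repo's actual definitions:

```python
from enum import Enum
from pathlib import Path

class NewContainerClass:
    """Stand-in for the custom container class created in steps 2-3."""

    def __init__(self, model_path: Path):
        self.model_path = model_path

# In modal/shared/protocol.py: add an enum variant for the new container.
class ContainerType(str, Enum):
    NewContainerClass = "NewContainerClass"

# In modal/runner/containers/__init__.py: teach the factory to build it.
def get_container(model_path: Path, container_type: ContainerType):
    if container_type == ContainerType.NewContainerClass:
        return NewContainerClass(model_path)
    raise ValueError(f"Unknown container type: {container_type}")
```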

> [!NOTE]
> Creating a new container can be complex and requires a good understanding of the model's needs and the system's capabilities. If you encounter difficulties, consult the detailed documentation, or seek support from the community or help forums.

## Configuration and Testing

Before diving into testing your models and endpoints, it's essential to properly configure your environment and install all necessary dependencies. This section guides you through setting up your environment, running test scripts, and ensuring everything is functioning correctly.

### Setting Up Your Environment

1. **Create a `.env.dev` File**: In the root of your project, create a `.env.dev` file to store your environment variables. This file should include:
    ```plaintext
    API_URL=<MODAL_API_ENDPOINT_THAT_WAS_DEPLOYED>
    RUNNER_API_KEY=<CUSTOM_KEY_YOU_CREATED_EARLIER>
    MODEL=<MODEL_YOU_ADDED_OR_WANT_TO_TEST>
    ```
    - `API_URL`: Your endpoint URL, printed when you deploy or serve the runner. You can also find it on your Modal dashboard.
    - `RUNNER_API_KEY`: The custom key you created earlier.
    - `MODEL`: The identifier of the model you wish to test.

2. **Install Dependencies**:
    If you haven't already, install the following dependencies.

    - If you're working with the TypeScript scripts, you'll likely need to install Node.js packages. Use the appropriate package manager for your project:
      ```shell
      npm install
      # or
      pnpm install
      ```

### Running Your App for Testing

1. **Ensure the Runner is Active**: Make sure your OpenRouter Runner is running. From the `openrouter-runner/modal` directory, you can start it with:
    ```shell
    modal serve runner
    ```
    This command will keep your app running and ready for testing.

2. **Open Another Terminal for Testing**: While keeping the runner active, open a new terminal window. Navigate to the `/openrouter-runner` path to be in the correct directory for running scripts.

### Testing a Model

Now that your environment is set up and your app is running, you're ready to start testing models.

1. **Navigate to Project Root**: Ensure you're in the root directory of your project.
    ```shell
    cd path/to/openrouter-runner
    ```

2. **Load Environment Variables**: Source your `.env.dev` file to load the environment variables.
    ```shell
    source .env.dev
    ```

3. **Choose a Test Script**: In the `scripts` directory, you'll find various scripts for testing different aspects of your models. For a simple test, you might start with `test-simple.ts`.

4. **Run the Test Script**: Execute the script with your model identifier using the command below, replacing `YourModel/Identifier` with the specific model you want to test. (A Python equivalent is sketched after this list.)
    ```shell
    pnpm x scripts/test-simple.ts YourModel/Identifier
    ```

> [!NOTE]
> If you wish to make the results more legible, especially for initial tests, consider setting `stream: false` in your script to turn off streaming.

5. **Viewing Results**: After running the script, you'll see JSON-formatted output in your terminal. It provides the generated text along with the number of tokens used in the prompt and completion. If you've set `stream: false`, the text will be displayed in its entirety, making it easier to review the model's output.

    **Example Response**:
    ```json
    {
      "text": "Project A119 was a top-secret program run by the United States government... U.S. nuclear and military policies.",
      "prompt_tokens": 23,
      "completion_tokens": 770,
      "done": true
    }
    ```
    *Note: The response has been truncated for brevity.*

6. **Troubleshooting**: If you encounter errors related to Hugging Face models, ensure you've installed `huggingface_hub` and have the correct access permissions for the models you're trying to use.
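
If you'd rather hit the endpoint without Node, the sketch below sends a comparable request from Python. The auth header and request body shape here are assumptions for illustration; the authoritative schema is whatever `scripts/test-simple.ts` sends:

```python
import os

import requests

resp = requests.post(
    os.environ["API_URL"],
    headers={"Authorization": f"Bearer {os.environ['RUNNER_API_KEY']}"},
    json={
        # Assumed field names; check scripts/test-simple.ts for the real schema.
        "id": os.environ["MODEL"],
        "prompt": "Tell me about Project A119.",
        "stream": False,  # non-streamed output is easier to read
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json())  # {"text": ..., "prompt_tokens": ..., "completion_tokens": ..., "done": ...}
```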

By following these steps, you should be able to set up your environment, deploy your app, and start testing various models and endpoints. Remember to consult the detailed documentation and seek support from the community if you face any issues.

## Deploying

Deploying your model is the final step in making your AI capabilities accessible for live use. Here's how to deploy and what to expect:

1. **Deploy to Modal**:
    When you feel confident with your setup and testing, deploy your runner to Modal with the following command:
    ```shell
    modal deploy runner
    ```
    This command deploys your runner to Modal, packaging your configurations and models into a live, accessible application.

2. **View Your Deployment**:
    After deployment, visit your dashboard on [Modal](https://modal.com/). You should see your newly deployed model listed there. This dashboard provides useful information and controls for managing your deployment.

3. **Interact with Your Live Model**:
    With your model deployed, you can now call the endpoints live. Use the API URL from your `.env.dev` file (or from your Modal dashboard) to send requests and receive AI-generated responses. This is where you see the real power of your OpenRouter Runner in action.

4. **Monitor and Troubleshoot**:
    Keep an eye on your application's performance and logs through the Modal dashboard. If you encounter any issues or unexpected behavior, consult the logs for insights and adjust your configuration as necessary.

By following these steps, your OpenRouter Runner will be live and ready to serve!

## Contributions

We'd love to see you add more models to the Runner! If you're interested in contributing, please follow the section on [Adding a New Model](#adding-new-models) to start adding more open-source models to OpenRouter! In addition, please adhere to our [code of conduct](./CODE_OF_CONDUCT.md) to maintain a healthy and welcoming community.

Check out the current models list [here](https://openrouter.ai/models), or head over to [our docs](https://openrouter.ai/docs/quickstart) to learn more about our modern features for growing startups and enterprises alike.