The file `modeling_opt.py` shows an example of how a custom model can be defined using TRT-LLM APIs without modifying the source code of TRT-LLM.
The file `main.py` shows how to run inference for such custom models using the LLM API.
## Out-of-tree Multimodal Models
For multimodal models, TRT-LLM provides `quickstart_multimodal.py` to quickly run a multimodal model that is defined within TRT-LLM. `trtllm-bench` can be used for benchmarking such models.
However, the following sections describe how to use those tools for out-of-tree models.
### Prerequisites
To use an out-of-tree model with the quickstart example and `trtllm-bench`, you need to package the model definition files as a Python module.
Consider the following file structure as an example:
```
modeling_custom_phi
|-- __init__.py
|-- configuration.py
|-- modeling_custom_phi.py
|-- encoder
|   |-- __init__.py
|   |-- configuration.py
|   |-- modeling_encoder.py
```
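This skeleton can be created from a shell as follows (directory and file names are taken directly from the example structure above):

```shell
# Create the example package layout for an out-of-tree model.
mkdir -p modeling_custom_phi/encoder
touch modeling_custom_phi/__init__.py \
      modeling_custom_phi/configuration.py \
      modeling_custom_phi/modeling_custom_phi.py \
      modeling_custom_phi/encoder/__init__.py \
      modeling_custom_phi/encoder/configuration.py \
      modeling_custom_phi/encoder/modeling_encoder.py
```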
The `__init__.py` files should contain the right imports for the custom model. For example, `modeling_custom_phi/__init__.py` can contain something like:
```
from .modeling_custom_phi import MyVLMForConditionalGeneration
from . import encoder
```
### Quickstart Example
Once the model definition files are prepared as a Python module (as described above), you can use the `--custom_module_dirs` flag of `quickstart_multimodal.py` to load your model and run inference.
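Conceptually, pointing a tool at a custom module directory boils down to making that directory importable and importing it by name. The following pure-Python sketch illustrates this with `importlib`; it is an illustration of the mechanism, not TRT-LLM's actual loader, and the stand-in package contents are assumptions for the sake of a self-contained example:

```python
# Sketch: dynamically import a package directory such as `modeling_custom_phi`.
# This mimics what a custom-module-dir mechanism does; TRT-LLM's real loading
# logic may differ.
import importlib
import sys
import tempfile
from pathlib import Path


def load_custom_module(module_dir: str):
    """Import the package located at `module_dir` and return the module object."""
    path = Path(module_dir).resolve()
    # Make the package's parent directory importable, then import by name.
    sys.path.insert(0, str(path.parent))
    return importlib.import_module(path.name)


# Build a minimal stand-in package so the sketch runs on its own
# (the attribute names below are hypothetical placeholders).
root = Path(tempfile.mkdtemp()) / "modeling_custom_phi"
(root / "encoder").mkdir(parents=True)
(root / "__init__.py").write_text(
    "from . import encoder\nMODEL_NAME = 'custom_phi'\n"
)
(root / "encoder" / "__init__.py").write_text(
    "ENCODER_NAME = 'custom_encoder'\n"
)

mod = load_custom_module(str(root))
print(mod.MODEL_NAME)            # -> custom_phi
print(mod.encoder.ENCODER_NAME)  # -> custom_encoder
```

Because the package's `__init__.py` re-exports its submodules (as shown in the previous section), a single top-level import is enough to make the model and encoder definitions reachable.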