We're excited to announce the first release of Optimum ExecuTorch!
Export Transformers models to run on supported ExecuTorch backends
Optimum ExecuTorch supports exporting and running Transformers models on ExecuTorch's selection of backends, including:
- XNNPACK
- CUDA
- Core ML
- Metal
XNNPACK is currently the best-supported backend, so all of the supported models listed below will have good performance on CPU. Results may vary for the other backends.
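As a rough illustration of the export step, the sketch below assembles an `optimum-cli export executorch` command for the CPU backend. The flag names (`--model`, `--task`, `--recipe`, `--output_dir`) and the `xnnpack` recipe name follow the project's README as best recalled, so check them against `optimum-cli export executorch --help` before relying on them. The helper `build_export_command`, the Hub id `HuggingFaceTB/SmolLM2-135M`, and the output directory name are assumptions made for this example only.

```python
import shlex


def build_export_command(model_id, output_dir, task="text-generation", recipe="xnnpack"):
    """Assemble an export command as an argument list (hypothetical helper).

    The flag names are assumed from the project's README; verify them with
    `optimum-cli export executorch --help`.
    """
    return [
        "optimum-cli", "export", "executorch",
        "--model", model_id,        # Hugging Face Hub model id
        "--task", task,             # task the exported program will serve
        "--recipe", recipe,         # backend recipe; "xnnpack" targets CPU
        "--output_dir", output_dir,
    ]


# The model id and output directory below are illustrative assumptions.
cmd = build_export_command("HuggingFaceTB/SmolLM2-135M", "smollm2_exported")

# Print a shell-ready command line. To actually run the export, pass `cmd` to
# subprocess.run(cmd, check=True) in an environment with optimum-executorch installed.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems and makes it easy to swap in a different recipe when trying another backend.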
Popular LLMs, multimodal, and more
We support the following models out of the box:
LLMs (Large Language Models)
Decoder-only
- Codegen: Salesforce's codegen-350M-mono and its variants
- Gemma: Gemma-2b and its variants
- Gemma2: Gemma-2-2b and its variants
- Gemma3: Gemma-3-1b and its variants (270M, 1B)
- Glm: glm-edge-1.5b and its variants
- Gpt2: gpt-sw3-126m and its variants
- GptJ: gpt-j-405M and its variants
- GptNeoX: EleutherAI's pythia-14m and its variants
- GptNeoXJapanese: gpt-neox-japanese-2.7b and its variants
- Granite: granite-3.3-2b-instruct and its variants
- Llama: Llama-3.2-1B and its variants
- Mistral: Ministral-3b-instruct and its variants
- Qwen2: Qwen2.5-0.5B and its variants
- Qwen3: Qwen3-0.6B, Qwen3-Embedding-0.6B and other variants
- Olmo: OLMo-1B-hf and its variants
- Phi: JSL-MedPhi2-2.7B and its variants
- Phi4: Phi-4-mini-instruct and its variants
- Smollm: 🤗 SmolLM2-135M and its variants
- Smollm3: 🤗 SmolLM3-3B and its variants
- Starcoder2: starcoder2-3b and its variants
Encoder-decoder (Seq2Seq)
- T5: Google's T5 and its variants
NLU (Natural Language Understanding)
- Albert: albert-base-v2 and its variants
- Bert: Google's bert-base-uncased and its variants
- Distilbert: distilbert-base-uncased and its variants
- Eurobert: EuroBERT-210m and its variants
- Roberta: FacebookAI's xlm-roberta-base and its variants
Vision Models
- Cvt: Convolutional Vision Transformer
- Deit: Distilled Data-efficient Image Transformer (base-sized)
- Dit: Document Image Transformer (base-sized)
- EfficientNet: EfficientNet (b0-b7 sized)
- Focalnet: FocalNet (tiny-sized)
- Mobilevit: Apple's MobileViT xx-small
- Mobilevit2: Apple's MobileViTv2
- Pvt: Pyramid Vision Transformer (tiny-sized)
- Swin: Swin Transformer (tiny-sized)
Audio Models
ASR (Automatic Speech Recognition)
- Whisper: OpenAI's Whisper and its variants
Speech text-to-text (Automatic Speech Recognition)
- Granite Speech: granite-speech-3.3-2b and its variants
- Voxtral: Mistral's newest speech/text-to-text model
Contributors
@echarlaix @guangy10 @jackzhxng @tugsbayasgalan @chmjkb @kimishpatel @metascroy @manuelcandales @desertfire @digantdesai @larryliu0820