v0.1.0

@jackzhxng released this 05 Nov 20:42

We're excited to announce the first release of Optimum ExecuTorch!

Export Transformers models to run on supported ExecuTorch backends

Optimum ExecuTorch supports exporting and running Transformers models on ExecuTorch's selection of backends, including:

  • XNNPack
  • CUDA
  • Core ML
  • Metal

XNNPack is currently the best-supported backend, so all of the supported models listed below should perform well on CPU. Results may vary on the other backends.

Popular LLMs, multimodal, and more

We support the following models out of the box:

LLMs (Large Language Models)

Decoder-only
  • Codegen: Salesforce's codegen-350M-mono and its variants
  • Gemma: Gemma-2b and its variants
  • Gemma2: Gemma-2-2b and its variants
  • Gemma3: Gemma-3-1b and its variants (270M, 1B)
  • Glm: glm-edge-1.5b and its variants
  • Gpt2: gpt-sw3-126m and its variants
  • GptJ: gpt-j-405M and its variants
  • GptNeoX: EleutherAI's pythia-14m and its variants
  • GptNeoXJapanese: gpt-neox-japanese-2.7b and its variants
  • Granite: granite-3.3-2b-instruct and its variants
  • Llama: Llama-3.2-1B and its variants
  • Mistral: Ministral-3b-instruct and its variants
  • Qwen2: Qwen2.5-0.5B and its variants
  • Qwen3: Qwen3-0.6B, Qwen3-Embedding-0.6B and other variants
  • Olmo: OLMo-1B-hf and its variants
  • Phi: JSL-MedPhi2-2.7B and its variants
  • Phi4: Phi-4-mini-instruct and its variants
  • Smollm: 🤗 SmolLM2-135M and its variants
  • Smollm3: 🤗 SmolLM3-3B and its variants
  • Starcoder2: starcoder2-3b and its variants
Encoder-decoder (Seq2Seq)
  • T5: Google's T5 and its variants

NLU (Natural Language Understanding)

  • Albert: albert-base-v2 and its variants
  • Bert: Google's bert-base-uncased and its variants
  • Distilbert: distilbert-base-uncased and its variants
  • Eurobert: EuroBERT-210m and its variants
  • Roberta: FacebookAI's xlm-roberta-base and its variants

Vision Models

  • Cvt: Convolutional Vision Transformer
  • Deit: Distilled Data-efficient Image Transformer (base-sized)
  • Dit: Document Image Transformer (base-sized)
  • EfficientNet: EfficientNet (b0-b7 sized)
  • Focalnet: FocalNet (tiny-sized)
  • Mobilevit: Apple's MobileViT xx-small
  • Mobilevit2: Apple's MobileViTv2
  • Pvt: Pyramid Vision Transformer (tiny-sized)
  • Swin: Swin Transformer (tiny-sized)

Audio Models

ASR (Automatic Speech Recognition)

  • Whisper: OpenAI's Whisper and its variants

Speech text-to-text

  • Granite Speech: granite-speech-3.3-2b and its variants
  • Voxtral: Mistral's newest speech/text-to-text model

Contributors

@echarlaix @guangy10 @jackzhxng @tugsbayasgalan @chmjkb @kimishpatel @metascroy @manuelcandales @desertfire @digantdesai @larryliu0820