Oumi v0.4 Changelog
✨ gpt-oss Training and Inference
OpenAI released two highly anticipated open-weight models in August: gpt-oss-20b and gpt-oss-120b. They’re mixture-of-experts (MoE) reasoning models with strong tool-use performance, optimized with native 4-bit quantization for memory-efficient training and inference. You can now run training and inference on these models in Oumi!
Usage Example:
# Train gpt-oss-20b with LoRA on a single GPU
oumi train -c oumi://configs/recipes/gpt_oss/sft/20b_lora_single_gpu_train.yaml
# Run local inference on gpt-oss-120b using vLLM
oumi infer -i -c oumi://configs/recipes/gpt_oss/inference/120b_vllm_infer.yaml
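Both checkpoints are downloaded from the Hugging Face Hub on first use, so you can also warm the cache ahead of time with the new oumi cache utility described below. A small sketch, assuming the upstream Hub IDs openai/gpt-oss-20b and openai/gpt-oss-120b:
# Pre-fetch the 20B weights into the local Hugging Face cache, then fine-tune with LoRA
oumi cache get openai/gpt-oss-20b
oumi train -c oumi://configs/recipes/gpt_oss/sft/20b_lora_single_gpu_train.yaml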
⚡ DeepSpeed Support
DeepSpeed is a deep learning optimization library for training large models efficiently, best known for its ZeRO family of memory-saving sharding strategies, CPU/NVMe offloading, and highly configurable mixed-precision and parallelism settings. Oumi now supports DeepSpeed in addition to PyTorch’s native Fully Sharded Data Parallel (FSDP) for distributed training!
Usage Example:
# Train Llama 3.1 8B using DeepSpeed’s ZeRO-3 optimization strategy
oumi train -c oumi://configs/examples/deepspeed/llama3_1_8b_deepspeed_z3_train.yaml
# Combine DeepSpeed with YaRN RoPE scaling to enable training on longer contexts!
# Train Qwen2.5 7B with a 128K-token context length using YaRN and DeepSpeed
oumi train -c oumi://configs/projects/limo/qwen2.5_7b_fft_yarn_deepspeed.yaml
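If you haven’t used ZeRO-3 before: stage 1 shards only optimizer states across data-parallel workers, stage 2 additionally shards gradients, and stage 3 also shards the model parameters themselves, giving the largest memory savings. As a rough sketch of what that means in DeepSpeed’s own terms (this is DeepSpeed’s standard standalone JSON format, not the contents of the Oumi recipe above, which may surface these options through its own YAML):
{
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": { "enabled": true },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
The "auto" values are resolved at runtime when DeepSpeed is driven through a higher-level trainer rather than configured by hand.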
🗄️ CLI Tool for Hugging Face Cache Management
When you use datasets and models from the Hugging Face Hub, it becomes hard over time to keep track of what’s been cached, how much space it’s taking up, and so on. In #1897, @aniruddh-alt added an oumi cache utility to the Oumi CLI. It lets you view, add to, and delete from the local Hugging Face Hub cache, as well as get more information about individual cache entries.
Usage Example:
# View what’s in the cache
oumi cache ls
# Filter for entries containing "llama" and sort by name (quote the pattern so the shell doesn’t expand it)
oumi cache ls -f "*llama*" --sort name
# Download a model to cache
oumi cache get Qwen/Qwen3-0.6B
# View information about the cached model
oumi cache card Qwen/Qwen3-0.6B
# Remove a model from cache
oumi cache rm Qwen/Qwen3-0.6B
🎯 Vision DPO and KTO Support
We’ve added support for two new post-training capabilities: Direct Preference Optimization (DPO) on vision-language models, and Kahneman-Tversky Optimization (KTO). Special thanks to @efsiatras for implementing KTO support in #1538!
Usage Example:
# Vision DPO on Qwen2.5-VL 3B
oumi train -c oumi://configs/recipes/vision/qwen2_5_vl_3b/dpo/train.yaml
# KTO on Phi-3
oumi train -c oumi://configs/recipes/phi3/kto/train.yaml
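The practical difference between the two objectives is the data they consume: DPO learns from paired preferences (a preferred and a rejected response to the same prompt), while KTO only needs individual responses labeled as desirable or undesirable, which is often much easier to collect. A minimal sketch of the two record shapes, using the common TRL-style field names as an assumption (the exact schema expected by the Oumi datasets in the recipes above may differ, and vision DPO examples additionally reference the image(s) mentioned in the prompt):
DPO, one JSONL record with a paired preference:
{"prompt": "Describe this image.", "chosen": "A golden retriever catching a frisbee in a park.", "rejected": "A dog."}
KTO, one JSONL record with a single response and a binary desirability label:
{"prompt": "Is 17 a prime number?", "completion": "Yes, 17 is a prime number.", "label": true}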
🛠️ Developer Experience
- Upgrade several package dependencies to their latest versions
- Additional GGUF, macOS llama.cpp, and remote frontier model inference configs by @penfever in #1923 and #1947
- Add Pre-Populated GitHub Issue Link On Failures by @rlehman221 in #1936
- Add Verbose Flag to Oumi Train by @rlehman221 in #1940
- Enable users to log data samples during training for debugging by @shanghongsim in #1943
New Contributors
- @efsiatras made their first contribution in #1538
- @rlehman221 made their first contribution in #1936
All Contributors
@aniruddh-alt, @efsiatras, @jgreer013, @kaisopos, @oelachqar, @penfever, @rlehman221, @ryan-arman, @shanghongsim, @stefanwebb, @taenin, @wizeng23
Full Changelog: v0.3.0...v0.4.0