diff --git a/docs/source/index.md b/docs/source/index.md
index 9aa1a0d79c..469033778e 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -7,6 +7,44 @@
 TRL is a full stack library where we provide a set of tools to train transformer language models with methods like Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), Reward Modeling, and more. The library is integrated with 🤗 [transformers](https://github.com/huggingface/transformers).
 
+Below is the current list of TRL trainers, organized by method type (⚡️ = vLLM support).
+
+**Online methods**
+- [`GRPOTrainer`] ⚡️
+- [`RLOOTrainer`] ⚡️
+- [`OnlineDPOTrainer`] ⚡️
+- [`NashMDTrainer`] ⚡️
+- [`XPOTrainer`] ⚡️
+- [`PPOTrainer`]
+
+**Reward modeling**
+- [`PRMTrainer`]
+- [`RewardTrainer`]
+
+**Offline methods**
+- [`SFTTrainer`]
+- [`DPOTrainer`]
+- [`ORPOTrainer`]
+- [`BCOTrainer`]
+- [`CPOTrainer`]
+- [`KTOTrainer`]
+
+**Knowledge distillation**
+- [`GKDTrainer`]
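+All of these trainers expose a similar, `transformers`-style API. As a rough sketch of what online training with [`GRPOTrainer`] looks like (the model, dataset, and reward function below are placeholders; argument names may differ slightly between TRL versions, so check each trainer's page for details):
+
+```python
+from datasets import load_dataset
+from trl import GRPOConfig, GRPOTrainer
+
+# Placeholder prompt-only dataset; any dataset with a "prompt" column works.
+dataset = load_dataset("trl-lib/tldr", split="train")
+
+# Toy reward: prefer completions close to 20 characters.
+def reward_len(completions, **kwargs):
+    return [-abs(20 - len(completion)) for completion in completions]
+
+training_args = GRPOConfig(
+    output_dir="Qwen2-0.5B-GRPO",
+    use_vllm=True,  # ⚡️ trainers can offload generation to vLLM (requires vLLM to be installed)
+)
+trainer = GRPOTrainer(
+    model="Qwen/Qwen2-0.5B-Instruct",  # placeholder model
+    reward_funcs=reward_len,
+    args=training_args,
+    train_dataset=dataset,
+)
+trainer.train()
+```
+
+Offline trainers such as [`SFTTrainer`] or [`DPOTrainer`] follow the same pattern, replacing the reward function with a labeled dataset and the matching config class ([`SFTConfig`], [`DPOConfig`]).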
+
 
 ## 🎉 What's New
 
 **✨ OpenAI GPT OSS Support**: TRL now fully supports fine-tuning the latest [OpenAI GPT OSS models](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4)! Check out the: