Chinese Homepage | Tutorial | FAQ

Trinity-RFT

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models


đź’ˇ What is Trinity-RFT?

Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It decouples the RFT process into three key components: Explorer, Trainer, and Buffer, and provides functionalities for users with different backgrounds and objectives:

  • 🤖 For agent application developers. [tutorial]

  • đź§  For RL algorithm researchers. [tutorial]

    • Design and validate new reinforcement learning algorithms using compact, plug-and-play modules.
    • Example: Mixture of SFT and GRPO
  • 📊 For data engineers. [tutorial]

    • Create task-specific datasets and build data pipelines for cleaning, augmentation, and human-in-the-loop scenarios.
    • Example: Data Processing

🌟 Key Features

  • Flexible RFT Modes:

    • Supports synchronous/asynchronous, on-policy/off-policy, and online/offline training. Rollout and training can run separately and scale independently across devices.
    [Figure: RFT modes supported by Trinity-RFT]
  • General Agentic-RL Support:

    • Supports both concatenated and general multi-turn agentic workflows, and can directly train agent applications developed with agent frameworks such as AgentScope.
    [Figure: Agentic workflows]
  • Full Lifecycle Data Pipelines:

    • Enables pipeline processing of rollout and experience data, supporting active management (prioritization, cleaning, augmentation) throughout the RFT lifecycle.
    [Figure: Data pipeline design]
  • User-Friendly Design:

    • Modular, decoupled architecture for easy adoption and development. Rich graphical user interfaces enable low-code usage.
    [Figure: System architecture]

🚀 News

  • [2025-09] ✨ [Release Notes] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
  • [2025-08] 🎵 Introducing CHORD: dynamic SFT + RL integration for advanced LLM fine-tuning (paper).
  • [2025-08] [Release Notes] Trinity-RFT v0.2.1 released.
  • [2025-07] [Release Notes] Trinity-RFT v0.2.0 released.
  • [2025-07] Technical report (arXiv v2) updated with new features, examples, and experiments: link.
  • [2025-06] [Release Notes] Trinity-RFT v0.1.1 released.
  • [2025-05] [Release Notes] Trinity-RFT v0.1.0 released, plus technical report.
  • [2025-04] Trinity-RFT open sourced.

Quick Start

Note

This project is currently under active development. Comments and suggestions are welcome!

Step 1: installation

Before installing, make sure your system meets the following requirements:

  • Python: version 3.10 to 3.12 (inclusive)
  • CUDA: version 12.4 to 12.8 (inclusive)
  • GPUs: at least 2
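
A quick way to sanity-check these prerequisites from a shell (assuming the NVIDIA driver and CUDA toolkit are installed, so nvidia-smi and nvcc are on your PATH):

python3 --version    # expect 3.10–3.12
nvcc --version       # CUDA toolkit version, expect 12.4–12.8
nvidia-smi --query-gpu=name --format=csv,noheader | wc -l    # GPU count, expect 2 or more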

From Source (Recommended)

If you plan to customize or contribute to Trinity-RFT, this is the best option.

1. Clone the Repository
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT
2. Set Up a Virtual Environment

Choose one of the following options:

Using Conda
conda create -n trinity python=3.10
conda activate trinity

pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
Using venv
python3.10 -m venv .venv
source .venv/bin/activate

pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
Using uv

uv is a modern Python package installer.

uv sync --extra dev --extra flash_attn
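
Whichever source-install option you choose, you can check that the install succeeded by invoking the trinity command-line entry point (the --help flag is an assumption here, following the usual CLI convention):

trinity --help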

Via PyPI

If you just want to use the package without modifying the code:

pip install trinity-rft==0.3.0
pip install flash-attn==2.8.1
# as above, if flash-attn fails to build, try: pip install flash-attn==2.8.1 --no-build-isolation

Or with uv:

uv pip install trinity-rft==0.3.0
uv pip install flash-attn==2.8.1

Using Docker

We provide a Docker setup for hassle-free environment configuration.

git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# Build the Docker image
## Tip: You can modify the Dockerfile to add mirrors or set API keys
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

# Run the container, replacing <path_to_your_data_and_checkpoints> with your actual path
docker run -it \
  --gpus all \
  --shm-size="64g" \
  --rm \
  -v $PWD:/workspace \
  -v <path_to_your_data_and_checkpoints>:/data \
  trinity-rft:latest
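
Once inside the container, it is worth confirming that the GPUs passed through by --gpus all are visible:

nvidia-smi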

For training with Megatron-LM, please refer to Megatron-LM Backend.

Step 2: prepare dataset and model

Trinity-RFT supports most datasets and models from Hugging Face and ModelScope.

Prepare the model in the local directory $MODEL_PATH/{model_name}:

# Using Hugging Face
huggingface-cli download {model_name} --local-dir $MODEL_PATH/{model_name}

# Using ModelScope
modelscope download {model_name} --local_dir $MODEL_PATH/{model_name}

For more details about model downloading, see Hugging Face or ModelScope.

Prepare the dataset in the local directory $DATASET_PATH/{dataset_name}:

# Using Hugging Face
huggingface-cli download {dataset_name} --repo-type dataset --local-dir $DATASET_PATH/{dataset_name}

# Using ModelScope
modelscope download --dataset {dataset_name} --local_dir $DATASET_PATH/{dataset_name}

For more details about dataset downloading, see Hugging Face or ModelScope.
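
As a concrete example, the GSM8K run in Step 4 uses Qwen2.5-1.5B-Instruct; assuming you download from the public Hugging Face repositories (the repository IDs shown are theirs at the time of writing, and the local directory names are arbitrary), the commands look like:

huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct --local-dir $MODEL_PATH/Qwen2.5-1.5B-Instruct
huggingface-cli download openai/gsm8k --repo-type dataset --local-dir $DATASET_PATH/gsm8k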

Step 3: configurations

Trinity-RFT provides a web interface for configuring your RFT process.

Note

This is an experimental feature, and we will continue to improve it.

To launch the web interface for a minimal configuration, run:

trinity studio --port 8080

You can then configure your RFT process on the web page and generate a config file, which you can save for later use or run directly as described in the next section.

Advanced users can also edit the config file directly. We provide example config files in examples.
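
A common starting point is to copy one of the provided configs and adapt it; for instance, with the GSM8K example used later in this guide (my_gsm8k.yaml is an arbitrary name):

cp examples/grpo_gsm8k/gsm8k.yaml my_gsm8k.yaml
# edit my_gsm8k.yaml so its model and dataset paths point at $MODEL_PATH and $DATASET_PATH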

For complete GUI features, please refer to the monorepo for Trinity-Studio.

Example: config manager GUI


Step 4: run the RFT process

Start a Ray cluster:

# On master node
ray start --head

# On worker nodes (the head node prints the exact address to join, typically <master_ip>:6379)
ray start --address=<master_address>
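
To confirm that all nodes have joined, ray status run on any node summarizes the cluster's nodes and resources:

ray status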

(Optional) Log in to Weights & Biases (wandb) for better monitoring:

export WANDB_API_KEY=<your_api_key>
wandb login

For command-line users, run the RFT process:

trinity run --config <config_path>

For example, the following command fine-tunes Qwen2.5-1.5B-Instruct on GSM8K with GRPO:

trinity run --config examples/grpo_gsm8k/gsm8k.yaml

For studio users, click "Run" in the web interface.

Further tutorials

Note

For more tutorials, please refer to the Trinity-RFT Documentation.

Tutorials for running different RFT modes:

Tutorials for adapting Trinity-RFT to multi-step agentic scenarios:

Tutorials for data-related functionalities:

Tutorials for RL algorithm development/research with Trinity-RFT:

Guidelines for full configurations:

Guidelines for developers and researchers:

Upcoming features

A tentative roadmap: #51

Contribution guide

This project is currently under active development, and we welcome contributions from the community!

See CONTRIBUTING.md for detailed contribution guidelines.

Acknowledgements

This project is built upon many excellent open-source projects, including:

Citation

@misc{trinity-rft,
      title={Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models},
      author={Xuchen Pan and Yanxi Chen and Yushuo Chen and Yuchang Sun and Daoyuan Chen and Wenhao Zhang and Yuexiang Xie and Yilun Huang and Yilei Zhang and Dawei Gao and Yaliang Li and Bolin Ding and Jingren Zhou},
      year={2025},
      eprint={2505.17826},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.17826},
}
