[Integration Proposal] Add RapidFire AI to TRL docs for concurrent multi-config training (16–24× throughput) #4351
kamran-rapidfireAI started this conversation in Ideas
Replies: 1 comment
Hello @qgallouedec, on behalf of the RapidFire AI team.
👋 Introduction
Hi TRL maintainers and community!
We'd like to propose adding RapidFire AI to TRL's integrations documentation. RapidFire AI is a hyperparallelized experiment execution framework that significantly enhances the TRL training experience.
Before submitting a formal PR, we wanted to get feedback from the community and maintainers on this proposal.
🚀 What is RapidFire AI?
RapidFire AI is an open-source experiment execution framework that enables concurrent training of multiple TRL configurations on the same GPU(s) through intelligent chunk-based scheduling.
Key Benefits for TRL Users:
- Production-Ready: Already used in production environments with complete working examples.
🎯 Why This Integration Matters
Problem It Solves
When fine-tuning or post-training with TRL, AI developers often need to compare many configurations: different hyperparameters, LoRA adapters, prompt formats, and ablation variants.
Current approach: train each configuration one after another, which is slow and inefficient.
With RapidFire AI: train all configurations in one go, even on a single GPU, for a 16-24× faster process.
How It Works
RapidFire AI employs adaptive chunk-based scheduling: the training data is processed in chunks, and the configurations being compared take turns on the same GPU(s) chunk by chunk (see the toy sketch below).
This enables:
- Concurrent training of many configurations, even on a single GPU
- Early, like-for-like comparison of all configurations on incremental results
- Full control over runs in flight
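As a toy illustration only (this is not RapidFire AI's actual implementation), the scheduling idea can be sketched as a round-robin loop over data chunks, which is why every configuration produces comparable incremental metrics early instead of finishing one after another:

```python
# Toy sketch of chunk-based, round-robin scheduling across configurations.
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of `size` items from `iterable`."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

def round_robin_schedule(configs, dataset, chunk_size):
    """Interleave training: every config sees chunk k before any config sees chunk k+1."""
    for k, chunk in enumerate(chunked(dataset, chunk_size)):
        for cfg in configs:
            # In a real system, this is where the model/optimizer state for `cfg`
            # would be brought onto the GPU and trained on `chunk`.
            yield (k, cfg, len(chunk))

for step in round_robin_schedule(["cfg_a", "cfg_b", "cfg_c"], range(10), chunk_size=4):
    print(step)
```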
📝 Proposed Documentation
We have prepared comprehensive integration documentation that includes:
1. Quick Start Example
Complete working example showing how to train 4 SFT configurations in one go (a minimal illustrative sketch is shown below).
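A minimal sketch of what such a quick start could look like. The four configurations are built with TRL's own SFTConfig for illustration; in the proposed integration they would be RFSFTConfig drop-in analogs. The dataset, model name, and the commented-out Experiment/run_fit launcher are placeholders and assumptions, not confirmed RapidFire AI API; please refer to the official RapidFire AI docs for the real calls.

```python
# Hedged sketch: define 4 SFT configurations as a small 2x2 grid and hand them
# to a single concurrent launch. Configs use TRL's SFTConfig for illustration;
# the launcher names below are placeholders, not confirmed RapidFire AI API.
from datasets import load_dataset
from trl import SFTConfig

train_dataset = load_dataset("trl-lib/Capybara", split="train")

# 2 learning rates x 2 packing settings = 4 configurations trained "in one go".
configs = [
    SFTConfig(
        output_dir=f"sft-lr{lr}-packing{int(packing)}",
        learning_rate=lr,
        packing=packing,
        per_device_train_batch_size=4,
        num_train_epochs=1,
    )
    for lr in (2e-5, 2e-4)
    for packing in (False, True)
]

# Placeholder launch (names assumed; see the RapidFire AI docs for the real API):
# experiment = Experiment(name="sft-quickstart")
# experiment.run_fit(configs, model="Qwen/Qwen2.5-0.5B-Instruct", train_dataset=train_dataset)
```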
2. Coverage for Most Popular TRL Trainers
- RFSFTConfig: Customer support Q&A chatbot use case
- RFDPOConfig: Preference alignment use case
- RFGRPOConfig: Math reasoning use case
3. Advanced Features
4. Performance Benchmarks
Real measurements showing 16-24× higher experimentation throughput (time to reach comparable accuracy) across different scenarios.
5. Troubleshooting & Best Practices
Common issues, solutions, and optimization tips.
🔗 Resources
🤔 Questions for the Community
Before submitting the PR, we'd love to get feedback on:
Do you think this integration is valuable to TRL users? The ability to quickly compare multiple configs in one go, even on limited GPUs, is potentially very useful for hyperparameter tuning, adapter tuning, prompt tuning, and ablation studies. Unlike full-blown task-parallel execution engines such as Weights & Biases or Ray Tune, RapidFire AI surfaces all results much sooner and offers full control over runs in flight.
Is the documentation approach appropriate? We have modeled the integration documentation after existing TRL integrations such as Unsloth, DeepSpeed, and vLLM, including working example notebooks for all three TRL trainer use cases.
What additional information would be helpful? Are there specific additional use cases, examples, or documentation that would make this integration more valuable to the TRL community?
Any concerns about the integration? We want to make sure that RapidFire AI complements and empowers the TRL user and developer community rather than adding needless complexity.
📊 Example Use Case
Here is a concrete scenario:
Goal: Fine-tune an open LLM for a customer support Q&A chatbot with SFT on private in-house data.
Traditional approach: train one configuration at a time, wait for each run to finish, compare results at the end, then repeat with adjusted settings.
One possible sequence with RapidFire AI: define several candidate configurations up front, launch them together on the available GPU(s), compare their incremental metrics as chunks complete, and keep control over the runs in flight (see the sketch below).
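For the chatbot use case, one natural sweep is over LoRA adapter size. The sketch below uses the real peft.LoraConfig and trl.SFTConfig classes; the specific ranks and hyperparameters are illustrative assumptions, and the concurrent launch step (omitted here) is what the proposed RapidFire AI integration would provide, as in the quick-start sketch above.

```python
# Hedged sketch: compare three LoRA adapter sizes for the support chatbot in one go.
from peft import LoraConfig
from trl import SFTConfig

candidate_runs = []
for rank in (8, 16, 32):
    candidate_runs.append(
        {
            "peft_config": LoraConfig(r=rank, lora_alpha=2 * rank, task_type="CAUSAL_LM"),
            "training_args": SFTConfig(
                output_dir=f"support-chatbot-lora-r{rank}",
                learning_rate=2e-4,
                num_train_epochs=2,
            ),
        }
    )

# All three candidates would then be launched together and compared on
# incremental eval metrics, instead of waiting for three sequential runs.
```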
✅ What We Are Proposing
If the TRL community and maintainers agree the above is valuable, we'd like to:
- Add docs/source/rapidfire_integration.md
- Update docs/source/_toctree.yml to include RapidFire AI in the Integrations section
The documentation is already prepared and ready for review.
🙏 Looking Forward to Your Feedback
We believe this integration will significantly improve the TRL experience for both AI researchers and practitioners customizing open LLMs from the Hugging Face Hub on their own data for bespoke use cases. We'd like to ensure it aligns with TRL's vision and adds real value to the community.
Thanks for considering this proposal! 🚀
Note: We are ready to submit the PR with the documentation and iterate based on your feedback. We are also committed to maintaining this integration, documenting new features as RapidFire AI's functionality expands, and keeping it up to date with new TRL releases.