Add trainers taxonomy to docs #4195

sergiopaniego · 2025-10-02T10:39:35Z

What does this PR do?

Add the trainers taxonomy to the documentation, including details on inheritance and online support.
I've added the diagram here and, if approved, I'll move it to documentation-images before merging.

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes?
- Did you write any new necessary tests?

Who can review?

@qgallouedec

HuggingFaceDocBuilderDev · 2025-10-02T10:43:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2025-10-02T13:08:04Z

OnlineDPO doesn't inherit from DPO

sergiopaniego · 2025-10-02T13:45:59Z

lol updated!

albertvillanova

Nice!!

I would just raise one concern about maintainability: every time we add/rename/remove a trainer, this image would need to be regenerated and updated accordingly.

What do you think?

sergiopaniego · 2025-10-03T12:59:32Z

> I would just raise one concern about maintainability: every time we add/rename/remove a trainer, this image would need to be regenerated and updated accordingly.

I understand your concern and am open to your thoughts. I don't think trainers are added frequently enough for this to be an issue. My intention in adding this taxonomy was to make it easier for beginners to understand the dependencies without having to navigate the code, and to showcase online support via vLLM. Perhaps a simple list like the following could work as well:

(⚡ = online support via vLLM)

BCOTrainer
CPOTrainer
DPOTrainer
OnlineDPO ⚡
- NashMD ⚡
- XPOTrainer ⚡
GRPOTrainer ⚡
KTOTrainer
ORPOTrainer
PPOTrainer
PRMTrainer
RewardTrainer
RLOOTrainer ⚡
SFTTrainer
- GKDTrainer
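
To illustrate how a list like the one above could be kept in sync without redrawing an image, here is a minimal sketch that renders it from a small mapping. The `PARENTS` and `VLLM` structures and the `render_taxonomy` helper are hypothetical names made up for this example; they are not part of TRL, and the data is copied by hand from the list above. It only handles one level of nesting, which is all the list needs.

```python
# Hypothetical data copied from the list above: trainer -> parent (None = top level).
PARENTS = {
    "BCOTrainer": None,
    "CPOTrainer": None,
    "DPOTrainer": None,
    "OnlineDPO": None,
    "NashMD": "OnlineDPO",
    "XPOTrainer": "OnlineDPO",
    "GRPOTrainer": None,
    "KTOTrainer": None,
    "ORPOTrainer": None,
    "PPOTrainer": None,
    "PRMTrainer": None,
    "RewardTrainer": None,
    "RLOOTrainer": None,
    "SFTTrainer": None,
    "GKDTrainer": "SFTTrainer",
}
# Trainers marked with the lightning bolt (online support via vLLM) in the list.
VLLM = {"OnlineDPO", "NashMD", "XPOTrainer", "GRPOTrainer", "RLOOTrainer"}


def render_taxonomy(parents, vllm):
    """Render top-level trainers, each followed by its direct children."""

    def label(name):
        return name + (" ⚡" if name in vllm else "")

    lines = []
    for name, parent in parents.items():
        if parent is not None:
            continue  # children are emitted under their parent below
        lines.append(label(name))
        for child, child_parent in parents.items():
            if child_parent == name:
                lines.append("- " + label(child))
    return "\n".join(lines)


print(render_taxonomy(PARENTS, VLLM))
```

Adding, renaming, or removing a trainer then means editing one entry in the mapping rather than regenerating a diagram.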

albertvillanova · 2025-10-04T07:42:07Z

Alternatively, do you know if the Hub supports mermaid?

graph LR
  root[TRL Trainers]

  BCO[BCOTrainer]
  CPO[CPOTrainer]
  DPO[DPOTrainer]
  OnlineDPO[OnlineDPO ⚡]

  %% Group NashMD and XPO together without visible box
  subgraph cluster_online_dpo[ ]
    style cluster_online_dpo fill:none,stroke:none
    NashMD[NashMD ⚡]
    XPO[XPOTrainer ⚡]
  end

  GRPO[GRPOTrainer ⚡]
  KTO[KTOTrainer]
  ORPO[ORPOTrainer]
  PPO[PPOTrainer]
  PRM[PRMTrainer]
  Reward[RewardTrainer]
  RLOO[RLOOTrainer ⚡]
  SFT[SFTTrainer]
  GKD[GKDTrainer]

  root --> BCO
  root --> CPO
  root --> DPO
  root --> OnlineDPO
  OnlineDPO --> NashMD
  OnlineDPO --> XPO
  root --> GRPO
  root --> KTO
  root --> ORPO
  root --> PPO
  root --> PRM
  root --> Reward
  root --> RLOO
  root --> SFT
  SFT --> GKD

qgallouedec · 2025-10-06T02:27:39Z

For taxonomy, organizing by method style rather than Python inheritance may be more informative. For example:

Online methods
- GRPOTrainer
- RLOOTrainer
- ...
Offline methods
- SFTTrainer
- DPOTrainer
- ...
Reward modeling
- RewardTrainer
- PRMTrainer

wdyt?

qgallouedec · 2025-10-06T02:31:41Z

Also, at some point I'd like to see the inheritance from GKD to SFT removed. Once this is done, this "inheritance" taxonomy would become even less informative and would just be a list, with the exception of XPO and NashMD.

sergiopaniego · 2025-10-06T10:39:11Z

> Alternatively, do you know if the Hub supports mermaid?

afaik, there's no support

updated organizing it based on method style!

qgallouedec · 2025-10-06T14:04:06Z

Better! A few things to fix:

- PPO is online
- I'd have a dedicated category for GKD: "Knowledge distillation"
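
As a sketch of what the method-style grouping might look like with these two fixes applied, the snippet below keeps the categories in a plain mapping and renders them in the same list style used earlier in the thread. `CATEGORIES` and `render_by_style` are hypothetical names for this example, not TRL code, and only the trainers explicitly named in this thread are included; the ones elided with "..." above are intentionally left out.

```python
# Hypothetical grouping built only from trainers named in this thread.
CATEGORIES = {
    "Online methods": ["GRPOTrainer", "RLOOTrainer", "PPOTrainer"],
    "Offline methods": ["SFTTrainer", "DPOTrainer"],
    "Reward modeling": ["RewardTrainer", "PRMTrainer"],
    "Knowledge distillation": ["GKDTrainer"],
}


def render_by_style(categories):
    """Render each category followed by its trainers; reject duplicates."""
    seen = {}
    lines = []
    for category, trainers in categories.items():
        lines.append(category)
        for trainer in trainers:
            if trainer in seen:
                # A trainer listed twice would make the taxonomy ambiguous.
                raise ValueError(
                    f"{trainer} appears in both {seen[trainer]!r} and {category!r}"
                )
            seen[trainer] = category
            lines.append("- " + trainer)
    return "\n".join(lines)


print(render_by_style(CATEGORIES))
```

The duplicate check is a design choice for this sketch: a flat, one-category-per-trainer table is easy to read, so a trainer that shows up twice is treated as an error instead of being silently listed in both places.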

sergiopaniego · 2025-10-07T10:52:33Z

updated!

qgallouedec

Cool, thanks!

Add trainers taxonomy to docs

d1e8890

sergiopaniego added 2 commits October 2, 2025 15:41

Updated

01f38f8

Updated with background

c3bebe4

albertvillanova reviewed Oct 3, 2025

sergiopaniego added 7 commits October 6, 2025 11:58

Taxonomy by method style

195289a

Linked

09ce30a

As table

c0310f9

Update style

92bedbc

Merge branch 'main' into taxonomy_diagram

16577f6

Updated

b745a0d

Merge branch 'taxonomy_diagram' of github.com:huggingface/trl into ta…

06c79ab

…xonomy_diagram

sergiopaniego added 3 commits October 7, 2025 12:31

Merge branch 'main' into taxonomy_diagram

7ee6967

Updated

92fbe23

Updated

c2545f6

qgallouedec approved these changes Oct 7, 2025

sergiopaniego merged commit 452284b into main Oct 7, 2025
3 checks passed

sergiopaniego deleted the taxonomy_diagram branch October 7, 2025 14:06
