sergiopaniego
Member

What does this PR do?

Add the trainers taxonomy to the documentation, including details on inheritance and online support.
I've added the diagram here and if approved, I'll move it to documentation-images before merging.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@qgallouedec

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec
Member

OnlineDPO doesn't inherit from DPO

@sergiopaniego
Member Author

lol updated!

Member

@albertvillanova albertvillanova left a comment

Nice!!

I would just raise one concern about maintainability: every time we add/rename/remove a trainer, this image would need to be regenerated and updated accordingly.

What do you think?

@sergiopaniego
Member Author

I would just raise one concern about maintainability: every time we add/rename/remove a trainer, this image would need to be regenerated and updated accordingly.

I understand your concern and am open to your thoughts. I don't think trainers are added frequently enough for this to be an issue. My intention in adding this taxonomy was to make it easier for beginners to understand the dependencies without having to navigate the code, and to showcase online support via vLLM. Perhaps a simple list like the following could work as well:

(⚡ = online support via vLLM)

  • BCOTrainer
  • CPOTrainer
  • DPOTrainer
  • OnlineDPO ⚡
    • NashMD ⚡
    • XPOTrainer ⚡
  • GRPOTrainer ⚡
  • KTOTrainer
  • ORPOTrainer
  • PPOTrainer
  • PRMTrainer
  • RewardTrainer
  • RLOOTrainer ⚡
  • SFTTrainer
    • GKDTrainer

@albertvillanova
Member

Alternatively, do you know if the Hub supports mermaid?

graph LR
  root[TRL Trainers]

  BCO[BCOTrainer]
  CPO[CPOTrainer]
  DPO[DPOTrainer]
  OnlineDPO[OnlineDPO ⚡]

  %% Group NashMD and XPO together without visible box
  subgraph cluster_online_dpo[ ]
    style cluster_online_dpo fill:none,stroke:none
    NashMD[NashMD ⚡]
    XPO[XPOTrainer ⚡]
  end

  GRPO[GRPOTrainer ⚡]
  KTO[KTOTrainer]
  ORPO[ORPOTrainer]
  PPO[PPOTrainer]
  PRM[PRMTrainer]
  Reward[RewardTrainer]
  RLOO[RLOOTrainer ⚡]
  SFT[SFTTrainer]
  GKD[GKDTrainer]

  root --> BCO
  root --> CPO
  root --> DPO
  root --> OnlineDPO
  OnlineDPO --> NashMD
  OnlineDPO --> XPO
  root --> GRPO
  root --> KTO
  root --> ORPO
  root --> PPO
  root --> PRM
  root --> Reward
  root --> RLOO
  root --> SFT
  SFT --> GKD
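
If a text-based diagram were adopted, its source could be generated from a small mapping instead of being edited by hand. The following is a minimal Python sketch, not an existing TRL utility: `PARENTS`, `ONLINE`, `node_id`, and `to_mermaid` are hypothetical names, and the data simply restates the graph above (it omits the invisible subgraph grouping).

```python
# Each trainer maps to its parent in the diagram; None means it hangs off the root node.
PARENTS = {
    "BCOTrainer": None,
    "CPOTrainer": None,
    "DPOTrainer": None,
    "OnlineDPO": None,
    "NashMD": "OnlineDPO",
    "XPOTrainer": "OnlineDPO",
    "GRPOTrainer": None,
    "KTOTrainer": None,
    "ORPOTrainer": None,
    "PPOTrainer": None,
    "PRMTrainer": None,
    "RewardTrainer": None,
    "RLOOTrainer": None,
    "SFTTrainer": None,
    "GKDTrainer": "SFTTrainer",
}

# Trainers flagged with ⚡ (online support via vLLM) in the graph above.
ONLINE = {"OnlineDPO", "NashMD", "XPOTrainer", "GRPOTrainer", "RLOOTrainer"}


def node_id(name: str) -> str:
    """Derive a short node id by dropping a trailing 'Trainer' (requires Python 3.9+)."""
    return name.removesuffix("Trainer")


def to_mermaid(parents: dict, online: set, root_label: str = "TRL Trainers") -> str:
    """Render the mapping as Mermaid flowchart source: node declarations, then edges."""
    lines = ["graph LR", f"  root[{root_label}]"]
    for name in parents:
        label = f"{name} ⚡" if name in online else name
        lines.append(f"  {node_id(name)}[{label}]")
    for name, parent in parents.items():
        source = "root" if parent is None else node_id(parent)
        lines.append(f"  {source} --> {node_id(name)}")
    return "\n".join(lines)


mermaid_source = to_mermaid(PARENTS, ONLINE)
print(mermaid_source)
```

Adding, renaming, or removing a trainer would then be a one-line change to the mapping, with the diagram text regenerated from it.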

@qgallouedec
Member

qgallouedec commented Oct 6, 2025

For taxonomy, organizing by method style rather than Python inheritance may be more informative. For example:

  • Online methods
    • GRPOTrainer
    • RLOOTrainer
    • ...
  • Offline methods
    • SFTTrainer
    • DPOTrainer
    • ...
  • Reward modeling
    • RewardTrainer
    • PRMTrainer

wdyt?

@qgallouedec
Member

Also, at some point I'd like to see the inheritance from GKD to SFT removed. Once this is done, this "inheritance" taxonomy would become even less informative and would just be a list, with the exception of XPO and NashMD.

@sergiopaniego
Member Author

Alternatively, do you know if the Hub supports mermaid?

afaik, there's no support

updated organizing it based on method style!

@qgallouedec
Member

Better! A few things to fix:

  • PPO is online
  • I'd have a dedicated category for GKD "Knowledge distillation"

@sergiopaniego
Member Author

updated!

Member

@qgallouedec qgallouedec left a comment

Cool, thanks!

@sergiopaniego sergiopaniego merged commit 452284b into main Oct 7, 2025
3 checks passed
@sergiopaniego sergiopaniego deleted the taxonomy_diagram branch October 7, 2025 14:06
4 participants