### 📚 The doc issue
I need help extracting the following values from an RLHF-PPO implementation: the total loss, the PPO (policy) loss, the per-step rewards, and the per-step returns.
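
To make the request concrete, here is a minimal sketch of the four quantities as they are commonly defined in RLHF-PPO. It is not taken from any particular library: the function names, `kl_coef`, `gamma`, `clip_eps`, and `vf_coef` are all hypothetical, and the real implementation may differ (for example, by computing returns with GAE as advantages plus values rather than as a plain discounted sum).

```python
import math

def per_step_rewards(score, logp_policy, logp_ref, kl_coef=0.1):
    """Assumed convention: a per-token KL penalty, with the scalar
    reward-model score added on the final token only."""
    rewards = [-kl_coef * (lp - lr) for lp, lr in zip(logp_policy, logp_ref)]
    rewards[-1] += score
    return rewards

def per_step_returns(rewards, gamma=1.0):
    """Discounted returns G_t = r_t + gamma * G_{t+1}, computed backwards.
    Assumption: plain discounted sum, not GAE."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def ppo_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate policy loss (the negated PPO objective)."""
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += -min(ratio * adv, clipped * adv)
    return total / len(advantages)

def value_loss(values, returns):
    """Mean squared error between value predictions and returns, halved."""
    return 0.5 * sum((v - g) ** 2 for v, g in zip(values, returns)) / len(returns)

def total_loss(policy_loss, v_loss, vf_coef=0.5):
    """Assumed combination: policy loss plus a weighted value loss."""
    return policy_loss + vf_coef * v_loss
```

What I am looking for is where the equivalents of these values live in the actual training loop, so that I can log them at every step.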
### Suggest a potential alternative/fix
_No response_