Questions about unusual hyperparameter values (gamma, clip_param, learning_rate) #2

@Jinghan11

Hi, and thanks for the great project!

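I noticed that three hyperparameters in the PPO config (clip_param, learning_rate, gamma) are set to unusually specific values, with what look like earlier values left in the trailing comments. Could you explain the rationale behind them?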

algorithm = RslRlPpoAlgorithmCfg(
    value_loss_coef=1.0,
    use_clipped_value_loss=True,
    clip_param=0.12143,  # 0.2,
    entropy_coef=0.01,
    num_learning_epochs=5,
    num_mini_batches=4,
    learning_rate=0.00041,  # 1.e-5,
    schedule="adaptive",
    gamma=0.9751,  # 0.99,
    lam=0.95,
    desired_kl=0.01,
    max_grad_norm=1.,
)
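
To make the question concrete, here is a rough sketch of what the configured values imply next to the commented-out ones. It assumes the trailing comments (0.2, 1.e-5, 0.99) are the previous settings, which is my guess and not something stated in the repo. The `1 / (1 - gamma)` "effective horizon" is only a common rule of thumb, and the helper names are hypothetical.

```python
# Rough comparison of the configured values with the commented-out ones.
# Assumption: the trailing comments in the config are the earlier settings.

def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb number of steps over which discounted rewards still matter."""
    return 1.0 / (1.0 - gamma)


def clip_range(clip_param: float) -> tuple:
    """Interval to which PPO clips the policy probability ratio."""
    return (1.0 - clip_param, 1.0 + clip_param)


configured = {"gamma": 0.9751, "clip_param": 0.12143, "learning_rate": 0.00041}
commented = {"gamma": 0.99, "clip_param": 0.2, "learning_rate": 1e-5}

for name, cfg in (("configured", configured), ("commented", commented)):
    low, high = clip_range(cfg["clip_param"])
    print(
        f"{name}: horizon ~{effective_horizon(cfg['gamma']):.1f} steps, "
        f"ratio clipped to [{low:.5f}, {high:.5f}], "
        f"learning_rate={cfg['learning_rate']}"
    )
```

By this estimate, gamma=0.9751 gives a horizon of roughly 40 steps versus roughly 100 for 0.99, the clip range is noticeably tighter, and the initial learning rate is about 41 times larger, so the configured values do not look like small tweaks.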

Thanks!
