PPO Complex Obs/Action Space #353

@ttumiel

Problem Description

Would it be useful to add a variant of the PPO algorithm that supports complex (nested/dictionary) action and observation spaces? I implemented this for MineRL and wondered whether it would be worth contributing to the main library. I'd happily open a PR.

Current Behavior

Currently, PPO supports only continuous or discrete actions (not a mix of both) and a single array observation.

Expected Behavior

PPO supports arbitrarily complex (nested) action and observation spaces.

Possible Solution

  • Use tree to map over actions and observations.
  • Store arrays in the same structure as the observation space, or flatten them for storage and unflatten them when passing them to the network.
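
To make the two options above concrete, here is a minimal sketch of mapping over a nested observation and of the flatten/unflatten round trip. It assumes observations are nested dicts whose leaves are plain lists of floats; `tree_map`, `flatten`, and `unflatten` are hypothetical helpers written for this illustration, not the library's API, and a real implementation would likely use an existing tree utility instead.

```python
from typing import Any, Callable, List, Tuple


def tree_map(fn: Callable[[Any], Any], tree: Any) -> Any:
    """Apply fn to every leaf of a nested dict, preserving the structure."""
    if isinstance(tree, dict):
        return {key: tree_map(fn, value) for key, value in tree.items()}
    return fn(tree)


def flatten(tree: Any) -> Tuple[List[Any], Any]:
    """Return (leaves, structure).

    Leaves are collected in sorted-key order. The structure mirrors the
    input tree, with None standing in for each leaf.
    """
    if isinstance(tree, dict):
        leaves: List[Any] = []
        structure = {}
        for key in sorted(tree):
            sub_leaves, sub_structure = flatten(tree[key])
            leaves.extend(sub_leaves)
            structure[key] = sub_structure
        return leaves, structure
    return [tree], None


def unflatten(structure: Any, leaves: List[Any]) -> Any:
    """Rebuild a nested dict from a structure and a flat list of leaves."""
    remaining = iter(leaves)

    def build(node: Any) -> Any:
        if isinstance(node, dict):
            # Visit keys in the same sorted order used by flatten.
            return {key: build(node[key]) for key in sorted(node)}
        return next(remaining)

    return build(structure)


# Hypothetical MineRL-like observation, for illustration only.
obs = {"inventory": {"dirt": [3.0], "wood": [1.0]}, "pov": [0.1, 0.2]}

# Option 1: keep the nested structure and map over it.
scaled = tree_map(lambda leaf: [2 * x for x in leaf], obs)

# Option 2: flatten for storage, unflatten before the network call.
leaves, structure = flatten(obs)
restored = unflatten(structure, leaves)
```

Keeping the structure separate from the leaves means a rollout buffer only has to store a flat list of arrays, while the network still receives observations in the shape of the original space.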
