Feature Request: Per-pixel segmentation output from GPU renderer #1237

@pthangeda

Description

The current rendering pipeline outputs RGB and depth per pixel, but there is no way to obtain per-pixel object identity. Per-pixel identity is needed for many use cases, such as vision-based RL and sim-to-real transfer.

Proposed Solution

Add per-pixel geom ID output to the render megakernel. The renderer already computes geom_id for every ray hit, so it only needs an additional output buffer to write it to.

Output format:

  • int32 per pixel
  • >= 0 for rigid geom IDs (MuJoCo geom index)
  • -1 for background (ray misses all geometry)
  • -2 for flex bodies
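To illustrate how a consumer might interpret these values, here is a minimal sketch in plain Python. It is not part of the PR: `classify_pixel`, `geom_mask`, and the nested-list image are hypothetical names and data used only for illustration, and a real buffer would be an array rather than a list.

```python
from collections import Counter

# Sentinel values taken from the proposed output format above.
BACKGROUND = -1  # ray missed all geometry
FLEX = -2        # ray hit a flex body


def classify_pixel(seg_value):
    """Map one segmentation value to a coarse label (hypothetical helper)."""
    if seg_value >= 0:
        return "geom"  # value is a MuJoCo geom index
    if seg_value == BACKGROUND:
        return "background"
    if seg_value == FLEX:
        return "flex"
    raise ValueError(f"unexpected segmentation value: {seg_value}")


def geom_mask(seg_image, geom_id):
    """Return a boolean mask (list of rows) selecting one rigid geom."""
    return [[value == geom_id for value in row] for row in seg_image]


# Hypothetical 2x3 segmentation image, written as nested lists.
seg_image = [
    [-1, 0, 0],
    [-2, 3, -1],
]

label_counts = Counter(classify_pixel(v) for row in seg_image for v in row)
mask_for_geom_0 = geom_mask(seg_image, 0)
```

Keeping the sentinels negative means any value `>= 0` can be used directly as an index into per-geom arrays without a separate validity flag.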

API additions:

  • render_seg parameter on create_render_context() (same pattern as render_rgb/render_depth)
  • seg_data, seg_adr, render_seg fields on RenderContext
  • get_segmentation() utility function to extract per-camera segmentation
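For a sense of what per-camera extraction could look like, here is a rough sketch. It assumes that seg_data is a flat buffer for a single world, that seg_adr holds each camera's start offset into it, and that pixels are stored row-major. None of this is confirmed by the issue. `extract_camera_segmentation` is a hypothetical stand-in, not the signature of the proposed get_segmentation().

```python
def extract_camera_segmentation(seg_data, seg_adr, cam_id, width, height):
    """Slice one camera's pixels out of a flat buffer and reshape to rows.

    Assumed layout (not confirmed by the issue): seg_adr[cam_id] is the
    offset of the camera's first pixel, and pixels are stored row-major.
    """
    start = seg_adr[cam_id]
    end = start + width * height
    if end > len(seg_data):
        raise ValueError("camera region extends past the end of seg_data")
    flat = seg_data[start:end]
    # Split the flat pixel run into `height` rows of `width` values each.
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Two hypothetical cameras sharing one buffer: 2x2 pixels, then 3x1 pixels.
seg_data = [0, 0, -1, 1, -2, 5, -1]
seg_adr = [0, 4]

cam0 = extract_camera_segmentation(seg_data, seg_adr, 0, width=2, height=2)
cam1 = extract_camera_segmentation(seg_data, seg_adr, 1, width=3, height=1)
```

An offset table like this would let cameras with different resolutions share one allocation, which is presumably why seg_adr is proposed alongside seg_data.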

When render_seg is disabled, this approach allocates no extra buffers and performs no additional kernel writes. When it is enabled, the overhead is negligible: one int32 write per pixel alongside the existing RGB/depth writes.

PR here: #1236
