
Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface

Yujie Zhao1,2*, Hongwei Fan1,2*, Di Chen3, Shengcong Chen3, Liliang Chen3, Xiaoqi Li1,2, Guanghui Ren3, Hao Dong1,2

1CFCS, School of Computer Science, Peking University, 2PKU-AgiBot Lab, 3AgiBot

(* indicates equal contribution)
CVPR 2026

This repository contains the official authors' implementation of the paper "Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface".

(Framework overview figure)

📢 News

  • Mar 10, 2026: We released the code and the model weights.
  • Feb 21, 2026: Real2Edit2Real has been accepted by CVPR 2026. 🥳🥳
  • Dec 22, 2025: We released the arXiv paper and the demo of Real2Edit2Real.

🛠️ Installation

git clone --recurse-submodules https://github.com/Real2Edit2Real/Real2Edit2Real.git

cd Real2Edit2Real
conda create -y -n r2e2r python=3.10
conda activate r2e2r
conda install -y nvidia/label/cuda-12.1.0::cuda-toolkit -c nvidia/label/cuda-12.1.0
conda install -y -c conda-forge gxx_linux-64=11.4 gcc_linux-64=11.4 aria2
bash scripts/installation/1_install_env.sh
bash scripts/installation/2_install_curobo.sh
# Set this flag if you experience slow download speeds:
# export USE_HF_MIRROR=true
bash scripts/installation/3_download_ckpts.sh

🔥 Quick Start

The data generation pipeline below runs on a single NVIDIA GeForce RTX 4090.

Downloading Example Data

# Set this flag if you experience slow download speeds:
# export USE_HF_MIRROR=true
bash scripts/installation/3_download_data.sh

Metric-scale Geometry Reconstruction

bash scripts/preprocess_demo.sh --config-path configs/mug_to_basket.yaml

Depth-reliable Spatial Editing

bash scripts/generate_demo.sh --config-path configs/mug_to_basket.yaml

3D-Controlled Video Generation

bash scripts/generate_demo_video.sh --config-path configs/mug_to_basket.yaml
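The three stages above run in order and read the same YAML config. A minimal Python driver (hypothetical, not part of the repo) that chains the released shell scripts could look like this:

```python
import subprocess

# The three pipeline stages from the Quick Start, in order; each stage
# reads the same YAML config (here the example task shipped with the repo).
STAGES = [
    "scripts/preprocess_demo.sh",       # metric-scale geometry reconstruction
    "scripts/generate_demo.sh",         # depth-reliable spatial editing
    "scripts/generate_demo_video.sh",   # 3D-controlled video generation
]

def build_commands(config_path: str) -> list[list[str]]:
    """Build the bash command line for every stage."""
    return [["bash", stage, "--config-path", config_path] for stage in STAGES]

def run_pipeline(config_path: str) -> None:
    for cmd in build_commands(config_path):
        subprocess.run(cmd, check=True)  # abort the chain on the first failure

# Example: run_pipeline("configs/mug_to_basket.yaml")
```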

🔥 Training

The training scripts below run on GPUs with 80 GB of VRAM.

Metric-VGGT Training

  1. Prepare the dataset in the following layout:
Dataset
└── task-id
    └── episode-id
        └── frame-id
            ├── head_color
            ├── hand_left_color
            ├── hand_right_color
            ├── head_depth
            ├── hand_left_depth
            ├── hand_right_depth
            ├── head_extrinsic
            ├── hand_left_extrinsic
            ├── hand_right_extrinsic
            ├── head_intrinsic
            ├── hand_left_intrinsic
            └── hand_right_intrinsic
  2. Download the pretrained VGGT checkpoint:
wget https://huggingface.co/facebook/VGGT-1B/resolve/main/model.pt -O checkpoints/vggt_base_model.pt
  3. Run the training script:
cd vggt
bash train.sh
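Before launching training, it can help to confirm every frame directory actually contains all twelve per-camera entries listed above. A small validation sketch (ours, not part of the repo; it matches entries by name/stem since the README does not specify file extensions):

```python
from pathlib import Path

# Modality entries the README lists under every frame directory.
CAMERAS = ("head", "hand_left", "hand_right")
MODALITIES = ("color", "depth", "extrinsic", "intrinsic")
EXPECTED = {f"{cam}_{mod}" for cam in CAMERAS for mod in MODALITIES}

def missing_entries(frame_dir: Path) -> set[str]:
    """Return the expected modality entries absent from one frame directory.

    Entries are matched by stem, so either files (head_color.png) or
    subdirectories (head_color/) satisfy the check.
    """
    present = {p.stem if p.is_file() else p.name for p in frame_dir.iterdir()}
    return EXPECTED - present

def validate_dataset(root: Path) -> dict[str, set[str]]:
    """Map 'task/episode/frame' to its missing entries (empty dict == valid)."""
    problems = {}
    for frame_dir in sorted(root.glob("*/*/*")):
        if not frame_dir.is_dir():
            continue
        missing = missing_entries(frame_dir)
        if missing:
            problems[str(frame_dir.relative_to(root))] = missing
    return problems
```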

Video Generation Model Training

For training data, we use the Metric-VGGT model to annotate the open-source AgibotWorld-Beta dataset with depth and camera-pose labels.

We provide a reference data processing script: vggt/preprocess_agibot_dataset.py
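The `*_depth_canny` annotations below are edge maps extracted from the depth channels. The released pipeline presumably uses a full Canny detector (e.g. OpenCV's); this pure-Python sketch is only a simplified stand-in that illustrates what a depth edge map encodes, thresholding the depth gradient magnitude:

```python
def depth_edges(depth, threshold=0.1):
    """Binary edge map from a 2D depth grid (list of lists of floats).

    Marks a pixel as an edge where the forward-difference depth gradient
    magnitude exceeds `threshold`, i.e. at depth discontinuities between
    objects and background.
    """
    h, w = len(depth), len(depth[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            dx = depth[y][x + 1] - depth[y][x]   # horizontal depth jump
            dy = depth[y + 1][x] - depth[y][x]   # vertical depth jump
            if (dx * dx + dy * dy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges
```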

  1. Prepare the dataset and its annotations in the following layouts:
Dataset
├── observations
│   └── task-id
│       └── episode-id
├── parameters
└── proprio_stats

Annotated-Result
└── task-id
    └── episode-id
        ├── head_depth_ori
        ├── hand_left_depth_ori
        ├── hand_right_depth_ori
        ├── head_depth_canny
        ├── hand_left_depth_canny
        ├── hand_right_depth_canny
        ├── head_extrinsic.npy
        ├── hand_left_extrinsic.npy
        └── hand_right_extrinsic.npy
  2. Download the pretrained GE-Sim checkpoint:
wget https://modelscope.cn/models/agibot_world/Genie-Envisioner/resolve/master/ge_sim_cosmos_v0.1.safetensors -O checkpoints/ge_sim_cosmos_v0.1.safetensors
  3. Run the training script:
cd videogen
bash train.sh scripts/train_action_depth_canny_cosmos2.py --config_file configs/action_depth_canny_cosmos2.yaml
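For each annotated episode, a training sample pairs a raw depth frame, its Canny edge map, and the per-camera extrinsic trajectory. A sketch of how those paths line up (ours, not part of the repo; the per-frame file naming inside the depth directories is an assumption, since the README only gives directory names):

```python
from pathlib import Path

CAMERAS = ("head", "hand_left", "hand_right")

def episode_samples(annotated_root: Path, task_id: str, episode_id: str):
    """Yield (camera, depth_ori, depth_canny, extrinsic) path tuples for
    every frame of one annotated episode, in sorted frame order."""
    ep = annotated_root / task_id / episode_id
    for cam in CAMERAS:
        extrinsic = ep / f"{cam}_extrinsic.npy"  # one trajectory file per camera
        for depth in sorted((ep / f"{cam}_depth_ori").iterdir()):
            # Edge maps mirror the raw depth frames name-for-name (assumption).
            canny = ep / f"{cam}_depth_canny" / depth.name
            yield cam, depth, canny, extrinsic
```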

🧩 Acknowledgements

Thanks to these great repositories: DUSt3R, MASt3R, VGGT, DemoGen, Cosmos, GenieEnvisioner, Enerverse-AC, and many other inspiring works in the community.

✍️ Citation

@article{zhao2025real2edit2real,
      title={Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface}, 
      author={Yujie Zhao and Hongwei Fan and Di Chen and Shengcong Chen and Liliang Chen and Xiaoqi Li and Guanghui Ren and Hao Dong},
      year={2025},
      eprint={2512.19402},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2512.19402}, 
}
