This repository is the official implementation of the paper "Neural Packing: from Visual Sensing to Reinforcement Learning" (ACM Transactions on Graphics, SIGGRAPH Asia 2023). arXiv link
We propose TAP-Net++, a learning-based framework to solve the 3D Transport-and-Packing (TAP) problem. Unlike previous methods, our approach handles the full pipeline from visual sensing of casually stacked objects to robotic packing, optimizing both object selection and placement location (EMS) simultaneously.
For more details and materials, please refer to our project page.
- Python 3.8+
- PyTorch 1.10+
- CUDA 11.3+ (for GPU acceleration)
Install required packages:
Option 1: Using requirements.txt
```bash
pip install -r requirements.txt
```

Option 2: Manual installation
```bash
pip install torch
pip install gymnasium tianshou numpy matplotlib tqdm tensorboard scipy
```

Clone the repository:

```bash
git clone https://github.com/Juzhan/Neural-Packing.git
cd Neural-Packing
```

To train the TAP-Net++ model with default parameters:
```bash
./train.sh
```

To test a pre-trained model:

```bash
./test.sh
```

The main training script `main.py` supports numerous configuration options:
```bash
python main.py --task tapnet/TAP-v0 \
    --model tnpp \
    --box-num 20 \
    --container-size 100 100 100 \
    --box-range 10 80 \
    --fact-type tap_fake \
    --prec-type attn \
    --data-type rand \
    --rotate-axes x y z \
    --world-type real \
    --container-type single \
    --pack-type last \
    --stable-predict 1 \
    --reward-type C \
    --max-epoch 100 \
    --step-per-epoch 2000 \
    --device cuda
```

- `--model`: Model architecture; `tnpp` is TAP-Net++ (choices: `tnpp`, `tn`, `greedy`)
- `--fact-type`: Problem type (`tap_fake` for precedence-aware packing, `box` for standard ordering packing)
- `--prec-type`: Precedence encoding (`attn`, `cnn`, `rnn`, `none`)
- `--data-type`: Data generation type (`rand` for random, `fix` for fixed, `ppsg` for Perfect Packing Strategy Guaranteed)
- `--rotate-axes`: Allowed rotation axes (combinations of `x`, `y`, `z`; e.g., only `z` for 90-degree rotations around the Z-axis)
- `--world-type`: Simulation type (`real` with stability, `ideal` without)
- `--container-type`: `single` or `multi` containers
- `--pack-type`: Packing strategy (`all` for all containers, `last` for the last container only)
- `--stable-rule`: Stability rule (`hard_after_pack` for hard stability after packing)
- `--stable-predict`: Whether to predict stability (0 or 1)
- `--reward-type`: Reward formulation (`C` for compactness, `E` for per-step reward, etc.)
TAP-Net++ consists of three main components:
- Object Encoder: Encodes box dimensions and precedence relationships
- Space Encoder: Encodes Empty Maximum Spaces (EMS) for placement
- Cross-Transformer: Learns interactions between objects and spaces
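The object-space interaction can be sketched with PyTorch's built-in multi-head attention; this is a simplified stand-in for the paper's cross-transformer, and the tensor shapes are illustrative (20 boxes and 30 EMS candidates per batch element are assumptions, not the model's fixed sizes):

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Minimal sketch of object-to-space cross-attention."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, obj_feats, space_feats):
        # Object features query the candidate EMS features.
        out, _ = self.attn(obj_feats, space_feats, space_feats)
        return out

obj = torch.randn(2, 20, 64)   # (batch, boxes, feature dim)
ems = torch.randn(2, 30, 64)   # (batch, EMS candidates, feature dim)
print(CrossAttention(64)(obj, ems).shape)  # torch.Size([2, 20, 64])
```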
- State: Box states, EMS, precedence masks, heightmaps
- Action: Joint selection of (box, rotation, EMS, corner)
- Reward: Compactness ratio, stability penalty, container count
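To illustrate the compactness term of the reward, here is one common way to compute a compactness ratio (packed box volume over the occupied container region up to the current packing height); the exact formulation in the paper may differ, and the `(w, l, h, x, y, z)` placement layout is an assumption:

```python
def compactness(boxes, container_size):
    """Ratio of packed box volume to the occupied container region.

    boxes: list of placed boxes as (w, l, h, x, y, z)  -- assumed layout
    container_size: (W, L, H)
    """
    vol = sum(w * l * h for w, l, h, *_ in boxes)
    W, L, _ = container_size
    # Height of the highest packed box; 0 for an empty container.
    height = max((z + h for w, l, h, x, y, z in boxes), default=0)
    used = W * L * max(height, 1)  # avoid division by zero when empty
    return vol / used

# A single box filling the container base up to height 50 is perfectly compact:
print(compactness([(100, 100, 50, 0, 0, 0)], (100, 100, 100)))  # 1.0
```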
```
Neural-Packing/
├── main.py                # Main training script with argument parsing
├── test.sh                # Shell script for testing
├── train.sh               # Shell script for training
├── tap/                   # Core package
│   ├── __init__.py
│   ├── gym_tap.py         # Gymnasium environment registration
│   └── envs/              # Environment implementation
│       ├── __init__.py
│       ├── env.py         # Main TAP environment
│       ├── container.py   # Container management
│       ├── factory.py     # Box generation and management
│       ├── ems_tools.py   # EMS computation utilities
│       ├── convex_hull.py # Geometry utilities
│       └── space.py       # Space representation
├── models/                # Neural network models
│   ├── __init__.py
│   ├── network.py         # Main TAP-Net++ architecture
│   ├── attention.py       # Cross-transformer implementation
│   ├── encoder.py         # Object and space encoders
│   ├── greedy.py          # Greedy baseline
│   └── old.py             # Legacy models [TAP-Net]
├── render/                # Visualization tools
├── checkpoints/           # Saved model checkpoints
└── README.md              # This file
```
- Environment Setup: Creates vectorized environments for parallel training
- Data Collection: Uses Tianshou's `Collector` to gather experience
- Policy Optimization: Applies PPO/A2C updates with advantage estimation
- Checkpointing: Saves best policies and periodic checkpoints

Training progress can be monitored using TensorBoard:

```bash
tensorboard --logdir ./log
```

Default training parameters (can be adjusted in `main.py`):
- Learning rate: 3e-4
- Buffer size: 2048
- Batch size: 128
- PPO clip epsilon: 0.2
- Discount factor (gamma): 0.99
- GAE lambda: 0.95
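The advantage estimation behind the gamma/lambda defaults above is standard Generalized Advantage Estimation (handled internally by Tianshou); a minimal sketch of the computation:

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation.

    rewards: length-T list of per-step rewards
    values:  length-(T+1) list of value estimates (last entry bootstraps)
    """
    adv = np.zeros(len(rewards))
    last = 0.0
    for t in reversed(range(len(rewards))):
        # TD error for step t
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors
        last = delta + gamma * lam * last
        adv[t] = last
    return adv

# With gamma = lam = 1 this reduces to summed future TD errors:
print(gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))  # [2. 1.]
```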
```bash
python main.py --train 0 --resume-path ./checkpoints/policy.pth
```

We provide a data generation script to create benchmark datasets for evaluating TAP-Net++ performance across different data types. The script generates three types of box data:
- Random (rand): Randomly generated boxes within the specified size range
- Fixed (fix): Boxes sampled from a fixed set of candidate boxes
- Perfect Packing Strategy Guaranteed (ppsg): Boxes generated using perfect packing strategy
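To give an intuition for the `ppsg` type (a box set that is guaranteed to tile the container exactly), one common strategy is to recursively split the container into smaller boxes. This is an assumed simplification of what `gen_benchmark.py` does; the function and parameter names are hypothetical:

```python
import random

def ppsg_boxes(container=(100, 100, 100), n=20, min_edge=10, seed=0):
    """Split the container into n boxes whose volumes sum exactly
    to the container volume (a 'perfect packing' exists by construction)."""
    rng = random.Random(seed)
    boxes = [tuple(container)]
    while len(boxes) < n:
        # Always split the largest remaining box along its longest axis.
        boxes.sort(key=lambda b: b[0] * b[1] * b[2], reverse=True)
        w, l, h = boxes.pop(0)
        axis = max(range(3), key=lambda i: (w, l, h)[i])
        size = (w, l, h)[axis]
        if size < 2 * min_edge:          # too small to split further
            boxes.append((w, l, h))
            break
        cut = rng.randint(min_edge, size - min_edge)
        a, b = list((w, l, h)), list((w, l, h))
        a[axis], b[axis] = cut, size - cut
        boxes += [tuple(a), tuple(b)]
    return boxes

# The pieces tile the container, so their volumes sum to 100^3:
print(sum(w * l * h for w, l, h in ppsg_boxes()) == 100 ** 3)  # True
```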
Use the gen_benchmark.py script to create benchmark datasets:
```bash
python gen_benchmark.py --num-samples 100 --output-dir ./benchmark_data
```

This will generate 100 samples for each data type (`rand`, `fix`, `ppsg`) using default parameters from `main.py`:
- Container size: [100, 100, 100]
- Box range: [10, 80]
- Box number: 20
- Fact type: tap_fake
- Other parameters: Default values from main.py
Generated data is saved in the following directory structure:
```
benchmark_data/
└── tap_fake/
    ├── rand/
    │   └── 20/
    │       └── [100_100]_[10_80]_10/
    │           ├── 0_box.npy
    │           ├── 0_pre.npy
    │           ├── 1_box.npy
    │           ├── 1_pre.npy
    │           ├── ...
    │           └── metadata.npy
    ├── fix/
    │   └── ...
    └── ppsg/
        └── ...
```
Each sample consists of:
- `{index}_box.npy`: Box dimensions (a list of [width, length, height] for each box)
- `{index}_pre.npy`: Precedence graph (for the `tap_fake` fact type)
- `metadata.npy`: Configuration metadata for the dataset
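As a hypothetical illustration of the sample format, the snippet below writes a box-dimension array in the `{index}_box.npy` layout and reads it back with NumPy; the `(20, 3)` shape is an assumption based on the default `--box-num 20`:

```python
import numpy as np

# 20 boxes, each a [width, length, height] row in the default [10, 80] range.
boxes = np.random.randint(10, 81, size=(20, 3))
np.save("0_box.npy", boxes)

loaded = np.load("0_box.npy")
print(loaded.shape)  # (20, 3)
```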
You can customize the data generation by modifying the script arguments:
```bash
# Generate only specific data types
python gen_benchmark.py --data-types rand fix

# Generate a different number of samples
python gen_benchmark.py --num-samples 500

# Specify a custom output directory
python gen_benchmark.py --output-dir ./my_benchmark_data
```

If you find our work useful in your research, please cite:
```bibtex
@article{Xu2023NeuralPacking,
  title={Neural Packing: from Visual Sensing to Reinforcement Learning},
  author={Xu, Juzhan and Gong, Minglun and Zhang, Hao and Huang, Hui and Hu, Ruizhen},
  journal={ACM Transactions on Graphics (TOG)},
  volume={42},
  number={6},
  pages={Article 269},
  year={2023},
  publisher={ACM}
}
```
