This repository is the official implementation of the paper "Neural Packing: from Visual Sensing to Reinforcement Learning" (ACM Transactions on Graphics, SIGGRAPH Asia 2023). arXiv link
We propose TAP-Net++, a learning-based framework to solve the 3D Transport-and-Packing (TAP) problem. Unlike previous methods, our approach handles the full pipeline from visual sensing of casually stacked objects to robotic packing, optimizing both object selection and placement location (EMS) simultaneously.
For more details and materials, please refer to our project page.
- Python 3.8+
- PyTorch 1.10+
- CUDA 11.3+ (for GPU acceleration)
Install required packages:
Option 1: Using requirements.txt
```bash
pip install -r requirements.txt
```

Option 2: Manual installation
```bash
pip install torch
pip install gymnasium tianshou numpy matplotlib tqdm tensorboard scipy
```

Clone the repository:

```bash
git clone https://github.com/Juzhan/Neural-Packing.git
cd Neural-Packing
```

To train the TAP-Net++ model with default parameters:
```bash
./train.sh
```

To test a pre-trained model:

```bash
./test.sh
```

The main training script `main.py` supports numerous configuration options:
```bash
python main.py --task tapnet/TAP-v0 \
    --model tnpp \
    --box-num 20 \
    --container-size 100 100 100 \
    --box-range 10 80 \
    --fact-type tap_fake \
    --prec-type attn \
    --data-type rand \
    --rotate-axes x y z \
    --world-type real \
    --container-type single \
    --pack-type last \
    --stable-predict 1 \
    --reward-type C \
    --max-epoch 100 \
    --step-per-epoch 2000 \
    --device cuda
```

- `--model`: Model architecture; `tnpp` is TAP-Net++ (choices: `tnpp`, `tn`, `greedy`)
- `--fact-type`: Problem type (`tap_fake` for precedence-aware packing, `box` for standard ordering packing)
- `--prec-type`: Precedence encoding (`attn`, `cnn`, `rnn`, `none`)
- `--data-type`: Data generation type (`rand` for random, `fix` for fixed, `ppsg` for Perfect Packing Strategy Guaranteed)
- `--rotate-axes`: Allowed rotation axes (combinations of `x`, `y`, `z`; e.g., only `z` for 90-degree rotations around the Z-axis)
- `--world-type`: Simulation type (`real` with stability, `ideal` without)
- `--container-type`: `single` or `multi` containers
- `--pack-type`: Packing strategy (`all` for all containers, `last` for the last container only)
- `--stable-rule`: Stability rule (`hard_after_pack` for hard stability after packing)
- `--stable-predict`: Whether to predict stability (0 or 1)
- `--reward-type`: Reward formulation (`C` for compactness, `E` for per-step reward, etc.)
TAP-Net++ consists of three main components:
- Object Encoder: Encodes box dimensions and precedence relationships
- Space Encoder: Encodes Empty Maximum Spaces (EMS) for placement
- Cross-Transformer: Learns interactions between objects and spaces
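The object-space interaction can be sketched with PyTorch's built-in multi-head attention; this is a simplified stand-in for the paper's cross-transformer, and the tensor shapes are illustrative (20 boxes and 30 EMS candidates per batch element are assumptions, not the model's fixed sizes):

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Minimal sketch of object-to-space cross-attention."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, obj_feats, space_feats):
        # Object features query the candidate EMS features.
        out, _ = self.attn(obj_feats, space_feats, space_feats)
        return out

obj = torch.randn(2, 20, 64)   # (batch, boxes, feature dim)
ems = torch.randn(2, 30, 64)   # (batch, EMS candidates, feature dim)
print(CrossAttention(64)(obj, ems).shape)  # torch.Size([2, 20, 64])
```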
- State: Box states, EMS, precedence masks, heightmaps
- Action: Joint selection of (box, rotation, EMS, corner)
- Reward: Compactness ratio, stability penalty, container count
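To illustrate the compactness term of the reward, here is one common way to compute a compactness ratio (packed box volume over the occupied container region up to the current packing height); the exact formulation in the paper may differ, and the `(w, l, h, x, y, z)` placement layout is an assumption:

```python
def compactness(boxes, container_size):
    """Ratio of packed box volume to the occupied container region.

    boxes: list of placed boxes as (w, l, h, x, y, z)  -- assumed layout
    container_size: (W, L, H)
    """
    vol = sum(w * l * h for w, l, h, *_ in boxes)
    W, L, _ = container_size
    # Height of the highest packed box; 0 for an empty container.
    height = max((z + h for w, l, h, x, y, z in boxes), default=0)
    used = W * L * max(height, 1)  # avoid division by zero when empty
    return vol / used

# A single box filling the container base up to height 50 is perfectly compact:
print(compactness([(100, 100, 50, 0, 0, 0)], (100, 100, 100)))  # 1.0
```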
```
Neural-Packing/
├── main.py                # Main training script with argument parsing
├── test.sh                # Shell script for testing
├── train.sh               # Shell script for training
├── tap/                   # Core package
│   ├── __init__.py
│   ├── gym_tap.py         # Gymnasium environment registration
│   └── envs/              # Environment implementation
│       ├── __init__.py
│       ├── env.py         # Main TAP environment
│       ├── container.py   # Container management
│       ├── factory.py     # Box generation and management
│       ├── ems_tools.py   # EMS computation utilities
│       ├── convex_hull.py # Geometry utilities
│       └── space.py       # Space representation
├── models/                # Neural network models
│   ├── __init__.py
│   ├── network.py         # Main TAP-Net++ architecture
│   ├── attention.py       # Cross-transformer implementation
│   ├── encoder.py         # Object and space encoders
│   ├── greedy.py          # Greedy baseline
│   └── old.py             # Legacy models [TAP-Net]
├── render/                # Visualization tools
├── checkpoints/           # Saved model checkpoints
└── README.md              # This file
```
- Environment Setup: Creates vectorized environments for parallel training
- Data Collection: Uses Tianshou's `Collector` to gather experience
- Policy Optimization: Applies PPO/A2C updates with advantage estimation
- Checkpointing: Saves best policies and periodic checkpoints

Training progress can be monitored using TensorBoard:

```bash
tensorboard --logdir ./log
```

Default training parameters (can be adjusted in `main.py`):
- Learning rate: 3e-4
- Buffer size: 2048
- Batch size: 128
- PPO clip epsilon: 0.2
- Discount factor (gamma): 0.99
- GAE lambda: 0.95
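The advantage estimation behind the gamma/lambda defaults above is standard Generalized Advantage Estimation (handled internally by Tianshou); a minimal sketch of the computation:

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation.

    rewards: length-T list of per-step rewards
    values:  length-(T+1) list of value estimates (last entry bootstraps)
    """
    adv = np.zeros(len(rewards))
    last = 0.0
    for t in reversed(range(len(rewards))):
        # TD error for step t
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors
        last = delta + gamma * lam * last
        adv[t] = last
    return adv

# With gamma = lam = 1 this reduces to summed future TD errors:
print(gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))  # [2. 1.]
```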
```bash
python main.py --train 0 --resume-path ./checkpoints/policy.pth
```

We provide a data generation script to create benchmark datasets for evaluating TAP-Net++ performance across different data types. The script generates three types of box data:
- Random (rand): Randomly generated boxes within the specified size range
- Fixed (fix): Boxes sampled from a fixed set of candidate boxes
- Perfect Packing Strategy Guaranteed (ppsg): Boxes generated using perfect packing strategy
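To give an intuition for the `ppsg` type (a box set that is guaranteed to tile the container exactly), one common strategy is to recursively split the container into smaller boxes. This is an assumed simplification of what `gen_benchmark.py` does; the function and parameter names are hypothetical:

```python
import random

def ppsg_boxes(container=(100, 100, 100), n=20, min_edge=10, seed=0):
    """Split the container into n boxes whose volumes sum exactly
    to the container volume (a 'perfect packing' exists by construction)."""
    rng = random.Random(seed)
    boxes = [tuple(container)]
    while len(boxes) < n:
        # Always split the largest remaining box along its longest axis.
        boxes.sort(key=lambda b: b[0] * b[1] * b[2], reverse=True)
        w, l, h = boxes.pop(0)
        axis = max(range(3), key=lambda i: (w, l, h)[i])
        size = (w, l, h)[axis]
        if size < 2 * min_edge:          # too small to split further
            boxes.append((w, l, h))
            break
        cut = rng.randint(min_edge, size - min_edge)
        a, b = list((w, l, h)), list((w, l, h))
        a[axis], b[axis] = cut, size - cut
        boxes += [tuple(a), tuple(b)]
    return boxes

# The pieces tile the container, so their volumes sum to 100^3:
print(sum(w * l * h for w, l, h in ppsg_boxes()) == 100 ** 3)  # True
```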
Use the gen_benchmark.py script to create benchmark datasets:
```bash
python gen_benchmark.py --num-samples 100 --output-dir ./benchmark_data
```

This will generate 100 samples for each data type (`rand`, `fix`, `ppsg`) using default parameters from `main.py`:
- Container size: [100, 100, 100]
- Box range: [10, 80]
- Box number: 20
- Fact type: tap_fake
- Other parameters: Default values from main.py
Generated data is saved in the following directory structure:
```
benchmark_data/
└── tap_fake/
    ├── rand/
    │   └── 20/
    │       └── [100_100]_[10_80]_10/
    │           ├── 0_box.npy
    │           ├── 0_pre.npy
    │           ├── 1_box.npy
    │           ├── 1_pre.npy
    │           ├── ...
    │           └── metadata.npy
    ├── fix/
    │   └── ...
    └── ppsg/
        └── ...
```
Each sample consists of:
- `{index}_box.npy`: Box dimensions (a list of [width, length, height] for each box)
- `{index}_pre.npy`: Precedence graph (for the `tap_fake` fact type)
- `metadata.npy`: Configuration metadata for the dataset
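As a hypothetical illustration of the sample format, the snippet below writes a box-dimension array in the `{index}_box.npy` layout and reads it back with NumPy; the `(20, 3)` shape is an assumption based on the default `--box-num 20`:

```python
import numpy as np

# 20 boxes, each a [width, length, height] row in the default [10, 80] range.
boxes = np.random.randint(10, 81, size=(20, 3))
np.save("0_box.npy", boxes)

loaded = np.load("0_box.npy")
print(loaded.shape)  # (20, 3)
```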
You can customize the data generation by modifying the script arguments:
```bash
# Generate only specific data types
python gen_benchmark.py --data-types rand fix

# Generate a different number of samples
python gen_benchmark.py --num-samples 500

# Specify a custom output directory
python gen_benchmark.py --output-dir ./my_benchmark_data
```

If you find our work useful in your research, please cite:
```bibtex
@article{Xu2023NeuralPacking,
  title={Neural Packing: from Visual Sensing to Reinforcement Learning},
  author={Xu, Juzhan and Gong, Minglun and Zhang, Hao and Huang, Hui and Hu, Ruizhen},
  journal={ACM Transactions on Graphics (TOG)},
  volume={42},
  number={6},
  pages={Article 269},
  year={2023},
  publisher={ACM}
}
```
