Point Transformer training crashes if shape is not [n, 16] #683

@RawatAGit

Describe the issue

I have been trying to train Point Transformer on my custom dataset using the torch pipeline, but training crashes because some queries have too few neighbours.

Steps to reproduce the bug

Part of the config (with a large number of points):
  train_dir: train
  val_dir: [val, test]
  test_dir: []
  dataset_info: dataset_info.json
  sampler:
    name: SemSegRandomSampler
  test_result_folder: ./test
  min_points: 32768
  padding_noise_std: 0.01

Error message

@pop-os:~/Documents/HPC pipeline$ python training_pipeline/open3dml-clone/scripts/run_pipeline.py torch
-c training_pipeline/configs/pointtransformer_helios.yml
--dataset.dataset_path data_processing/helios_open3dml_data
--pipeline SemanticSegmentation
--dataset.use_cache True
--device cuda
--pipeline.log_level DEBUG
Using external Open3D-ML in /homeHPC pipeline/training_pipeline/open3dml-clone
regular arguments
backend: gloo
batch_size: null
cfg_dataset: null
cfg_file: training_pipeline/configs/pointtransformer_helios.yml
cfg_model: null
cfg_pipeline: null
ckpt_path: null
dataset: null
dataset_path: null
device: cuda
device_ids:
- '0'
framework: torch
host: localhost
main_log_dir: null
max_epochs: null
mode: null
model: null
node_rank: 0
nodes: 1
pipeline: SemanticSegmentation
port: '12355'
seed: 0
split: train

extra arguments
dataset.dataset_path: data_processing/helios_open3dml_data
dataset.use_cache: 'True'
pipeline.log_level: DEBUG

INFO - 2025-09-26 14:34:01,560 - heliosconstruction - HeliosConstruction dataset initialized with 152 train / 48 val / 0 test files.
INFO - 2025-09-26 14:34:01,703 - semantic_segmentation - DEVICE : cuda
INFO - 2025-09-26 14:34:01,703 - semantic_segmentation - Logging in file : ./logs/PointTransformer_HeliosConstruction_torch/log_train_2025-09-26_14-34-01.txt
INFO - 2025-09-26 14:34:01,704 - heliosconstruction - Found 152 point clouds for train
INFO - 2025-09-26 14:34:01,705 - heliosconstruction - Found 48 point clouds for validation
INFO - 2025-09-26 14:34:01,956 - semantic_segmentation - Initializing from scratch.
INFO - 2025-09-26 14:34:01,957 - semantic_segmentation - Writing summary in train_log/00020_PointTransformer_HeliosConstruction_torch.
INFO - 2025-09-26 14:34:01,957 - semantic_segmentation - Started training
INFO - 2025-09-26 14:34:01,957 - semantic_segmentation - === EPOCH 0/200 ===
training: 0%| | 0/76 [00:00<?, ?it/s]
/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/datasets/augment/augmentation.py:78: UserWarning: It is recommended to recenter the pointcloud before calling rotate.
warnings.warn(
training: 0%| | 0/76 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/scripts/run_pipeline.py", line 261, in <module>
sys.exit(main())
^^^^^^
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/scripts/run_pipeline.py", line 192, in main
pipeline.run_train()
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/pipelines/semantic_segmentation.py", line 416, in run_train
results = model(inputs['data'])
^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home//Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/models/point_transformer.py", line 175, in forward
p, f, r = self.encoders[i]([points[i], feats[i], row_splits[i]])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
input = module(input)
^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/models/point_transformer.py", line 643, in forward
feat = self.relu(self.bn2(self.transformer2([point, feat, row_splits])))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/miniconda3/envs/Helios/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/models/point_transformer.py", line 430, in forward
feat_k = queryandgroup(self.nsample,
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/models/point_transformer.py", line 680, in queryandgroup
idx = knn_batch(points,
^^^^^^^^^^^^^^^^^
File "/home/Documents/HPC pipeline/training_pipeline/open3dml-clone/ml3d/torch/models/point_transformer.py", line 734, in knn_batch
return ans.neighbors_index.reshape(-1, k).long().cuda()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[-1, 16]' is invalid for input of size 345
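
The final line of the traceback can be checked in isolation. The sketch below uses NumPy rather than the torch tensors from the pipeline, purely so it runs without a GPU; the constants are taken from the error message. It shows that 345 flat indices cannot be split into rows of 16, because 345 = 21 * 16 + 9, so at least one query must have returned fewer than 16 neighbours.

```python
import numpy as np

K = 16           # neighbours per query expected by the reshape in the traceback
FLAT_SIZE = 345  # size of the flat index tensor reported in the error

# How many complete rows of K fit, and how many indices are left over.
full_rows, leftover = divmod(FLAT_SIZE, K)
print(f"{FLAT_SIZE} = {full_rows} * {K} + {leftover}")

# Stand-in for ans.neighbors_index; the values are irrelevant, only the size matters.
flat = np.arange(FLAT_SIZE)
try:
    flat.reshape(-1, K)
    reshape_ok = True
except ValueError as err:
    # NumPy raises ValueError where torch raises RuntimeError.
    reshape_ok = False
    print("reshape failed:", err)
```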

Expected behavior

No response

Open3D, Python and System information

- Operating system: Ubuntu 22.04
- Python version: Python 3.11
- Open3D version: 0.18.0
- System type: 
- Is this remote workstation?: no
- How did you install Open3D?: pip
- Compiler version (if built from source):

Additional information

The crash comes from knn_search in the current Open3D release returning ragged neighbour lists: some queries yield fewer than 16 neighbours, so Open3D-ML's PointTransformer (which assumes a fixed k) can't reshape the index tensor to [-1, 16]. That behaviour changed in Open3D ≥0.18, and the upstream Open3D-ML master hasn't patched this branch yet.
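
One possible workaround, sketched here and not taken from Open3D-ML, is to pad each short neighbour list up to k before the reshape. The helper name `pad_neighbors` is hypothetical. It assumes the search result provides a row-splits array (prefix sums of the per-query neighbour counts) alongside the flat index array, and that every query has at least one neighbour. Padding by repeating the last neighbour is a design choice made for this illustration. NumPy is used to keep the example self-contained; in the pipeline the same indexing would be applied to torch tensors.

```python
import numpy as np


def pad_neighbors(neighbors_index, row_splits, k):
    """Turn ragged neighbour lists into a dense [num_queries, k] index array.

    neighbors_index: flat 1-D array of neighbour indices for all queries.
    row_splits: 1-D array of length num_queries + 1; query i owns
        neighbors_index[row_splits[i]:row_splits[i + 1]].
    k: number of neighbours the model expects per query.

    Queries with fewer than k neighbours are padded by repeating their
    last neighbour (an assumption of this sketch).
    """
    neighbors_index = np.asarray(neighbors_index, dtype=np.int64)
    row_splits = np.asarray(row_splits, dtype=np.int64)
    counts = np.diff(row_splits)  # neighbours actually found per query
    if np.any(counts < 1) or np.any(counts > k):
        raise ValueError("each query needs between 1 and k neighbours")
    # Column offsets 0..k-1, clamped to each query's last valid position.
    offsets = np.minimum(np.arange(k)[None, :], counts[:, None] - 1)
    # Add each query's start position to get indices into the flat array.
    return neighbors_index[row_splits[:-1, None] + offsets]


# Three queries with 3, 2 and 1 neighbours, padded to k = 3.
flat = [10, 11, 12, 20, 21, 30]
splits = [0, 3, 5, 6]
dense = pad_neighbors(flat, splits, k=3)
print(dense)
```

Repeating an existing neighbour keeps every index valid for the gather that follows, but it gives that neighbour extra weight in the attention step, so whether this is acceptable depends on the model.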

Labels: bug
