Open
Labels
question (Further information is requested)
Description
Before Asking
- I have read the README carefully.
- I want to train my custom dataset, and I have read the tutorials for training on custom data carefully and organized my dataset correctly. (FYI: We recommend using the xx_finetune.py config files.)
- I have pulled the latest code from the main branch and run it again, and the problem still exists.
Search before asking
- I have searched the YOLOv6 issues and found no similar questions.
Question
I hit the error below after training for a few epochs. I am training with fuse_ab.
I suspect the cause is that the bbox output has no activation function. Why is there no activation there, and how is training expected to work without one?
Additional
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [147,0,0], thread: [60,0,0] Assertion `input_val >= zero && input_val <= one` failed.
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [147,0,0], thread: [61,0,0] Assertion `input_val >= zero && input_val <= one` failed.
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [147,0,0], thread: [62,0,0] Assertion `input_val >= zero && input_val <= one` failed.
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [147,0,0], thread: [63,0,0] Assertion `input_val >= zero && input_val <= one` failed.
2/399 0.02 nan 0.7267 nan 0.08989: 84%|████████▍ | 42774/50836 [3:19:03
ERROR in training steps.
ERROR in training loop or eval/save model.
Traceback (most recent call last):
  File "/dataset/YOLOv6/yolov6/core/engine.py", line 121, in train
    self.train_one_epoch(self.epoch)
  File "/dataset/YOLOv6/yolov6/core/engine.py", line 135, in train_one_epoch
    self.train_in_steps(epoch_num, self.step)
  File "/dataset/YOLOv6/yolov6/core/engine.py", line 162, in train_in_steps
    total_loss, loss_items = self.compute_loss((preds[0],preds[4],preds[5], preds[6]), targets, epoch_num,
  File "/dataset/YOLOv6/yolov6/models/losses/loss.py", line 178, in __call__
    loss_iou, loss_dfl = self.bbox_loss(pred_distri, pred_bboxes, anchor_points_s, target_bboxes,
  File "/home/kasm-user/anaconda3/envs/onepp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/dataset/YOLOv6/yolov6/models/losses/loss.py", line 238, in forward
    pred_bboxes_pos = torch.masked_select(pred_bboxes,
RuntimeError: CUDA error: device-side assert triggered
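A note on reading the log above: the assertion text `input_val >= zero && input_val <= one` is a range check, and a NaN input fails it just as an out-of-range value does, because every ordered comparison with NaN is false. Since the progress line already shows nan in two loss columns, the assert may be a symptom of NaNs reaching the loss rather than direct evidence of a missing activation. Below is a minimal sketch of that check in plain Python. The helper name `failing_indices` and the sample scores are hypothetical and are not part of YOLOv6 or PyTorch.

```python
import math


def failing_indices(values):
    """Return the indices of values that fail the check `0 <= v <= 1`.

    NaN fails the check as well, because `nan >= 0.0` and `nan <= 1.0`
    both evaluate to False.
    """
    return [i for i, v in enumerate(values) if not (v >= 0.0 and v <= 1.0)]


# Hypothetical scores: two valid values, one overshoot, one NaN.
scores = [0.0, 0.73, 1.2, math.nan]
print(failing_indices(scores))  # prints [2, 3]
```

In an actual training run, a comparable check would be to test the tensors passed to the loss with `torch.isfinite(...).all()` just before the loss call. Because CUDA errors are reported asynchronously, the `masked_select` line in the traceback is not necessarily where the assert fired; rerunning with the environment variable `CUDA_LAUNCH_BLOCKING=1` should make the traceback point closer to the failing kernel.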