[Bug]: Shape mismatch: EfficientAD #2769
Describe the bug
I am facing a very strange issue: a shape mismatch error ("RuntimeError: shape mismatch: value tensor of shape [16778] cannot be broadcast ...") when using custom sizes for my own dataset. Note: this happens only when the training dataset is very large (>1000 images).

Dataset
Custom

Model
EfficientAD

Steps to reproduce the behaviour:
OS information:
Expected behavior
The model must train successfully, or the issue must be reproducible for a smaller dataset.

Screenshots
No response

Pip/GitHub
pip

What version/branch did you use?
24.2

Configuration YAML
# anomalib==1.2.0
seed_everything: true
trainer:
  accelerator: auto
  strategy: auto
  devices: 1
  num_nodes: 1
  precision: null
  logger:
    class_path: anomalib.loggers.AnomalibTensorBoardLogger
    init_args:
      save_dir: tb_logs
      name: EffAd-S_expI
      version: 11
      log_graph: false
      default_hp_metric: true
      prefix: ''
      sub_dir: null
      comment: ''
      purge_step: null
      max_queue: 10
      flush_secs: 120
      filename_suffix: ''
  callbacks:
  - class_path: lightning.pytorch.callbacks.ModelCheckpoint
    init_args:
      dirpath: results/EfficientAd/sample/v0/checkpoints
      filename: epoch_{epoch:03d}
      monitor: image_F1Score
      verbose: true
      save_last: null
      save_top_k: 3
      save_weights_only: false
      mode: max
      auto_insert_metric_name: false
      every_n_train_steps: null
      train_time_interval: null
      every_n_epochs: null
      save_on_train_epoch_end: null
      enable_version_counter: true
  - class_path: lightning.pytorch.callbacks.EarlyStopping
    init_args:
      monitor: image_AUPR
      min_delta: 0.005
      patience: 5
      verbose: true
      mode: max
      strict: true
      check_finite: true
      stopping_threshold: null
      divergence_threshold: null
      check_on_train_epoch_end: null
      log_rank_zero_only: false
  fast_dev_run: false
  max_epochs: 25
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: null
  limit_val_batches: null
  limit_test_batches: null
  limit_predict_batches: null
  overfit_batches: 0.0
  val_check_interval: null
  check_val_every_n_epoch: 1
  num_sanity_val_steps: null
  log_every_n_steps: null
  enable_checkpointing: null
  enable_progress_bar: null
  enable_model_summary: null
  accumulate_grad_batches: 1
  gradient_clip_val: null
  gradient_clip_algorithm: null
  deterministic: null
  benchmark: null
  inference_mode: true
  use_distributed_sampler: true
  profiler: null
  detect_anomaly: false
  barebones: false
  plugins: null
  sync_batchnorm: false
  reload_dataloaders_every_n_epochs: 0
optimizer: null
lr_scheduler: null
model:
  class_path: anomalib.models.EfficientAd
  init_args:
    imagenet_dir: datasets/imagenette
    teacher_out_channels: 384
    model_size: S
    lr: 0.0001
    weight_decay: 1.0e-05
    padding: false
    pad_maps: true
data:
  class_path: anomalib.data.Folder
  init_args:
    name: sample
    normal_dir: train/good
    root: datasets/sample
    abnormal_dir: val/anomaly
    normal_test_dir: val/good
    mask_dir: null
    normal_split_ratio: 0.0
    extensions:
    - .jpg
    train_batch_size: 1
    eval_batch_size: 1
    num_workers: 0
    image_size: null
    transform: null
    train_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
        - class_path: torchvision.transforms.v2.Resize
          init_args:
            size:
            - 1024
            - 1024
            interpolation: BILINEAR
            max_size: null
            antialias: true
    eval_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
        - class_path: torchvision.transforms.v2.Resize
          init_args:
            size:
            - 1024
            - 1024
            interpolation: BILINEAR
            max_size: null
            antialias: true
    test_split_mode: FROM_DIR
    test_split_ratio: 0.2
    val_split_mode: same_as_test
    val_split_ratio: 0.5
    seed: null
normalization:
  normalization_method: MIN_MAX
task: classification
metrics:
  image:
  - AUROC
  - F1Score
  - F1Max
  - AUPR
  pixel: null
  threshold:
    class_path: anomalib.metrics.F1AdaptiveThreshold
    init_args:
      default_value: 0.5
      thresholds: null
      ignore_index: null
      validate_args: true
      compute_on_cpu: false
      dist_sync_on_step: false
      sync_on_compute: true
      compute_with_cache: true
logging:
  log_graph: false
default_root_dir: results
ckpt_path: null

Logs
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
WARNING:anomalib.utils.config:Anomalib currently does not support multi-gpu training. Setting devices to 1.
INFO:anomalib.models.components.base.anomaly_module:Initializing EfficientAd model.
INFO:anomalib.callbacks:Loading the callbacks
INFO:anomalib.engine.engine:Overriding num_sanity_val_steps from None with 0 for EfficientAd
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
WARNING:anomalib.metrics.f1_score:F1Score class exists for backwards compatibility. It will be removed in v1.1. Please use BinaryF1Score from torchmetrics instead
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | EfficientAdModel | 8.1 M | train
1 | _transform | Compose | 0 | train
2 | normalization_metrics | MetricCollection | 0 | train
3 | image_threshold | F1AdaptiveThreshold | 0 | train
4 | pixel_threshold | F1AdaptiveThreshold | 0 | train
5 | image_metrics | AnomalibMetricCollection | 0 | train
6 | pixel_metrics | AnomalibMetricCollection | 0 | train
---------------------------------------------------------------------------
8.1 M Trainable params
0 Non-trainable params
8.1 M Total params
32.235 Total estimated model params size (MB)
47 Modules in train mode
7 Modules in eval mode
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=10` in the `DataLoader` to improve performance.
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=10` in the `DataLoader` to improve performance.
Training: | | 0/? [00:00<?, ?it/s]INFO:anomalib.models.image.efficient_ad.lightning_model:Load pretrained teacher model from pre_trained/efficientad_pretrained_weights/pretrained_teacher_small.pth
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torchvision/transforms/v2/_deprecated.py:42: UserWarning: The transform `ToTensor()` is deprecated and will be removed in a future release. Instead, please use `v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True)])`.Output is equivalent up to float precision.
warnings.warn(
Calculate teacher channel mean & std: 100%|██████████| 3830/3830 [11:10<00:00, 5.72it/s]
Epoch 0: 87%|████████▋ | 3331/3830 [2:55:32<26:17, 0.32it/s, v_num=kl_0, train_st_step=0.785, train_ae_step=0.169, train_stae_step=0.0339, train_loss_step=0.987]╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /Users/deya/PycharmProjects/Experiments_AD/anomaly_detection_anomalib/run_tr │
│ ain.py:10 in <module> │
│ │
│ 7 if __name__ == "__main__": │
│ 8 │ """Fetch the engine, model and datamodule from the config, and trai │
│ 9 │ engine, model, datamodule = Engine.from_config(CONFIG_PATH) │
│ ❱ 10 │ engine.train(datamodule=datamodule, model=model) │
│ 11 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/anomalib/engi │
│ ne/engine.py:871 in train │
│ │
│ 868 │ │ │ # if the model is zero-shot or few-shot, we only need to │
│ 869 │ │ │ self.trainer.validate(model, val_dataloaders, None, verbo │
│ 870 │ │ else: │
│ ❱ 871 │ │ │ self.trainer.fit(model, train_dataloaders, val_dataloader │
│ 872 │ │ self.trainer.test(model, test_dataloaders, ckpt_path=ckpt_pat │
│ 873 │ │
│ 874 │ def export( │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:538 in fit │
│ │
│ 535 │ │ self.state.fn = TrainerFn.FITTING │
│ 536 │ │ self.state.status = TrainerStatus.RUNNING │
│ 537 │ │ self.training = True │
│ ❱ 538 │ │ call._call_and_handle_interrupt( │
│ 539 │ │ │ self, self._fit_impl, model, train_dataloaders, val_datal │
│ 540 │ │ ) │
│ 541 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:47 in _call_and_handle_interrupt │
│ │
│ 44 │ try: │
│ 45 │ │ if trainer.strategy.launcher is not None: │
│ 46 │ │ │ return trainer.strategy.launcher.launch(trainer_fn, *args, │
│ ❱ 47 │ │ return trainer_fn(*args, **kwargs) │
│ 48 │ │
│ 49 │ except _TunerExitException: │
│ 50 │ │ _call_teardown_hook(trainer) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:574 in _fit_impl │
│ │
│ 571 │ │ │ model_provided=True, │
│ 572 │ │ │ model_connected=self.lightning_module is not None, │
│ 573 │ │ ) │
│ ❱ 574 │ │ self._run(model, ckpt_path=ckpt_path) │
│ 575 │ │ │
│ 576 │ │ assert self.state.stopped │
│ 577 │ │ self.training = False │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:981 in _run │
│ │
│ 978 │ │ # ---------------------------- │
│ 979 │ │ # RUN THE TRAINER │
│ 980 │ │ # ---------------------------- │
│ ❱ 981 │ │ results = self._run_stage() │
│ 982 │ │ │
│ 983 │ │ # ---------------------------- │
│ 984 │ │ # POST-Training CLEAN UP │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:1025 in _run_stage │
│ │
│ 1022 │ │ │ with isolate_rng(): │
│ 1023 │ │ │ │ self._run_sanity_check() │
│ 1024 │ │ │ with torch.autograd.set_detect_anomaly(self._detect_anoma │
│ ❱ 1025 │ │ │ │ self.fit_loop.run() │
│ 1026 │ │ │ return None │
│ 1027 │ │ raise RuntimeError(f"Unexpected state {self.state}") │
│ 1028 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/fit_loop.py:205 in run │
│ │
│ 202 │ │ while not self.done: │
│ 203 │ │ │ try: │
│ 204 │ │ │ │ self.on_advance_start() │
│ ❱ 205 │ │ │ │ self.advance() │
│ 206 │ │ │ │ self.on_advance_end() │
│ 207 │ │ │ │ self._restarting = False │
│ 208 │ │ │ except StopIteration: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/fit_loop.py:363 in advance │
│ │
│ 360 │ │ │ ) │
│ 361 │ │ with self.trainer.profiler.profile("run_training_epoch"): │
│ 362 │ │ │ assert self._data_fetcher is not None │
│ ❱ 363 │ │ │ self.epoch_loop.run(self._data_fetcher) │
│ 364 │ │
│ 365 │ def on_advance_end(self) -> None: │
│ 366 │ │ trainer = self.trainer │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/training_epoch_loop.py:140 in run │
│ │
│ 137 │ │ self.on_run_start(data_fetcher) │
│ 138 │ │ while not self.done: │
│ 139 │ │ │ try: │
│ ❱ 140 │ │ │ │ self.advance(data_fetcher) │
│ 141 │ │ │ │ self.on_advance_end(data_fetcher) │
│ 142 │ │ │ │ self._restarting = False │
│ 143 │ │ │ except StopIteration: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/training_epoch_loop.py:250 in advance │
│ │
│ 247 │ │ │ with trainer.profiler.profile("run_training_batch"): │
│ 248 │ │ │ │ if trainer.lightning_module.automatic_optimization: │
│ 249 │ │ │ │ │ # in automatic optimization, there can only be one │
│ ❱ 250 │ │ │ │ │ batch_output = self.automatic_optimization.run(tra │
│ 251 │ │ │ │ else: │
│ 252 │ │ │ │ │ batch_output = self.manual_optimization.run(kwargs │
│ 253 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:190 in run │
│ │
│ 187 │ │ # ------------------------------ │
│ 188 │ │ # gradient update with accumulated gradients │
│ 189 │ │ else: │
│ ❱ 190 │ │ │ self._optimizer_step(batch_idx, closure) │
│ 191 │ │ │
│ 192 │ │ result = closure.consume_result() │
│ 193 │ │ if result.loss is None: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:268 in _optimizer_step │
│ │
│ 265 │ │ │ self.optim_progress.optimizer.step.increment_ready() │
│ 266 │ │ │
│ 267 │ │ # model hook │
│ ❱ 268 │ │ call._call_lightning_module_hook( │
│ 269 │ │ │ trainer, │
│ 270 │ │ │ "optimizer_step", │
│ 271 │ │ │ trainer.current_epoch, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:167 in _call_lightning_module_hook │
│ │
│ 164 │ pl_module._current_fx_name = hook_name │
│ 165 │ │
│ 166 │ with trainer.profiler.profile(f"[LightningModule]{pl_module.__clas │
│ ❱ 167 │ │ output = fn(*args, **kwargs) │
│ 168 │ │
│ 169 │ # restore current_fx when nested context │
│ 170 │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/module.py:1306 in optimizer_step │
│ │
│ 1303 │ │ │ │ # Add your custom logic to run directly after `optimi │
│ 1304 │ │ │
│ 1305 │ │ """ │
│ ❱ 1306 │ │ optimizer.step(closure=optimizer_closure) │
│ 1307 │ │
│ 1308 │ def optimizer_zero_grad(self, epoch: int, batch_idx: int, optimiz │
│ 1309 │ │ """Override this method to change the default behaviour of `` │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/optimizer.py:153 in step │
│ │
│ 150 │ │ │ raise MisconfigurationException("When `optimizer.step(clos │
│ 151 │ │ │
│ 152 │ │ assert self._strategy is not None │
│ ❱ 153 │ │ step_output = self._strategy.optimizer_step(self._optimizer, c │
│ 154 │ │ │
│ 155 │ │ self._on_after_step() │
│ 156 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/strategies/strategy.py:238 in optimizer_step │
│ │
│ 235 │ │ model = model or self.lightning_module │
│ 236 │ │ # TODO(fabric): remove assertion once strategy's optimizer_ste │
│ 237 │ │ assert isinstance(model, pl.LightningModule) │
│ ❱ 238 │ │ return self.precision_plugin.optimizer_step(optimizer, model=m │
│ 239 │ │
│ 240 │ def _setup_model_and_optimizers(self, model: Module, optimizers: L │
│ 241 │ │ """Setup a model and multiple optimizers together. │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:122 in optimizer_step │
│ │
│ 119 │ ) -> Any: │
│ 120 │ │ """Hook to run the optimizer step.""" │
│ 121 │ │ closure = partial(self._wrap_closure, model, optimizer, closur │
│ ❱ 122 │ │ return optimizer.step(closure=closure, **kwargs) │
│ 123 │ │
│ 124 │ def _clip_gradients( │
│ 125 │ │ self, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/l │
│ r_scheduler.py:75 in wrapper │
│ │
│ 72 │ │ │ │ instance = instance_ref() │
│ 73 │ │ │ │ instance._step_count += 1 │
│ 74 │ │ │ │ wrapped = func.__get__(instance, cls) │
│ ❱ 75 │ │ │ │ return wrapped(*args, **kwargs) │
│ 76 │ │ │ │
│ 77 │ │ │ # Note that the returned function here is no longer a bou │
│ 78 │ │ │ # so attributes like `__func__` and `__self__` no longer │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/o │
│ ptimizer.py:391 in wrapper │
│ │
│ 388 │ │ │ │ │ │ │ │ f"{func} must return None or a tuple o │
│ 389 │ │ │ │ │ │ │ ) │
│ 390 │ │ │ │ │
│ ❱ 391 │ │ │ │ out = func(*args, **kwargs) │
│ 392 │ │ │ │ self._optimizer_step_code() │
│ 393 │ │ │ │ │
│ 394 │ │ │ │ # call optimizer step post hooks │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/o │
│ ptimizer.py:76 in _use_grad │
│ │
│ 73 │ │ │ # see https://github.com/pytorch/pytorch/issues/104053 │
│ 74 │ │ │ torch.set_grad_enabled(self.defaults['differentiable']) │
│ 75 │ │ │ torch._dynamo.graph_break() │
│ ❱ 76 │ │ │ ret = func(self, *args, **kwargs) │
│ 77 │ │ finally: │
│ 78 │ │ │ torch._dynamo.graph_break() │
│ 79 │ │ │ torch.set_grad_enabled(prev_grad) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/a │
│ dam.py:148 in step │
│ │
│ 145 │ │ loss = None │
│ 146 │ │ if closure is not None: │
│ 147 │ │ │ with torch.enable_grad(): │
│ ❱ 148 │ │ │ │ loss = closure() │
│ 149 │ │ │
│ 150 │ │ for group in self.param_groups: │
│ 151 │ │ │ params_with_grad = [] │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:108 in _wrap_closure │
│ │
│ 105 │ │ consistent with the ``Precision`` subclasses that cannot pass │
│ 106 │ │ │
│ 107 │ │ """ │
│ ❱ 108 │ │ closure_result = closure() │
│ 109 │ │ self._after_closure(model, optimizer) │
│ 110 │ │ return closure_result │
│ 111 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:144 in __call__ │
│ │
│ 141 │ │
│ 142 │ @override │
│ 143 │ def __call__(self, *args: Any, **kwargs: Any) -> Optional[Tensor]: │
│ ❱ 144 │ │ self._result = self.closure(*args, **kwargs) │
│ 145 │ │ return self._result.loss │
│ 146 │
│ 147 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/utils/_ │
│ contextlib.py:115 in decorate_context │
│ │
│ 112 │ @functools.wraps(func) │
│ 113 │ def decorate_context(*args, **kwargs): │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │ │
│ 117 │ return decorate_context │
│ 118 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:138 in closure │
│ │
│ 135 │ │ │ self._zero_grad_fn() │
│ 136 │ │ │
│ 137 │ │ if self._backward_fn is not None and step_output.closure_loss │
│ ❱ 138 │ │ │ self._backward_fn(step_output.closure_loss) │
│ 139 │ │ │
│ 140 │ │ return step_output │
│ 141 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:239 in backward_fn │
│ │
│ 236 │ │ │ return None │
│ 237 │ │ │
│ 238 │ │ def backward_fn(loss: Tensor) -> None: │
│ ❱ 239 │ │ │ call._call_strategy_hook(self.trainer, "backward", loss, o │
│ 240 │ │ │
│ 241 │ │ return backward_fn │
│ 242 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:319 in _call_strategy_hook │
│ │
│ 316 │ │ return None │
│ 317 │ │
│ 318 │ with trainer.profiler.profile(f"[Strategy]{trainer.strategy.__clas │
│ ❱ 319 │ │ output = fn(*args, **kwargs) │
│ 320 │ │
│ 321 │ # restore current_fx when nested context │
│ 322 │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/strategies/strategy.py:212 in backward │
│ │
│ 209 │ │ assert self.lightning_module is not None │
│ 210 │ │ closure_loss = self.precision_plugin.pre_backward(closure_loss │
│ 211 │ │ │
│ ❱ 212 │ │ self.precision_plugin.backward(closure_loss, self.lightning_mo │
│ 213 │ │ │
│ 214 │ │ closure_loss = self.precision_plugin.post_backward(closure_los │
│ 215 │ │ self.post_backward(closure_loss) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:72 in backward │
│ │
│ 69 │ │ │ \**kwargs: Keyword arguments for the same purpose as ``*ar │
│ 70 │ │ │
│ 71 │ │ """ │
│ ❱ 72 │ │ model.backward(tensor, *args, **kwargs) │
│ 73 │ │
│ 74 │ @override │
│ 75 │ def post_backward(self, tensor: Tensor, module: "pl.LightningModul │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/module.py:1101 in backward │
│ │
│ 1098 │ │ if self._fabric: │
│ 1099 │ │ │ self._fabric.backward(loss, *args, **kwargs) │
│ 1100 │ │ else: │
│ ❱ 1101 │ │ │ loss.backward(*args, **kwargs) │
│ 1102 │ │
│ 1103 │ def toggle_optimizer(self, optimizer: Union[Optimizer, LightningO │
│ 1104 │ │ """Makes sure only the gradients of the current optimizer's p │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/_tensor │
│ .py:525 in backward │
│ │
│ 522 │ │ │ │ create_graph=create_graph, │
│ 523 │ │ │ │ inputs=inputs, │
│ 524 │ │ │ ) │
│ ❱ 525 │ │ torch.autograd.backward( │
│ 526 │ │ │ self, gradient, retain_graph, create_graph, inputs=inputs │
│ 527 │ │ ) │
│ 528 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/autogra │
│ d/__init__.py:267 in backward │
│ │
│ 264 │ # The reason we repeat the same comment below is that │
│ 265 │ # some Python versions print out the first line of a multi-line fu │
│ 266 │ # calls in the traceback and some print out the last line │
│ ❱ 267 │ _engine_run_backward( │
│ 268 │ │ tensors, │
│ 269 │ │ grad_tensors_, │
│ 270 │ │ retain_graph, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/autogra │
│ d/graph.py:744 in _engine_run_backward │
│ │
│ 741 │ if attach_logging_hooks: │
│ 742 │ │ unregister_hooks = _register_logging_hooks_on_whole_graph(t_ou │
│ 743 │ try: │
│ ❱ 744 │ │ return Variable._execution_engine.run_backward( # Calls into │
│ 745 │ │ │ t_outputs, *args, **kwargs │
│ 746 │ │ ) # Calls into the C++ engine to run the backward pass │
│ 747 │ finally: │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: shape mismatch: value tensor of shape [16778] cannot be broadcast
to indexing result of shape [17823]
Process finished with exit code 1
Any help would be appreciated, as I have been stuck on this issue for a long time.
Replies: 1 comment
Actually, the image dimensions must be evenly divisible by 64.
Thus, instead of [4096, 2160], it would be better to use [4096, 2176] (2160 / 64 = 33.75, whereas 2176 = 34 * 64).
In my own code I also adapt the auto-encoder ("class Encoder(nn.Module):" in "models/image/efficient_ad/torch_model.py").
If you use larger images and do not adapt the auto-encoder (its depth or the kernel size of the final layer), it will not see the whole image at once.
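For anyone hitting the same error, a simple data-side workaround is to round the resize target up to the nearest multiple of 64 before putting it into the transform config. The snippet below is only a minimal sketch of that idea, not part of anomalib itself; the helper name round_up_to_multiple and the 4096 x 2160 example values are my own illustration.

from torchvision.transforms import v2


def round_up_to_multiple(value: int, multiple: int = 64) -> int:
    """Round an image dimension up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


# Example: a 4096 x 2160 source resolution becomes a 4096 x 2176 resize target,
# because 2160 / 64 = 33.75 while 2176 = 34 * 64.
height, width = 4096, 2160
size = [round_up_to_multiple(height), round_up_to_multiple(width)]  # [4096, 2176]

# Use the rounded size for both train_transform and eval_transform,
# equivalent to the Resize entries in the YAML config above.
resize = v2.Compose([v2.Resize(size=size, antialias=True)])

If you train at a resolution much larger than the model's default, the auto-encoder adaptation described above may still be needed so that it sees the whole image at once.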