[Bug]: Shape mismatch: EfficientAD #2769
Describe the bug
I am facing a very strange issue: a shape mismatch error ("RuntimeError: shape mismatch: value tensor of shape [16778] cannot be broadcast ...") when using custom sizes for my own dataset. Note: this happens only when the training dataset is very large (>1000 images).

Dataset
Custom

Model
EfficientAD

Steps to reproduce the behaviour:
OS information:
Expected behavior
The model must train successfully, or the issue must be reproducible for a smaller dataset.

Screenshots
No response

Pip/GitHub
pip

What version/branch did you use?
24.2

Configuration YAML
# anomalib==1.2.0
seed_everything: true
trainer:
  accelerator: auto
  strategy: auto
  devices: 1
  num_nodes: 1
  precision: null
  logger:
    class_path: anomalib.loggers.AnomalibTensorBoardLogger
    init_args:
      save_dir: tb_logs
      name: EffAd-S_expI
      version: 11
      log_graph: false
      default_hp_metric: true
      prefix: ''
      sub_dir: null
      comment: ''
      purge_step: null
      max_queue: 10
      flush_secs: 120
      filename_suffix: ''
  callbacks:
  - class_path: lightning.pytorch.callbacks.ModelCheckpoint
    init_args:
      dirpath: results/EfficientAd/sample/v0/checkpoints
      filename: epoch_{epoch:03d}
      monitor: image_F1Score
      verbose: true
      save_last: null
      save_top_k: 3
      save_weights_only: false
      mode: max
      auto_insert_metric_name: false
      every_n_train_steps: null
      train_time_interval: null
      every_n_epochs: null
      save_on_train_epoch_end: null
      enable_version_counter: true
  - class_path: lightning.pytorch.callbacks.EarlyStopping
    init_args:
      monitor: image_AUPR
      min_delta: 0.005
      patience: 5
      verbose: true
      mode: max
      strict: true
      check_finite: true
      stopping_threshold: null
      divergence_threshold: null
      check_on_train_epoch_end: null
      log_rank_zero_only: false
  fast_dev_run: false
  max_epochs: 25
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: null
  limit_val_batches: null
  limit_test_batches: null
  limit_predict_batches: null
  overfit_batches: 0.0
  val_check_interval: null
  check_val_every_n_epoch: 1
  num_sanity_val_steps: null
  log_every_n_steps: null
  enable_checkpointing: null
  enable_progress_bar: null
  enable_model_summary: null
  accumulate_grad_batches: 1
  gradient_clip_val: null
  gradient_clip_algorithm: null
  deterministic: null
  benchmark: null
  inference_mode: true
  use_distributed_sampler: true
  profiler: null
  detect_anomaly: false
  barebones: false
  plugins: null
  sync_batchnorm: false
  reload_dataloaders_every_n_epochs: 0
optimizer: null
lr_scheduler: null
model:
  class_path: anomalib.models.EfficientAd
  init_args:
    imagenet_dir: datasets/imagenette
    teacher_out_channels: 384
    model_size: S
    lr: 0.0001
    weight_decay: 1.0e-05
    padding: false
    pad_maps: true
data:
  class_path: anomalib.data.Folder
  init_args:
    name: sample
    normal_dir: train/good
    root: datasets/sample
    abnormal_dir: val/anomaly
    normal_test_dir: val/good
    mask_dir: null
    normal_split_ratio: 0.0
    extensions:
    - .jpg
    train_batch_size: 1
    eval_batch_size: 1
    num_workers: 0
    image_size: null
    transform: null
    train_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
        - class_path: torchvision.transforms.v2.Resize
          init_args:
            size:
            - 1024
            - 1024
            interpolation: BILINEAR
            max_size: null
            antialias: true
    eval_transform:
      class_path: torchvision.transforms.v2.Compose
      init_args:
        transforms:
        - class_path: torchvision.transforms.v2.Resize
          init_args:
            size:
            - 1024
            - 1024
            interpolation: BILINEAR
            max_size: null
            antialias: true
    test_split_mode: FROM_DIR
    test_split_ratio: 0.2
    val_split_mode: same_as_test
    val_split_ratio: 0.5
    seed: null
normalization:
  normalization_method: MIN_MAX
task: classification
metrics:
  image:
  - AUROC
  - F1Score
  - F1Max
  - AUPR
  pixel: null
  threshold:
    class_path: anomalib.metrics.F1AdaptiveThreshold
    init_args:
      default_value: 0.5
      thresholds: null
      ignore_index: null
      validate_args: true
      compute_on_cpu: false
      dist_sync_on_step: false
      sync_on_compute: true
      compute_with_cache: true
logging:
  log_graph: false
default_root_dir: results
ckpt_path: null

Logs
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
WARNING:anomalib.utils.config:Anomalib currently does not support multi-gpu training. Setting devices to 1.
INFO:anomalib.models.components.base.anomaly_module:Initializing EfficientAd model.
INFO:anomalib.callbacks:Loading the callbacks
INFO:anomalib.engine.engine:Overriding num_sanity_val_steps from None with 0 for EfficientAd
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
WARNING:anomalib.metrics.f1_score:F1Score class exists for backwards compatibility. It will be removed in v1.1. Please use BinaryF1Score from torchmetrics instead
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | EfficientAdModel | 8.1 M | train
1 | _transform | Compose | 0 | train
2 | normalization_metrics | MetricCollection | 0 | train
3 | image_threshold | F1AdaptiveThreshold | 0 | train
4 | pixel_threshold | F1AdaptiveThreshold | 0 | train
5 | image_metrics | AnomalibMetricCollection | 0 | train
6 | pixel_metrics | AnomalibMetricCollection | 0 | train
---------------------------------------------------------------------------
8.1 M Trainable params
0 Non-trainable params
8.1 M Total params
32.235 Total estimated model params size (MB)
47 Modules in train mode
7 Modules in eval mode
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=10` in the `DataLoader` to improve performance.
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=10` in the `DataLoader` to improve performance.
Training: | | 0/? [00:00<?, ?it/s]INFO:anomalib.models.image.efficient_ad.lightning_model:Load pretrained teacher model from pre_trained/efficientad_pretrained_weights/pretrained_teacher_small.pth
/Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torchvision/transforms/v2/_deprecated.py:42: UserWarning: The transform `ToTensor()` is deprecated and will be removed in a future release. Instead, please use `v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True)])`.Output is equivalent up to float precision.
warnings.warn(
Calculate teacher channel mean & std: 100%|██████████| 3830/3830 [11:10<00:00, 5.72it/s]
Epoch 0: 87%|████████▋ | 3331/3830 [2:55:32<26:17, 0.32it/s, v_num=kl_0, train_st_step=0.785, train_ae_step=0.169, train_stae_step=0.0339, train_loss_step=0.987]╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /Users/deya/PycharmProjects/Experiments_AD/anomaly_detection_anomalib/run_tr │
│ ain.py:10 in <module> │
│ │
│ 7 if __name__ == "__main__": │
│ 8 │ """Fetch the engine, model and datamodule from the config, and trai │
│ 9 │ engine, model, datamodule = Engine.from_config(CONFIG_PATH) │
│ ❱ 10 │ engine.train(datamodule=datamodule, model=model) │
│ 11 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/anomalib/engi │
│ ne/engine.py:871 in train │
│ │
│ 868 │ │ │ # if the model is zero-shot or few-shot, we only need to │
│ 869 │ │ │ self.trainer.validate(model, val_dataloaders, None, verbo │
│ 870 │ │ else: │
│ ❱ 871 │ │ │ self.trainer.fit(model, train_dataloaders, val_dataloader │
│ 872 │ │ self.trainer.test(model, test_dataloaders, ckpt_path=ckpt_pat │
│ 873 │ │
│ 874 │ def export( │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:538 in fit │
│ │
│ 535 │ │ self.state.fn = TrainerFn.FITTING │
│ 536 │ │ self.state.status = TrainerStatus.RUNNING │
│ 537 │ │ self.training = True │
│ ❱ 538 │ │ call._call_and_handle_interrupt( │
│ 539 │ │ │ self, self._fit_impl, model, train_dataloaders, val_datal │
│ 540 │ │ ) │
│ 541 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:47 in _call_and_handle_interrupt │
│ │
│ 44 │ try: │
│ 45 │ │ if trainer.strategy.launcher is not None: │
│ 46 │ │ │ return trainer.strategy.launcher.launch(trainer_fn, *args, │
│ ❱ 47 │ │ return trainer_fn(*args, **kwargs) │
│ 48 │ │
│ 49 │ except _TunerExitException: │
│ 50 │ │ _call_teardown_hook(trainer) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:574 in _fit_impl │
│ │
│ 571 │ │ │ model_provided=True, │
│ 572 │ │ │ model_connected=self.lightning_module is not None, │
│ 573 │ │ ) │
│ ❱ 574 │ │ self._run(model, ckpt_path=ckpt_path) │
│ 575 │ │ │
│ 576 │ │ assert self.state.stopped │
│ 577 │ │ self.training = False │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:981 in _run │
│ │
│ 978 │ │ # ---------------------------- │
│ 979 │ │ # RUN THE TRAINER │
│ 980 │ │ # ---------------------------- │
│ ❱ 981 │ │ results = self._run_stage() │
│ 982 │ │ │
│ 983 │ │ # ---------------------------- │
│ 984 │ │ # POST-Training CLEAN UP │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/trainer.py:1025 in _run_stage │
│ │
│ 1022 │ │ │ with isolate_rng(): │
│ 1023 │ │ │ │ self._run_sanity_check() │
│ 1024 │ │ │ with torch.autograd.set_detect_anomaly(self._detect_anoma │
│ ❱ 1025 │ │ │ │ self.fit_loop.run() │
│ 1026 │ │ │ return None │
│ 1027 │ │ raise RuntimeError(f"Unexpected state {self.state}") │
│ 1028 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/fit_loop.py:205 in run │
│ │
│ 202 │ │ while not self.done: │
│ 203 │ │ │ try: │
│ 204 │ │ │ │ self.on_advance_start() │
│ ❱ 205 │ │ │ │ self.advance() │
│ 206 │ │ │ │ self.on_advance_end() │
│ 207 │ │ │ │ self._restarting = False │
│ 208 │ │ │ except StopIteration: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/fit_loop.py:363 in advance │
│ │
│ 360 │ │ │ ) │
│ 361 │ │ with self.trainer.profiler.profile("run_training_epoch"): │
│ 362 │ │ │ assert self._data_fetcher is not None │
│ ❱ 363 │ │ │ self.epoch_loop.run(self._data_fetcher) │
│ 364 │ │
│ 365 │ def on_advance_end(self) -> None: │
│ 366 │ │ trainer = self.trainer │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/training_epoch_loop.py:140 in run │
│ │
│ 137 │ │ self.on_run_start(data_fetcher) │
│ 138 │ │ while not self.done: │
│ 139 │ │ │ try: │
│ ❱ 140 │ │ │ │ self.advance(data_fetcher) │
│ 141 │ │ │ │ self.on_advance_end(data_fetcher) │
│ 142 │ │ │ │ self._restarting = False │
│ 143 │ │ │ except StopIteration: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/training_epoch_loop.py:250 in advance │
│ │
│ 247 │ │ │ with trainer.profiler.profile("run_training_batch"): │
│ 248 │ │ │ │ if trainer.lightning_module.automatic_optimization: │
│ 249 │ │ │ │ │ # in automatic optimization, there can only be one │
│ ❱ 250 │ │ │ │ │ batch_output = self.automatic_optimization.run(tra │
│ 251 │ │ │ │ else: │
│ 252 │ │ │ │ │ batch_output = self.manual_optimization.run(kwargs │
│ 253 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:190 in run │
│ │
│ 187 │ │ # ------------------------------ │
│ 188 │ │ # gradient update with accumulated gradients │
│ 189 │ │ else: │
│ ❱ 190 │ │ │ self._optimizer_step(batch_idx, closure) │
│ 191 │ │ │
│ 192 │ │ result = closure.consume_result() │
│ 193 │ │ if result.loss is None: │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:268 in _optimizer_step │
│ │
│ 265 │ │ │ self.optim_progress.optimizer.step.increment_ready() │
│ 266 │ │ │
│ 267 │ │ # model hook │
│ ❱ 268 │ │ call._call_lightning_module_hook( │
│ 269 │ │ │ trainer, │
│ 270 │ │ │ "optimizer_step", │
│ 271 │ │ │ trainer.current_epoch, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:167 in _call_lightning_module_hook │
│ │
│ 164 │ pl_module._current_fx_name = hook_name │
│ 165 │ │
│ 166 │ with trainer.profiler.profile(f"[LightningModule]{pl_module.__clas │
│ ❱ 167 │ │ output = fn(*args, **kwargs) │
│ 168 │ │
│ 169 │ # restore current_fx when nested context │
│ 170 │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/module.py:1306 in optimizer_step │
│ │
│ 1303 │ │ │ │ # Add your custom logic to run directly after `optimi │
│ 1304 │ │ │
│ 1305 │ │ """ │
│ ❱ 1306 │ │ optimizer.step(closure=optimizer_closure) │
│ 1307 │ │
│ 1308 │ def optimizer_zero_grad(self, epoch: int, batch_idx: int, optimiz │
│ 1309 │ │ """Override this method to change the default behaviour of `` │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/optimizer.py:153 in step │
│ │
│ 150 │ │ │ raise MisconfigurationException("When `optimizer.step(clos │
│ 151 │ │ │
│ 152 │ │ assert self._strategy is not None │
│ ❱ 153 │ │ step_output = self._strategy.optimizer_step(self._optimizer, c │
│ 154 │ │ │
│ 155 │ │ self._on_after_step() │
│ 156 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/strategies/strategy.py:238 in optimizer_step │
│ │
│ 235 │ │ model = model or self.lightning_module │
│ 236 │ │ # TODO(fabric): remove assertion once strategy's optimizer_ste │
│ 237 │ │ assert isinstance(model, pl.LightningModule) │
│ ❱ 238 │ │ return self.precision_plugin.optimizer_step(optimizer, model=m │
│ 239 │ │
│ 240 │ def _setup_model_and_optimizers(self, model: Module, optimizers: L │
│ 241 │ │ """Setup a model and multiple optimizers together. │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:122 in optimizer_step │
│ │
│ 119 │ ) -> Any: │
│ 120 │ │ """Hook to run the optimizer step.""" │
│ 121 │ │ closure = partial(self._wrap_closure, model, optimizer, closur │
│ ❱ 122 │ │ return optimizer.step(closure=closure, **kwargs) │
│ 123 │ │
│ 124 │ def _clip_gradients( │
│ 125 │ │ self, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/l │
│ r_scheduler.py:75 in wrapper │
│ │
│ 72 │ │ │ │ instance = instance_ref() │
│ 73 │ │ │ │ instance._step_count += 1 │
│ 74 │ │ │ │ wrapped = func.__get__(instance, cls) │
│ ❱ 75 │ │ │ │ return wrapped(*args, **kwargs) │
│ 76 │ │ │ │
│ 77 │ │ │ # Note that the returned function here is no longer a bou │
│ 78 │ │ │ # so attributes like `__func__` and `__self__` no longer │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/o │
│ ptimizer.py:391 in wrapper │
│ │
│ 388 │ │ │ │ │ │ │ │ f"{func} must return None or a tuple o │
│ 389 │ │ │ │ │ │ │ ) │
│ 390 │ │ │ │ │
│ ❱ 391 │ │ │ │ out = func(*args, **kwargs) │
│ 392 │ │ │ │ self._optimizer_step_code() │
│ 393 │ │ │ │ │
│ 394 │ │ │ │ # call optimizer step post hooks │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/o │
│ ptimizer.py:76 in _use_grad │
│ │
│ 73 │ │ │ # see https://github.com/pytorch/pytorch/issues/104053 │
│ 74 │ │ │ torch.set_grad_enabled(self.defaults['differentiable']) │
│ 75 │ │ │ torch._dynamo.graph_break() │
│ ❱ 76 │ │ │ ret = func(self, *args, **kwargs) │
│ 77 │ │ finally: │
│ 78 │ │ │ torch._dynamo.graph_break() │
│ 79 │ │ │ torch.set_grad_enabled(prev_grad) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/optim/a │
│ dam.py:148 in step │
│ │
│ 145 │ │ loss = None │
│ 146 │ │ if closure is not None: │
│ 147 │ │ │ with torch.enable_grad(): │
│ ❱ 148 │ │ │ │ loss = closure() │
│ 149 │ │ │
│ 150 │ │ for group in self.param_groups: │
│ 151 │ │ │ params_with_grad = [] │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:108 in _wrap_closure │
│ │
│ 105 │ │ consistent with the ``Precision`` subclasses that cannot pass │
│ 106 │ │ │
│ 107 │ │ """ │
│ ❱ 108 │ │ closure_result = closure() │
│ 109 │ │ self._after_closure(model, optimizer) │
│ 110 │ │ return closure_result │
│ 111 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:144 in __call__ │
│ │
│ 141 │ │
│ 142 │ @override │
│ 143 │ def __call__(self, *args: Any, **kwargs: Any) -> Optional[Tensor]: │
│ ❱ 144 │ │ self._result = self.closure(*args, **kwargs) │
│ 145 │ │ return self._result.loss │
│ 146 │
│ 147 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/utils/_ │
│ contextlib.py:115 in decorate_context │
│ │
│ 112 │ @functools.wraps(func) │
│ 113 │ def decorate_context(*args, **kwargs): │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │ │
│ 117 │ return decorate_context │
│ 118 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:138 in closure │
│ │
│ 135 │ │ │ self._zero_grad_fn() │
│ 136 │ │ │
│ 137 │ │ if self._backward_fn is not None and step_output.closure_loss │
│ ❱ 138 │ │ │ self._backward_fn(step_output.closure_loss) │
│ 139 │ │ │
│ 140 │ │ return step_output │
│ 141 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/loops/optimization/automatic.py:239 in backward_fn │
│ │
│ 236 │ │ │ return None │
│ 237 │ │ │
│ 238 │ │ def backward_fn(loss: Tensor) -> None: │
│ ❱ 239 │ │ │ call._call_strategy_hook(self.trainer, "backward", loss, o │
│ 240 │ │ │
│ 241 │ │ return backward_fn │
│ 242 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/trainer/call.py:319 in _call_strategy_hook │
│ │
│ 316 │ │ return None │
│ 317 │ │
│ 318 │ with trainer.profiler.profile(f"[Strategy]{trainer.strategy.__clas │
│ ❱ 319 │ │ output = fn(*args, **kwargs) │
│ 320 │ │
│ 321 │ # restore current_fx when nested context │
│ 322 │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/strategies/strategy.py:212 in backward │
│ │
│ 209 │ │ assert self.lightning_module is not None │
│ 210 │ │ closure_loss = self.precision_plugin.pre_backward(closure_loss │
│ 211 │ │ │
│ ❱ 212 │ │ self.precision_plugin.backward(closure_loss, self.lightning_mo │
│ 213 │ │ │
│ 214 │ │ closure_loss = self.precision_plugin.post_backward(closure_los │
│ 215 │ │ self.post_backward(closure_loss) │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/plugins/precision/precision.py:72 in backward │
│ │
│ 69 │ │ │ \**kwargs: Keyword arguments for the same purpose as ``*ar │
│ 70 │ │ │
│ 71 │ │ """ │
│ ❱ 72 │ │ model.backward(tensor, *args, **kwargs) │
│ 73 │ │
│ 74 │ @override │
│ 75 │ def post_backward(self, tensor: Tensor, module: "pl.LightningModul │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/lightning/pyt │
│ orch/core/module.py:1101 in backward │
│ │
│ 1098 │ │ if self._fabric: │
│ 1099 │ │ │ self._fabric.backward(loss, *args, **kwargs) │
│ 1100 │ │ else: │
│ ❱ 1101 │ │ │ loss.backward(*args, **kwargs) │
│ 1102 │ │
│ 1103 │ def toggle_optimizer(self, optimizer: Union[Optimizer, LightningO │
│ 1104 │ │ """Makes sure only the gradients of the current optimizer's p │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/_tensor │
│ .py:525 in backward │
│ │
│ 522 │ │ │ │ create_graph=create_graph, │
│ 523 │ │ │ │ inputs=inputs, │
│ 524 │ │ │ ) │
│ ❱ 525 │ │ torch.autograd.backward( │
│ 526 │ │ │ self, gradient, retain_graph, create_graph, inputs=inputs │
│ 527 │ │ ) │
│ 528 │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/autogra │
│ d/__init__.py:267 in backward │
│ │
│ 264 │ # The reason we repeat the same comment below is that │
│ 265 │ # some Python versions print out the first line of a multi-line fu │
│ 266 │ # calls in the traceback and some print out the last line │
│ ❱ 267 │ _engine_run_backward( │
│ 268 │ │ tensors, │
│ 269 │ │ grad_tensors_, │
│ 270 │ │ retain_graph, │
│ │
│ /Users/deya/miniconda3/envs/py310/lib/python3.10/site-packages/torch/autogra │
│ d/graph.py:744 in _engine_run_backward │
│ │
│ 741 │ if attach_logging_hooks: │
│ 742 │ │ unregister_hooks = _register_logging_hooks_on_whole_graph(t_ou │
│ 743 │ try: │
│ ❱ 744 │ │ return Variable._execution_engine.run_backward( # Calls into │
│ 745 │ │ │ t_outputs, *args, **kwargs │
│ 746 │ │ ) # Calls into the C++ engine to run the backward pass │
│ 747 │ finally: │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: shape mismatch: value tensor of shape [16778] cannot be broadcast
to indexing result of shape [17823]
Process finished with exit code 1
Any help would be appreciated, as I have been stuck on this issue for a long time.
Replies: 1 comment
Actually, the image dimensions must be evenly divisible by 64.
Thus, instead of [4096, 2160], it would be better to use [4096, 2176] (2160 / 64 = 33.75, whereas 2176 = 34 * 64).
In my own code I also adapt the auto-encoder ("class Encoder(nn.Module):" in "models/image/efficient_ad/torch_model.py").
If you use larger images and do not adapt the auto-encoder (its depth or the kernel size of the final layer), it will not see the whole image at once.
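For anyone hitting the same error, a simple data-side workaround is to round the resize target up to the nearest multiple of 64 before putting it into the transform config. The snippet below is only a minimal sketch of that idea, not part of anomalib itself; the helper name round_up_to_multiple and the 4096 x 2160 example values are my own illustration.

from torchvision.transforms import v2


def round_up_to_multiple(value: int, multiple: int = 64) -> int:
    """Round an image dimension up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


# Example: a 4096 x 2160 source resolution becomes a 4096 x 2176 resize target,
# because 2160 / 64 = 33.75 while 2176 = 34 * 64.
height, width = 4096, 2160
size = [round_up_to_multiple(height), round_up_to_multiple(width)]  # [4096, 2176]

# Use the rounded size for both train_transform and eval_transform,
# equivalent to the Resize entries in the YAML config above.
resize = v2.Compose([v2.Resize(size=size, antialias=True)])

If you train at a resolution much larger than the model's default, the auto-encoder adaptation described above may still be needed so that it sees the whole image at once.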