May I ask why you omit the regression loss in DMD distillation?