Open
Description
Describe the bug
Hi devs,
Thanks for your great work! I'd like to propose a change to the design of sequence parallelism in the distillation pipeline. The relevant code in distillation_pipeline is:

    if self.sp_world_size > 1:
        noise = rearrange(noise,
                          "b (n t) c h w -> b n t c h w",
                          n=self.sp_world_size).contiguous()
        noise = noise[:, self.rank_in_sp_group, :, :, :, :]

This requires frame_num to be divisible by sp_world_size. However, sp_world_size is often equal to the number of GPUs, and frame_num is typically 21 in the Self Forcing framework. With 8 GPUs, for example, 21 is not divisible by 8, so the rearrange above fails.
I think a more practical approach would be to split along the token_length (i.e., t*h*w//(vae_stride**2)) instead of frame_num.
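
To illustrate the arithmetic behind this proposal, here is a minimal sketch in plain Python. It is not the repository's code: `shard_bounds` is a hypothetical helper, and the latent height, width, and spatial divisor are assumed values chosen only for illustration. It compares divisibility of the frame count against the token length, and shows one way to compute per-rank slices that also works when the length does not divide evenly.

```python
def shard_bounds(length: int, world_size: int, rank: int) -> tuple[int, int]:
    """Return the half-open range [start, end) owned by `rank` when `length`
    items are split as evenly as possible across `world_size` ranks.
    Hypothetical helper, not part of the repository."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    base, extra = divmod(length, world_size)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


# Values from the issue: 21 frames, 8 GPUs.
frame_num = 21
sp_world_size = 8

# Assumed values for illustration only (not taken from the repo's config).
latent_h, latent_w = 60, 104
spatial_factor = 4  # stands in for the vae_stride**2 term in the formula above

token_length = frame_num * latent_h * latent_w // spatial_factor

frames_divisible = frame_num % sp_world_size == 0
tokens_divisible = token_length % sp_world_size == 0

# Per-rank token ranges; uneven lengths are handled by shard_bounds as well.
bounds = [shard_bounds(token_length, sp_world_size, r) for r in range(sp_world_size)]

print(f"frames divisible: {frames_divisible}, tokens divisible: {tokens_divisible}")
print(f"rank 0 owns tokens {bounds[0]}, last rank owns tokens {bounds[-1]}")
```

Under these assumed shapes the token length splits evenly across 8 ranks while the frame count does not. For shapes where even the token length is not divisible, an uneven split like the one above (or padding) would still be needed, and the attention communication would have to account for the unequal shard sizes.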
Reproduction
bash distill/SFWan2.1-T2V/distill_dmd_t2v_1.3B.sh
Environment
8 GPUs
Linux