Describe the bug
With the same training script, each step is slower with VSA than with Flash Attention:
[Screenshot: VSA]

When using VSA, each step takes 11.91 seconds.
[Screenshot: Flash Attention]

When using Flash Attention, each step takes 9.39 seconds.
Thank you very much. Looking forward to your reply.
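To put the gap in numbers, here is a quick calculation using only the two step times reported above. It is a minimal sketch: the variable and function names are illustrative and not part of FastVideo.

```python
# Step times reported in this issue (seconds per training step).
vsa_step_s = 11.91
flash_step_s = 9.39


def slowdown(slow: float, fast: float) -> tuple[float, float]:
    """Return (extra seconds per step, ratio slow/fast)."""
    return slow - fast, slow / fast


extra_s, ratio = slowdown(vsa_step_s, flash_step_s)
print(
    f"VSA is {extra_s:.2f} s/step slower "
    f"({(ratio - 1) * 100:.1f}% more time per step)"
)
```

So in this run VSA needs roughly a quarter more time per step than Flash Attention, where I expected it to be faster.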
Reproduction
examples/training/finetune/Wan2.1-VSA/Wan-Syn-Data/T2V-14B-VSA.slurm
Only the environment variable FASTVIDEO_ATTENTION_BACKEND was changed between the two runs.
Environment
GPU: L40S
CUDA: 12.8
Driver Version: 535.230.02