[Bug] Wan2.1-14B-I2V VSA acceleration is not satisfactory

### Describe the bug

I ran Wan2.1-14B-I2V-720P with VSA at a high sparsity of 0.95. However, inference was only 3x faster on H20 and 2x faster on H100. Is this expected?
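For context, here is a rough Amdahl's-law sketch of the end-to-end speedup one might expect from sparse attention. It rests on two assumptions that are not from this report: a sparsity of `s` is taken to give at most a `1 / (1 - s)` speedup on attention alone (ignoring any overhead of the sparse kernel), and the share of runtime spent in attention (`attention_fraction`) is a hypothetical placeholder, not a measured value. The function names are illustrative and are not part of the VSA package.

```python
def ideal_attention_speedup(sparsity: float) -> float:
    """Upper bound on attention speedup if only (1 - sparsity) of the work remains.

    Assumption: kernel cost scales linearly with the kept fraction and the
    sparse kernel has no extra overhead. Real kernels will do worse.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    return 1.0 / (1.0 - sparsity)


def end_to_end_speedup(attention_fraction: float, attention_speedup: float) -> float:
    """Amdahl's law: only the attention share of the runtime gets faster."""
    if not 0.0 <= attention_fraction <= 1.0:
        raise ValueError("attention_fraction must be in [0, 1]")
    if attention_speedup <= 0.0:
        raise ValueError("attention_speedup must be positive")
    return 1.0 / ((1.0 - attention_fraction) + attention_fraction / attention_speedup)


if __name__ == "__main__":
    ideal = ideal_attention_speedup(0.95)
    # Hypothetical attention shares of total runtime; replace with profiled values.
    for fraction in (0.5, 0.7, 0.9):
        speedup = end_to_end_speedup(fraction, ideal)
        print(f"attention share {fraction:.0%}: end-to-end speedup {speedup:.2f}x")
```

Under these assumptions, an attention share of 0.7 would cap the end-to-end speedup at about 3x even with an ideal sparse kernel, so profiling the actual attention share on each GPU would help decide whether the observed numbers are reasonable.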

### Reproduction

Model: Wan2.1-14B-I2V-720P

### Environment

pytorch==2.8.0
vsa==0.03