transpose?

Why do you need transpose here
        _s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)

and here:
torch.from_numpy(main_matrix).transpose(0,1) in def pad_batch

Thanks :)