Branch/Tag/Commit
main
Docker Image Version
n/a
GPU name
n/a
CUDA Driver
n/a
Reproduced Steps
https://github.com/NVIDIA/FasterTransformer/blob/c6e8f60ec40da218804a60e6aa986903e7fa8594/src/fastertransformer/layers/attention_layers/GptContextAttentionLayer.cc#L568-L582