When I use quantized::conv2d in my model, I noticed that a quantized convolution layer still keeps its scale parameter as a floating-point value. My understanding is that this scale is used to requantize the accumulated GEMM output back to INT8 (sketched below). I would like to confirm:
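For context, here is a minimal sketch of what I mean. The layer sizes and numbers are made up, and the `requantize` helper is just my mental model of the step, not the actual kernel code (it assumes a standard x86 build where a quantized engine such as fbgemm is available):

```python
import torch

# The requantization scale on a quantized conv module is kept as a Python float.
qconv = torch.ao.nn.quantized.Conv2d(in_channels=1, out_channels=8, kernel_size=3)
print(type(qconv.scale), qconv.scale)   # <class 'float'> 1.0

# My mental model of how that scale is applied to each INT32 accumulator value:
def requantize(acc: int, s_in: float, s_w: float, s_out: float, z_out: int) -> int:
    multiplier = s_in * s_w / s_out       # combined scale -- a floating-point value?
    q = round(acc * multiplier) + z_out   # rescale and shift by the output zero point
    return max(-128, min(127, q))         # clamp to the signed INT8 range

print(requantize(acc=1234, s_in=0.02, s_w=0.01, s_out=0.05, z_out=0))  # -> 5
```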
Does the quantized convolution operator perform any floating-point computations internally, or is the entire operation carried out in pure INT8/INT32 arithmetic?