Many thanks for the great work!
I was looking through the code and found that the optimizer (BnB 8-bit) uses FP32 parameters, FP32 gradients, and UINT8 first and second moments. Where in the optimizer pipeline is the FP8 dtype applied?
Also, could you share the block size used for quantizing the optimizer states?
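To make clear what I mean by block size, here is a minimal sketch of block-wise quantization to UINT8. It assumes a simple linear absmax mapping per block, which is my own simplification and may not match the quantization map the optimizer actually uses; the function names and the block size of 4 are hypothetical, chosen only for illustration.

```python
def quantize_blockwise(values, block_size):
    """Quantize floats to uint8 codes, storing one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # One scale per block; fall back to 1.0 for an all-zero block.
        absmax = max(abs(v) for v in block) or 1.0
        scales.append(absmax)
        # Map [-absmax, absmax] linearly onto the integer range [0, 255].
        codes.extend(round((v / absmax + 1.0) * 127.5) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size):
    """Invert quantize_blockwise, up to rounding error."""
    return [
        (c / 127.5 - 1.0) * scales[i // block_size]
        for i, c in enumerate(codes)
    ]


# Hypothetical optimizer state with two very different value ranges.
state = [0.01, -0.02, 0.005, 0.0, 3.0, -1.5, 2.0, 0.5]
codes, scales = quantize_blockwise(state, block_size=4)
restored = dequantize_blockwise(codes, scales, block_size=4)
```

In this sketch a smaller block size keeps the small values in the first block from sharing a scale with the large values in the second, which is why the exact block size matters for my question.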
Kind regards,