royalemate357 t1_j4migdx wrote on January 16, 2023 at 7:39 PM

Reply to comment by BeatLeJuce in [D] Tim Dettmers' GPU advice blog updated for 4000 series by init__27

TF32 is tensorfloat 32, which is a relatively new precision format for newer GPUs. Basically, when doing math, it uses the same number of mantissa as FP16 (10 bits), and the same number of exponent bits as normal float32 (8 bits). more on it here: https://blogs.nvidia.com/blog/2020/05/14/tensorfloat-32-precision-format/

BeatLeJuce t1_j4p938f wrote on January 17, 2023 at 8:13 AM

Thanks for the explanation? Why call it TF32 when it apperas to have 19 bits? (IIUC it's bfloat16 with 3 additional bits of mantissa?)

royalemate357 t1_j4qdfwj wrote on January 17, 2023 at 3:15 PM

Tbh I don't think it's an especially good name, but I believe the answer to your question is that it actually uses 32 bits to store a TF32 value in memory. its just that when they pass it into tensor cores to do matmuls, they temporarily downcast it to this 19-bit precision format.

>Dot product computation, which forms the building block for both matrix multiplies and convolutions, rounds FP32 inputs to TF32, computes the products without loss of precision, then accumulates those products into an FP32 output (Figure 1).

(from https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/)