Submitted by Troll_of_the_bridge t3_y3zm5p in deeplearning
Karyo_Ten t1_iscf3fp wrote
There is no way you are using 64-bit on the GPU.
All the cuDNN code is 32-bit for the very simple reason that non-Tesla GPUs have only between 1/32 and 1/64 of the FP32 throughput in FP64.
See https://www.reddit.com/r/CUDA/comments/iyrhuq/comment/g93reth/
So under the hood your FP64 stuff is converted to FP32 when sent to the GPU.
And on Tesla GPUs the FP64:FP32 ratio is 1/2.
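If you want to see the gap for yourself, here is a rough sketch (assuming PyTorch with a CUDA build; the `bench` helper, matrix size, and iteration count are mine, picked just for illustration) that times an FP32 vs FP64 matmul and prints the achieved TFLOPS. On a GeForce card the FP64 number should come out far below the FP32 one:

```python
import time
import torch

def bench(dtype, n=4096, iters=20):
    # Square matmul in the requested precision on the GPU
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        c = a @ b
    torch.cuda.synchronize()
    elapsed = time.time() - start
    # A matmul of two n x n matrices is roughly 2*n^3 floating-point ops
    tflops = 2 * n**3 * iters / elapsed / 1e12
    print(f"{dtype}: {tflops:.2f} TFLOPS")

bench(torch.float32)
bench(torch.float64)
```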
Troll_of_the_bridge OP t1_iscoee4 wrote
I didn’t know this, thanks!