nerd4code t1_jado5gs wrote
Reply to comment by RuairiSpain in PC GPU Shipments Drop 35% Year-over-Year in Q4 2022: Report by Stiven_Crysis
GPUs are, in general, way beyond overkill for NNs, which is what you're talking about. NNs can use the massive data-parallelism and linear-algebraic trickery offered by GPUs, but the data format you use tends to hit a sweet spot right around 8-bit floating-point, whereas video cards tend to focus on 16-bit and wider, usually with the ability to do 32-/64-bit floating-point and 32-/64-bit integers as well—units and busses which will, at the very least, eat power. Newer Nvidia cards do have tensor units attached so they can do 8-bit stuff without un- and re-packing, but that's a comparatively tiny afterthought in the card's design, and at least as far as I've seen, the tensor unit is usually shared between pairs of thread execution units.
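For a rough sense of what that 8-bit format looks like, here's a toy decoder sketch. This assumes an E4M3-style layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7); actual FP8 variants (e.g. the OCP spec) differ in NaN/Inf handling, so treat this as illustrative, not a reference implementation:

```c
/* Toy decode of an 8-bit float, assuming an E4M3-style layout:
   1 sign / 4 exponent / 3 mantissa bits, exponent bias 7.
   NaN/Inf handling omitted -- real FP8 specs differ here. */
#include <stdio.h>
#include <stdint.h>
#include <math.h>

static float fp8_e4m3_to_float(uint8_t b) {
    int sign = (b >> 7) & 1;
    int exp  = (b >> 3) & 0xF;  /* 4-bit biased exponent */
    int man  = b & 0x7;         /* 3-bit mantissa        */
    float v;
    if (exp == 0)               /* subnormal: no implicit leading 1 */
        v = ldexpf((float)man, 1 - 7 - 3);
    else                        /* normal: implicit leading 1 */
        v = ldexpf((float)(8 + man), exp - 7 - 3);
    return sign ? -v : v;
}

int main(void) {
    /* 0x40 = sign 0, biased exp 8 (unbiased 1), mantissa 0 -> 2.0 */
    printf("%f\n", fp8_e4m3_to_float(0x40));
    return 0;
}
```

The point being: the whole value fits in one byte, so you can pack 4x as many operands per bus transaction as fp32—if the hardware's lanes are actually sized for it.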
What you'd really want is to focus on, say, 32-bit integer add/sub/dereference and 8-bit floating-point multiply-accumulates (MACs) in their own, non-shared units/lanes, and any special acceleration you can do for convolution will help some also (see the sketch below). Which is why TPUs exist as a standalone thing.
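Here's the MAC pattern those dedicated units hardwire, in scalar C for illustration: narrow 8-bit operands feeding a wide 32-bit accumulator, which is the shape of a TPU-style dot product. I'm using int8 for simplicity; an fp8 MAC has the same narrow-in/wide-out structure with a float accumulator, and the function name is just made up for this sketch:

```c
/* Sketch of the multiply-accumulate pattern a dedicated 8-bit
   MAC unit performs per lane: 8-bit inputs, 32-bit accumulator.
   In hardware this runs across thousands of lanes at once. */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

static int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
    int32_t acc = 0;                            /* wide accumulator */
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];   /* 8x8 -> 32-bit MAC */
    return acc;
}

int main(void) {
    int8_t x[4] = { 1, -2,  3, 4 };
    int8_t w[4] = { 5,  6, -7, 8 };
    printf("%d\n", dot_i8(x, w, 4));  /* 5 - 12 - 21 + 32 = 4 */
    return 0;
}
```

A convolution is just many of these dot products over sliding windows, which is why hardware that does nothing but this—and skips the 32-/64-bit float machinery entirely—wins on power.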