cwhaley112

cwhaley112 t1_ispnv6f wrote on October 17, 2022 at 7:40 PM

Reply to comment by visarga in [R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GPT-3 and tripling the performance of T5-XXL on one-shot summarization. Public checkpoints! by Singularian2501

If you mean GPU memory, then 20B parameters * 2 bytes per parameter (assuming fp16) = 40 GB of VRAM, and that's just to hold the weights.
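
Here's that arithmetic as a quick sketch, in case anyone wants to plug in other precisions. The function name and the fp32/int8 rows are my own additions for comparison, not something from the paper, and it only counts the weights (no activations or other runtime overhead), using 1 GB = 1e9 bytes.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# 20B parameters, taken from the post title.
NUM_PARAMS = 20e9

# Bytes per parameter for a few common precisions.
# fp16 is the case from the comment; fp32 and int8 are assumed for comparison.
PRECISIONS = {"fp32": 4, "fp16": 2, "int8": 1}

for name, nbytes in PRECISIONS.items():
    print(f"{name}: {weight_memory_gb(NUM_PARAMS, nbytes):.0f} GB")
```

Actual usage will be higher than these numbers once you start running inference, so treat them as a lower bound.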