[R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GPT-3 and tripling the performance of T5-XXL on one-shot summarization. Public checkpoints! Submitted by Singularian2501 t3_y4tp4b on October 15, 2022 at 5:31 PM in MachineLearning 14 comments 190
visarga t1_isij2xr wrote on October 16, 2022 at 6:36 AM I'm wondering what the minimum hardware to run this model is. Is this really the portable alternative to GPT-3? Permalink 10
cwhaley112 t1_ispnv6f wrote on October 17, 2022 at 7:40 PM If you mean gpu, then 20B parameters * 2 bytes (assuming fp16) = 40GB VRAM. Permalink Parent 4
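The estimate above can be sketched as a small helper. This is a rough back-of-the-envelope calculation for the weights alone (the function name and structure are illustrative, not from the thread); real usage adds overhead for activations, the KV cache, and the framework itself.

```python
# Rough VRAM estimate for model weights only.
# Ignores activations, KV cache, optimizer state, and framework overhead.
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory in GB for n_params weights at the given precision."""
    return n_params * bytes_per_param / 1e9

# UL2: 20B parameters in fp16 (2 bytes per parameter)
print(weight_memory_gb(20e9, 2))  # 40.0
```

At fp32 (4 bytes per parameter) the same model would need roughly 80 GB just for the weights, which is why fp16 is assumed in the comment.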