AiChip t1_j74ku5a wrote
Wow! This is huge! A 1B-parameter model beating a 175B-parameter model…
Lengador t1_j74ro7q wrote
That's the number in the headline, but if you look at the tables, their 223M-parameter model also beats the 175B-parameter model by a significant margin. That's roughly 0.1% of the size (223M / 175B ≈ 0.13%)! Absolutely insane.
HeyLittleTrain t1_j77w36w wrote
At what size could I run a model on a decent gaming PC?
emotionalfool123 t1_j78bjj2 wrote
Stable Diffusion is around 866M parameters, and it can be run on a 12 GB 3080.
7734128 t1_j9j9r06 wrote
And on my 8 GB GTX 1080.
Lengador t1_j78ovy2 wrote
You can (just) run a 1B-parameter model on a good gaming rig.
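As a rough sketch of why (weights only, ignoring activations and framework overhead, so treat these numbers as lower bounds, not measurements):

```python
def vram_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GiB needed to hold the weights alone.

    bytes_per_param: 4 for fp32, 2 for fp16/bf16, 1 for int8.
    """
    return n_params * bytes_per_param / 1024**3

for name, n in [("223M", 223e6), ("866M", 866e6), ("1B", 1e9), ("175B", 175e9)]:
    print(f"{name}: ~{vram_gb(n, 4):.1f} GiB fp32, ~{vram_gb(n, 2):.1f} GiB fp16")
```

At fp32, a 1B model already wants ~3.7 GiB just for the weights, so an 8–12 GB card only "just" fits once activations are on top; fp16 halves that.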
i2mi t1_j786bu0 wrote
Around 2M. Edit: the number I gave is completely delusional. Sorry.
HeyLittleTrain t1_j7avkil wrote
Your answer seems substantially different from the others.
NapkinsOnMyAnkle t1_j9jtolb wrote
I've trained 100M-parameter CNNs on my laptop's 6 GB 3070. So...
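Training needs a lot more memory than inference, though. A rough sketch assuming plain Adam in fp32 (weights + gradients + two optimizer moments ≈ 16 bytes/param, before activations; these are assumptions, not measurements):

```python
def train_vram_gb(n_params: float, bytes_per_param: int = 16) -> float:
    # ~16 B/param: 4 B weights + 4 B gradients + 8 B Adam moments (fp32).
    # Activation memory comes on top of this.
    return n_params * bytes_per_param / 1024**3

print(f"100M params: ~{train_vram_gb(100e6):.1f} GiB of states")  # ~1.5 GiB, fits in 6 GB
print(f"  1B params: ~{train_vram_gb(1e9):.1f} GiB of states")    # ~14.9 GiB, does not
```

Which is roughly why a 100M CNN trains comfortably on 6 GB while a 1B model won't without tricks like gradient checkpointing or 8-bit optimizers.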
JClub t1_jabyi76 wrote
GPT was never trained on image data, so why is this a fair comparison? The UnifiedQA model is from 2022, so that doesn't seem fair either. Why don't we have comparisons with other SOTA multimodal models, such as OFA or UniT?