Destiny_Knight t1_j9iupzk wrote
Reply to comment by dangeratio in A German AI startup just might have a GPT-4 competitor this year. It is 300 billion parameters model by Dr_Singularity
What the actual fuck is that paper? The thing performed better than a human at several different question classes.
At fucking less than one billion parameters. That's 100x fewer than GPT-3.5's ~175 billion.
Edit: For clarity, I am impressed, not angry lol.
IluvBsissa t1_j9j5t08 wrote
Are you angry or impressed?
Destiny_Knight t1_j9j6iq0 wrote
impressed lol
IluvBsissa t1_j9j6v5v wrote
If these models are so smol and efficient, why are they not released? I just don't get it. I thought PaLM was kept private because it was too costly to run profitably...
kermunnist t1_j9kqsaw wrote
That's because the smaller models are less useful in general. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same broad data as GPT-3, they would perform 100+x worse on these metrics (maybe not exactly, because in this case the model was multimodal, which definitely gave it a performance advantage). The big reason this model performed so much better is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means the model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
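To make the point concrete, here's a minimal sketch (not the paper's actual training code) of what "fine-tuned on problems similar to the exam" looks like in practice: adapting a small seq2seq model to narrow, exam-style Q&A pairs. The model name and toy dataset below are illustrative assumptions, not anything from the paper.

```python
# Minimal sketch of narrow, in-domain fine-tuning (illustrative, not the
# paper's method). Uses a small open model as a stand-in for a sub-1B model.
import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

model_name = "google/flan-t5-small"  # ~80M params; stands in for a "small" model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy in-domain data: the narrow question style the comment contrasts with
# GPT-3's "anything and everything" training mix.
pairs = [
    ("What is the powerhouse of the cell?", "The mitochondrion."),
    ("What force keeps planets in orbit around the Sun?", "Gravity."),
]

class ExamQADataset(torch.utils.data.Dataset):
    """Tokenizes (question, answer) pairs for seq2seq fine-tuning."""
    def __init__(self, qa_pairs):
        self.items = []
        for question, answer in qa_pairs:
            enc = tokenizer(question, truncation=True, max_length=64)
            enc["labels"] = tokenizer(answer, truncation=True, max_length=64)["input_ids"]
            self.items.append(enc)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="exam-finetune", num_train_epochs=1),
    train_dataset=ExamQADataset(pairs),
    # Pads inputs and labels per batch so variable-length examples collate.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The trade-off described above falls straight out of this setup: a model trained like this can climb an exam benchmark cheaply, but nothing in the narrow data preserves the broad conversational ability a model like GPT-3.5 gets from its general training mix.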