Destiny_Knight t1_j9iupzk wrote
Reply to comment by dangeratio in A German AI startup just might have a GPT-4 competitor this year. It is 300 billion parameters model by Dr_Singularity
What the actual fuck is that paper? The thing performed better than a human at several different question classes.
At fucking less than one billion parameters. That's 100x fewer than GPT-3.5's ~175 billion.
Edit: For clarity, I am impressed, not angry lol.
IluvBsissa t1_j9j5t08 wrote
Are you angry or impressed?
Destiny_Knight t1_j9j6iq0 wrote
impressed lol
IluvBsissa t1_j9j6v5v wrote
If these models are so smol and efficient, why are they not released? I just don't get it. I thought PaLM was kept private because it was too costly to run profitably...
kermunnist t1_j9kqsaw wrote
That's because the smaller models are less useful in general. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same broad data as GPT-3, they would perform 100+x worse on these metrics (maybe not exactly, because in this case the model was multimodal, which definitely gave it a performance advantage). The big reason this model performed so much better is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means the model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
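To make the point concrete, here's a minimal sketch (not the paper's actual training code) of what "fine-tuned on problems similar to the exam" looks like in practice: adapting a small seq2seq model to narrow, exam-style Q&A pairs. The model name and toy dataset below are illustrative assumptions, not anything from the paper.

```python
# Minimal sketch of narrow, in-domain fine-tuning (illustrative, not the
# paper's method). Uses a small open model as a stand-in for a sub-1B model.
import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

model_name = "google/flan-t5-small"  # ~80M params; stands in for a "small" model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy in-domain data: the narrow question style the comment contrasts with
# GPT-3's "anything and everything" training mix.
pairs = [
    ("What is the powerhouse of the cell?", "The mitochondrion."),
    ("What force keeps planets in orbit around the Sun?", "Gravity."),
]

class ExamQADataset(torch.utils.data.Dataset):
    """Tokenizes (question, answer) pairs for seq2seq fine-tuning."""
    def __init__(self, qa_pairs):
        self.items = []
        for question, answer in qa_pairs:
            enc = tokenizer(question, truncation=True, max_length=64)
            enc["labels"] = tokenizer(answer, truncation=True, max_length=64)["input_ids"]
            self.items.append(enc)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="exam-finetune", num_train_epochs=1),
    train_dataset=ExamQADataset(pairs),
    # Pads inputs and labels per batch so variable-length examples collate.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The trade-off described above falls straight out of this setup: a model trained like this can climb an exam benchmark cheaply, but nothing in the narrow data preserves the broad conversational ability a model like GPT-3.5 gets from its general training mix.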