Submitted by Technologenesis t3_125wzvw in singularity

LLaMA proved that GPT-3/3.5-level performance can be squeezed out of relatively wimpy consumer hardware. But GPT-4 is much bigger than GPT-3, so it seems like even optimizing it by orders of magnitude might not be enough to achieve similar results. Is it plausible to expect GPT-4-level performance from consumer hardware in the near future?
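For scale, some back-of-envelope math on just holding the weights in memory. GPT-4's parameter count is not public, so the 1T figure below is a speculative placeholder, not a known number:

```python
# Bytes needed just to store the weights at different precisions.
# GPT-4's size is not public; "assumed 1T" is an illustrative guess only.
SIZES = {
    "LLaMA-7B": 7e9,
    "LLaMA-65B": 65e9,
    "GPT-3 (175B)": 175e9,
    "GPT-4 (assumed 1T)": 1e12,  # speculative placeholder
}

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for name, n_params in SIZES.items():
    row = ", ".join(
        f"{fmt}: {n_params * b / 2**30:,.0f} GiB"
        for fmt, b in BYTES_PER_PARAM.items()
    )
    print(f"{name:<20} {row}")
```

Even at 4-bit, a 175B model needs ~80 GiB for weights alone, well beyond a single consumer GPU, which is what makes the question above non-trivial.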

3

Comments


UseNew5079 t1_je6lgyb wrote

Maybe a 7B model can get GPT-4-level performance if trained for a _very_ long time. The Facebook LLaMA paper showed that performance kept improving right up to the end of training, with no sign of a plateau. Maybe it's just very inefficient but still possible? Or maybe there is another way.
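A quick way to sanity-check this is the loss model from the Chinchilla paper (Hoffmann et al., 2022), which fits loss as a function of parameters N and training tokens D. The constants below are the published fit; treat the exact numbers as approximate:

```python
# Chinchilla-style loss model (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# N = parameters, D = training tokens. Constants are the paper's fit.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

# A 7B model keeps improving with more tokens, but even with unlimited
# data it can never go below its parameter-bound floor E + A / N**alpha.
for d in (1e12, 4e12, 16e12):
    print(f"7B @ {d:.0e} tokens: loss ~ {loss(7e9, d):.3f}")
print(f"7B floor (D -> infinity): {E + A / 7e9**alpha:.3f}")
```

Under that model, "train much longer" keeps helping, but it asymptotes toward a floor set by parameter count, so whether a 7B model can reach GPT-4 territory depends on where that floor sits relative to GPT-4's loss.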

2

Akimbo333 t1_je9proo wrote

Why does performance increase with more training instead of more parameters?

1

Scarlet_pot2 t1_je9aq1r wrote

Let's find out: train a small model and fine-tune it on GPT-3 / 3.5 / 4 outputs.
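A minimal sketch of that idea, in the spirit of Alpaca-style distillation: take a small base model and fine-tune it on instruction/response pairs generated by a stronger GPT model. The base model choice, file name, JSONL schema, and hyperparameters here are placeholder assumptions, not a tested recipe:

```python
# Fine-tune a small causal LM on GPT-generated instruction/response pairs.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "EleutherAI/pythia-1.4b"  # any small causal LM works here
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical JSONL schema: {"prompt": ..., "response": ...}
ds = load_dataset("json", data_files="gpt4_pairs.jsonl")["train"]

def to_tokens(ex):
    text = (f"### Instruction:\n{ex['prompt']}\n\n"
            f"### Response:\n{ex['response']}")
    return tok(text, truncation=True, max_length=512)

ds = ds.map(to_tokens, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilled", num_train_epochs=3,
                           per_device_train_batch_size=4,
                           learning_rate=2e-5),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```

Worth noting this teaches the small model to imitate the bigger one's outputs (style and surface behavior), which is not the same thing as matching its underlying capability on hard tasks.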

1