
londons_explorer t1_j6al3tb wrote

>They were not able to find significant improvements with scaling anymore.

GPT-3 has a window size of 2048 tokens; ChatGPT has a window size of 8192 tokens. The compute cost is superlinear in context length, so I suspect the compute required for ChatGPT is at least 10x what GPT-3 used. And GPT-3 cost ~$12M USD at market rates (I assume they got a deep discount).

So I suspect they did scale compute as much as they could afford.
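For a rough sense of why the cost is superlinear: self-attention scales quadratically with context length, so a 4x longer window alone implies roughly 16x the attention compute per layer. A back-of-the-envelope sketch (my numbers and assumptions, not OpenAI's):

```python
# Hypothetical back-of-the-envelope: self-attention cost grows roughly
# quadratically with the context window, so a 4x window is ~16x the
# attention compute per layer.
gpt3_window = 2048
chatgpt_window = 8192

attention_ratio = (chatgpt_window / gpt3_window) ** 2
print(attention_ratio)  # 16.0
```

Total training cost also depends on model size and the number of training tokens, so the 16x figure covers only the attention term; a 10x overall floor seems plausible under these assumptions.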
