
Tavrin t1_j9vl37m wrote

Flan-PaLM is 540B, so there's that.


maskedpaki t1_j9z7sxs wrote

Yes! The really big breakthrough here is that it's on par with the original GPT-3 at only 7 billion parameters on a bunch of benchmarks I've seen.


That means it's gotten 25x more efficient in the last 3 years.

I wonder how efficient these things can get. Are we going to see a model that's 280 million parameters rivaling the original GPT-3 in 2026, and an 11 million parameter one in 2029?
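The numbers above follow from a back-of-envelope extrapolation, assuming the original GPT-3's published size of 175B parameters and that the 25x gain repeats every 3 years (a big assumption, just for illustration):

```python
# Hypothetical extrapolation of the "25x more efficient every 3 years" guess.
gpt3_params = 175e9      # original GPT-3 (published size)
current_params = 7e9     # the 7B model matching it on benchmarks

factor = gpt3_params / current_params  # 25x efficiency gain over ~3 years

gen_2026 = current_params / factor  # if the trend repeats once: 280 million
gen_2029 = gen_2026 / factor        # and again: ~11.2 million

print(f"{factor:.0f}x, {gen_2026:.0f}, {gen_2029:.0f}")
```

That's where the 280M (2026) and ~11M (2029) figures come from; whether the trend actually holds is anyone's guess.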

3