ElvinRath t1_j56gy3g wrote

Either we need new architectures or more data.


Right now, even if we somehow turned all of YouTube into text, which could be done, there is just not enough data to efficiently train a 1T-parameter model.

And in text form alone, there is probably not enough even for 300B...


So, yeah, there is not enough data to keep scaling up.
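The back-of-the-envelope arithmetic behind this can be sketched with the Chinchilla rule of thumb (roughly 20 training tokens per parameter, from Hoffmann et al., 2022); the token requirements it produces are estimates under that assumption, not hard limits:

```python
# Chinchilla heuristic: compute-optimal training uses ~20 tokens per parameter.
# This is a rough rule of thumb, not an exact law.
TOKENS_PER_PARAM = 20

def optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training tokens for a model of `params` parameters."""
    return params * TOKENS_PER_PARAM

for params in (300e9, 1e12):
    print(f"{params / 1e9:.0f}B params -> ~{optimal_tokens(params) / 1e12:.0f}T tokens")
```

By this estimate a 1T-parameter model would want on the order of 20T training tokens, and even 300B parameters would want ~6T, which is why the comment doubts there is enough high-quality text available.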


It might be different with multimodal models; I don't know about that.

7