ElvinRath t1_j56gy3g wrote

Either we need new architectures or more data.


Right now, even if we somehow turned all of YouTube into text, which could be done, there is just not enough data to efficiently train a 1T-parameter model.

And in text form alone, there is probably not enough even for 300B...


So, yeah, there is not enough data to keep scaling up.
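The back-of-the-envelope arithmetic behind this can be sketched with the Chinchilla rule of thumb (roughly 20 training tokens per parameter, from Hoffmann et al., 2022); the token requirements it produces are estimates under that assumption, not hard limits:

```python
# Chinchilla heuristic: compute-optimal training uses ~20 tokens per parameter.
# This is a rough rule of thumb, not an exact law.
TOKENS_PER_PARAM = 20

def optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training tokens for a model of `params` parameters."""
    return params * TOKENS_PER_PARAM

for params in (300e9, 1e12):
    print(f"{params / 1e9:.0f}B params -> ~{optimal_tokens(params) / 1e12:.0f}T tokens")
```

By this estimate a 1T-parameter model would want on the order of 20T training tokens, and even 300B parameters would want ~6T, which is why the comment doubts there is enough high-quality text available.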


It might be different with multimodal models; I don't know about that.

7