
ObjectManagerManager t1_ivhzsot wrote

There are diminishing returns on data. It's difficult to get truly new data when you already have billions of data points, and it's difficult to improve a model when it's already very good.

So, like Moore's law, scaling will probably slow down eventually. At that point, the most significant developments will come from improving model efficiency rather than just making models bigger.
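To put a rough number on the diminishing-returns point, here's a toy Python sketch. It assumes the empirical power-law loss curve L(D) ≈ a * D^(-alpha) that scaling-law papers report; the constants a and alpha are made up purely for illustration:

```python
# Toy illustration of diminishing returns on data. Assumes the empirical
# power-law loss curve L(D) ~ a * D**(-alpha) reported in scaling-law
# papers; a and alpha are made-up constants, for illustration only.
a, alpha = 10.0, 0.1

def loss(num_examples: float) -> float:
    return a * num_examples ** (-alpha)

# Each 10x increase in data buys a smaller absolute improvement.
prev = loss(1e5)
for d in (1e6, 1e7, 1e8, 1e9):
    cur = loss(d)
    print(f"{d:.0e} examples -> loss {cur:.3f} (gain from last 10x: {prev - cur:.3f})")
    prev = cur
```

Every extra order of magnitude of data buys you less than the last one did, which is exactly the "hard to improve a model that's already very good" problem.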

Not to mention, models are made more efficient all the time. Sure, DALL-E 2 is huge, but it's actually smaller than the original DALL-E (roughly 3.5B parameters vs. 12B). And if you compare a model of a fixed size today against a model of the same size from just a couple of years ago, today's model will still win by a significant margin. Heck, you can definitely train a decent ImageNet-1K model on a hobby ML PC (e.g., an RTX graphics card, or something even cheaper if you can spare a few days with a small learning rate and batch size).

And inference takes much less time and memory than training: you can usually fix the batch size to 1, and you don't have to store a computational graph for a backward pass. A decade ago, all of this would have been much more difficult.
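On the inference point, here's a minimal PyTorch sketch (PyTorch, a CUDA card, and the placeholder model/shapes are all my assumptions here, not from the thread). A forward pass under torch.no_grad() doesn't keep the per-layer activations that a backward pass would need, and dropping the batch to 1 shrinks things further:

```python
import torch
import torch.nn as nn

# Arbitrary placeholder model; any reasonably deep stack shows the effect.
layers = []
for _ in range(8):
    layers += [nn.Linear(4096, 4096), nn.ReLU()]
model = nn.Sequential(*layers).cuda()

x = torch.randn(32, 4096, device="cuda")  # training-style batch

def peak_mib(fn):
    """Run fn and report peak CUDA memory in MiB since the reset."""
    torch.cuda.reset_peak_memory_stats()
    fn()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20

def train_step():
    # Forward + backward: autograd saves activations for every layer.
    model(x).sum().backward()
    model.zero_grad(set_to_none=True)  # free the gradient buffers again

@torch.no_grad()  # no graph, no saved activations
def infer_step():
    model(x[:1])  # batch size 1

print(f"train step peak:     {peak_mib(train_step):8.1f} MiB")
print(f"inference step peak: {peak_mib(infer_step):8.1f} MiB")
```

The gap grows with depth and batch size, which is why you can often serve a model on a card that could never train it.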
