genshiryoku t1_j57dtsz wrote

Without going too deep into it: this is a symptom of transformer models. My argument was about why transformer models like GPT can't keep scaling up.

It has to do with the mathematics behind training AI. Essentially, the AI refines itself on every piece of data, but on copies of data it overcorrects, which results in inefficiency or worse performance. Synthetic data acts much like duplicate data: the model overcorrects and worsens its own performance.
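As a rough illustration of the duplicate-data point (a toy sketch, not how GPT is actually trained): the snippet below fits a single number to a small dataset with gradient descent. The model, the data values, and the `fit_mean` helper are all made up for the example. Repeating one example several times gives it more weight in the averaged gradient, so the fit gets pulled toward it.

```python
def fit_mean(data, lr=0.1, steps=500):
    """Fit one parameter by gradient descent on mean squared error."""
    theta = 0.0
    for _ in range(steps):
        # Every copy of a data point contributes its own term to the gradient,
        # so duplicated points are counted more than once.
        grad = sum(2 * (theta - x) for x in data) / len(data)
        theta -= lr * grad
    return theta

# Hypothetical data: five distinct examples, one of them an outlier.
unique = [1.0, 2.0, 3.0, 4.0, 10.0]
# The same data, but with the outlier copied five more times.
duplicated = unique + [10.0] * 5

theta_unique = fit_mean(unique)
theta_dup = fit_mean(duplicated)

print(f"fit on unique data:     {theta_unique:.3f}")
print(f"fit on duplicated data: {theta_dup:.3f}")
```

The second fit lands much closer to the repeated value even though no new information was added. This is only the simplest version of the effect; the one-parameter setup is an assumption chosen to keep it readable.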

If you are truly interested you can see for yourself here.

And yes, AI researchers are looking for models that can detect which data on the internet is synthetic, because it's inevitable that new data will be machine generated and can't be used to train on. If we fail at that task we might even enter an "AI dark age" where models get worse and worse over time because the internet will be filled with AI-generated garbage data that can't be trained on. That is the worst-case scenario.

4

Gohoyo t1_j57fu2a wrote

Thanks for trying to help me btw.

I watched the video. I can understand why reading its own data wouldn't work, but I can't understand why having it create a bunch of data, then altering the data, then giving it back to the AI wouldn't. The key here is that we have machines that can create data at superhuman speeds. There has to be some way to do something with that data to make it useful to the AI again, right?

1

genshiryoku t1_j57h1fb wrote

The "created data" is merely the AI mixing the training data in such a way that it "creates" something new. If the dataset is big enough this looks amazing, as if the AI is actually creative and creating new things, but from a mathematical perspective it's still just statistically somewhere in between the data it has already been trained on.

Therefore it would be the same as feeding it its own data. To us it looks like completely new and actually usable data, which is why ChatGPT is so exciting, but for AI training purposes it's useless.
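A cartoon of the "somewhere in between" idea, under heavy assumptions: the hypothetical `generate` function below stands in for a generative model by blending random pairs of training values. Real models do not literally average examples, so treat this only as a picture of the argument. Every "new" sample still falls inside the range the training data already covered.

```python
import random

def generate(train, n, seed=0):
    """Toy 'generator': blend random pairs of training values."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a, b = rng.sample(train, 2)   # pick two different training examples
        w = rng.random()              # mixing weight in [0, 1)
        samples.append(w * a + (1 - w) * b)
    return samples

# Hypothetical one-dimensional "training data".
train = [1.0, 2.5, 4.0, 7.0, 9.0]
synthetic = generate(train, 1000)

lo, hi = min(train), max(train)
eps = 1e-9  # small tolerance for floating-point rounding
inside = all(lo - eps <= s <= hi + eps for s in synthetic)

print("training range:", lo, "to", hi)
print("all synthetic samples inside that range:", inside)
```

The samples look different from any single training value, but retraining on them cannot teach the model anything outside the range it started with.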

1

Gohoyo t1_j57hihv wrote

If ChatGPT creates a paragraph and I then take that paragraph and alter it significantly, how is that new paragraph, never before seen by AI or humans, not new data for the AI?

2

genshiryoku t1_j57j6s1 wrote

It would be lower-quality data but still usable if significantly altered. The question is: why would you do this instead of just generating real data?

GPT is trained on human language; it needs real interaction to learn from, like the one we're having right now.

I'm also not saying that this isn't possible. We are AGI-level intelligences and we absolutely consumed less data over our lifetimes than GPT-3 did, so we know it's possible to reach AGI with relatively little data.

My original argument was merely that it's impossible with current transformer models like GPT and that we need another breakthrough in AI architecture to solve problems like this, not merely scale up current transformer models, because the training data is going to run out over the next couple of years as all of the internet will be used up.

0

Gohoyo t1_j57jyq4 wrote

> Why would you do this instead of just generating real data?

The idea would be that harnessing the AI's ability to create massive amounts of regurgitated old data quickly and then transmuting it into 'new data' somehow is faster than acquiring real data.

I mean, I believe you. I'm not in this field nor a genius, so if the top AI people are seeing it as a problem then I have to assume it really is; I just don't understand it fully.

1