Submitted by johnny0neal t3_zol9ie in singularity
archpawn t1_j0ogw06 wrote
Reply to comment by blueSGL in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
Right now, the AI is fundamentally just predicting text. If you had a superintelligent AI do text prediction, it would still act like someone of ordinary intelligence. But once you convince it that it's predicting what someone superintelligent would say, it would do that accurately.
I feel like the problem is that once it's smart enough to predict a superintelligent entity, it will also be smart enough to know that the text you're trying to continue wasn't actually written by one.
BlueWave177 t1_j0osqp4 wrote
I think you'd be surprised by how much of what humans do is just prediction based on past events/experience/sources, etc.
archpawn t1_j0oswhp wrote
I think you're missing the point of what I said. If we get this AI to be superintelligent, but it still has the goal of text prediction, then all it will do is give super-accurate predictions. It's not going to give super-smart results unless you ask it to predict what someone super smart would say, in which case it will be smart enough to predict it accurately.
BlueWave177 t1_j0ot11q wrote
Oh fair enough, I'd agree with that! I think I misunderstood you before.
tobi117 t1_j0otp4y wrote
According to Physics... all of it.
visarga t1_j0pakor wrote
> AI is fundamentally just predicting text
So it's a four-stage process. Each stage has its own dataset and produces its own emergent skill:
- stage 1 - next-word prediction; data: web text; skills: general knowledge, hard to control
- stage 2 - multi-task supervised training; data: 2000 NLP tasks; skills: learns to execute prompts at first sight, no longer rambles off topic
- stage 3 - training on code; data: GitHub + Stack Overflow + arXiv; skills: multi-step reasoning
- stage 4 - human preferences -> fine-tuning with reinforcement learning; data: collected by OpenAI with labellers; skills: the model obeys a set of rules and caters to human expectations (well behaved)
I don't think "pretend you're an AGI" is sufficient, it will just pretend but not be any smarter. What I think it needs is "closed loop testing" done on a massive scale. Generate 1 million coding problems, solve them with a language model, test the solutions, keep the correct ones, teach the model to write better code.
Do this same procedure for math, sciences where you can simulate the answer to test it, logic, practically any field that has a cheap way to test. Collect the data, retrain the model.
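A minimal sketch of that loop in Python, with toy arithmetic problems standing in for real coding tasks and a noisy guesser standing in for the language model (all names here are hypothetical, not anything from an actual training pipeline):

```python
import random


def closed_loop_round(generate_problem, propose_solution, check_solution, n_problems=1000):
    """Keep only the (problem, solution) pairs that pass automated checking."""
    kept = []
    for _ in range(n_problems):
        problem = generate_problem()
        solution = propose_solution(problem)
        if check_solution(problem, solution):
            kept.append((problem, solution))
    return kept


# Toy stand-ins: arithmetic problems with an exact checker, and a "model"
# that is only right some of the time. A real loop would use coding tasks
# with unit tests and an actual language model.
def make_problem():
    a, b = random.randint(0, 99), random.randint(0, 99)
    return {"prompt": f"{a} + {b} = ?", "answer": a + b}


def model_guess(problem):
    # Stand-in for the language model: correct ~70% of the time.
    return problem["answer"] if random.random() < 0.7 else problem["answer"] + 1


def check(problem, solution):
    return solution == problem["answer"]


dataset = closed_loop_round(make_problem, model_guess, check, n_problems=10_000)
# `dataset` now holds only verified pairs; the next step would be to
# fine-tune the model on them and repeat the loop.
```

The cheap, automated checker is what makes the loop work: only verified solutions survive, so the data the model is retrained on is better than what it can reliably produce on its own.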
This is the same approach taken by reinforcement learning: the agents create their own datasets. AlphaGo created its Go dataset by playing against itself, and it became better than the best human. AlphaTensor beat the best human-designed algorithms for matrix multiplication. That's the power of learning from a closed loop of testing: it can easily go superhuman.
The question is how we can enable the model to perform more experiments and learn from all that feedback.
archpawn t1_j0r7z6c wrote
> I don't think "pretend you're an AGI" is sufficient, it will just pretend but not be any smarter.
You're missing my point. Pretending can't make it smarter, but it can make it dumber. If we get a superintelligent text-prediction system, we'll still have to trick it into predicting someone superintelligent, or it will just pretend to be dumb.
EscapeVelocity83 t1_j0p9voa wrote
You can't predict human actions without monitoring their brains. If you do monitor their brains, the decision a person makes can be known by the computer maybe a second or so before the human realizes what they want.
EulersApprentice t1_j0rliik wrote
Not sure how that relates to what archpawn said?