visarga t1_it6nwso wrote
Reply to comment by ftc1234 in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
> But can it reason by itself without seeing pattern ahead of time? Can it distinguish between the quality of the results it generates? Can it have an opinion that’s not in the mean of the output probability distribution?
Yes, it's only gradually ramping up, but there is a concept of learning from verification. For example, AlphaGo learned from self-play, where it was trivial to verify who won each game. In math you can plug the solution back in to verify it, in code you can run it or apply test-driven feedback, and in robotics you can run sims and learn from the outcomes.
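Here's a rough sketch of what I mean for the code case, with everything model-related left as a placeholder rather than a real API: sample candidate solutions, run them against tests, and keep only the verified ones as a training signal.

```python
# Sketch: verify generated code against tests, keep only verified samples.
# `model_generate` is a stand-in for whatever LLM call you actually use.

def model_generate(prompt: str) -> str:
    """Placeholder for an LLM call that returns a candidate solution."""
    raise NotImplementedError

def passes_tests(code: str, tests: list[tuple[tuple, object]]) -> bool:
    """Run the candidate's `solve` function against (args, expected) pairs."""
    namespace = {}
    try:
        exec(code, namespace)               # candidate must define `solve`
        solve = namespace["solve"]
        return all(solve(*args) == expected for args, expected in tests)
    except Exception:
        return False

def collect_verified(problem: str, tests, n_samples: int = 16):
    """Sample candidates and keep only the ones that pass verification."""
    verified = []
    for _ in range(n_samples):
        candidate = model_generate(f"Write a function `solve` for: {problem}")
        if passes_tests(candidate, tests):
            verified.append((problem, candidate))
    return verified  # use as extra training data / reward signal
```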
When you move to purely textual tasks it becomes more complicated, but there are approaches. For example, if you have a collection of complex, multi-step problems with known answers, you can have the model generate intermediate steps and supporting facts, use them to produce a final answer, and check that answer against the known one. Keeping only the chains that reach the right answer trains the model to discover step-by-step solutions on its own and to solve new problems.
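A minimal sketch of that loop, assuming a hypothetical `model.generate` method and a simple "Answer:" convention for the final answer (neither is a real API):

```python
# Sketch of the "generate steps, keep the chains that reach the right answer" loop.

def extract_answer(text: str) -> str:
    """Take whatever follows the last 'Answer:' as the final answer."""
    return text.rsplit("Answer:", 1)[-1].strip()

def bootstrap_rationales(model, problems, k: int = 8):
    """For each (question, known_answer), sample k step-by-step solutions
    and keep only those whose final answer matches the known one."""
    keep = []
    for question, known_answer in problems:
        for _ in range(k):
            rationale = model.generate(
                f"{question}\nLet's think step by step.", temperature=0.8
            )
            if extract_answer(rationale) == known_answer:
                keep.append({"question": question, "rationale": rationale})
    return keep  # fine-tune the model on these self-generated solutions
```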
Another approach is to use models to curate the training data. For example, LAION-400M was curated from noisy web text-image pairs by using a model (CLIP) to score how well each caption matches its image and keeping only the good pairs; a related trick is to generate alternative captions and pick the best one, whether that's the original or one of the generated captions. So we use models to grow and clean the training data, which boosts future models in places that were out of distribution.
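In sketch form, the generate-and-pick variant looks like this. Both `image_text_score` (a CLIP-style similarity) and `caption_model` are placeholders I'm making up for illustration:

```python
# Sketch of model-based caption curation: score each candidate caption
# against the image and keep whichever scores highest.

def image_text_score(image, caption: str) -> float:
    """Placeholder for a CLIP-like image-text similarity score."""
    raise NotImplementedError

def curate_pair(image, original_caption: str, caption_model, n_alternatives: int = 5):
    """Generate alternative captions and keep the best-scoring one,
    original caption included."""
    candidates = [original_caption] + [
        caption_model.generate(image) for _ in range(n_alternatives)
    ]
    return max(candidates, key=lambda c: image_text_score(image, c))
```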
So it's all about being creative but then verifying somehow and using the signal to train.
ftc1234 t1_it7ak7b wrote
I think you understand the limitations of the approaches you've discussed. Generating intermediate results and trying out possible outcomes is not reasoning; it's akin to a Monte Carlo simulation. We do such reasoning every day (e.g., is there time to eat breakfast or do you have to run to the office for the meeting, do you call the plumber this week or wait until next month for the full paycheck, etc.). LLMs are just repeating patterns, and that can only take you so far.
visarga t1_it8o018 wrote
> Generating intermediate results and trying out possibilities of outcomes is not reasoning.
Could be. But people do something similar when faced with a novel problem. It only stops counting once you've memorised the best action from previous experience.