lughnasadh OP t1_ivzkees wrote

Submission Statement

Here's the rumor, from a statement by Sam Altman, CEO of OpenAI.

It's worth noting the Turing Test is considered obsolete: it only requires an AI to appear intelligent enough to fool a human, and in some instances GPT-3 already manages that with the more credulous sections of the population.

The Winograd Schema Challenge is regarded as a much better test of true intelligence. It requires genuine reasoning ability from an AI: the answer can't be found by scanning the contents of the internet and applying statistical methods that merely correlate with what a truly intelligent, independently reasoned answer would be.
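
For illustration, a Winograd schema pairs two near-identical sentences whose pronoun resolves differently depending on a single word. Here's a minimal sketch in Python using Levesque's classic example (the prompt wording is my own, not part of the challenge):

```python
# A Winograd schema: flipping one word flips the pronoun's referent,
# so surface statistics alone don't determine the answer.
SCHEMA = {
    "feared":    ("The city councilmen refused the demonstrators a permit "
                  "because they feared violence.", "the city councilmen"),
    "advocated": ("The city councilmen refused the demonstrators a permit "
                  "because they advocated violence.", "the demonstrators"),
}

def as_prompt(sentence: str) -> str:
    """Format a schema sentence as a question for a language model."""
    return f'In the sentence "{sentence}", who does "they" refer to?'

for word, (sentence, answer) in SCHEMA.items():
    print(as_prompt(sentence))
    print(f"  expected: {answer}")
```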

In any case, if the leap to GPT-4 is as great as the one from GPT-2 to GPT-3 was, we can expect even more human-like intelligence from AI.

152

NikoKun t1_iw04ud9 wrote

> It's worth noting the Turing Test is considered obsolete: it only requires an AI to appear intelligent enough to fool a human, and in some instances GPT-3 already manages that with the more credulous sections of the population.

That depends more on the human, the specifications of said Turing Test, and how thoroughly it's performed. What would be the point of conducting a Turing Test using a "credulous" interviewer? lol

If we're talking about an extended-length test, conducted by multiple experts who understand the concepts and are driven to figure out which participant is the AI.. I don't think GPT-3 could pass such a test for more than a few minutes, at best.. heh

58

Reddituser45005 t1_iw0o0t5 wrote

The Turing Test was developed in the 1950s. I suspect Alan Turing would be amazed by the progress of modern computers. He certainly never imagined a machine having access to a worldwide library of the collected works of humanity.

His test idea was a conversation between an evaluator and two other participants: one a machine and one a human. The evaluator's job is to tell the human from the machine. By modern standards, that can be done. We've all heard of the Google engineer who believed his AI was conscious.

The challenge now is to determine what constitutes "understanding". AIs can create art, engage in conversation, solve problems, manage massive amounts of information, and are increasingly challenging our ideas of what constitutes intelligence.
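
For what it's worth, that three-party setup is simple enough to sketch. A toy version in Python (the evaluator and participant callables are placeholders I made up, not anything Turing specified):

```python
import random

def imitation_game(ask, guess, human, machine, n_questions=5):
    """One round of the imitation game: an evaluator questions two
    hidden participants and must identify which one is the human."""
    # Hide the participants behind randomly assigned labels.
    labels = random.sample(["A", "B"], 2)
    assignment = {labels[0]: human, labels[1]: machine}

    transcript = []
    for _ in range(n_questions):
        question = ask(transcript)
        answers = {label: respond(question) for label, respond in assignment.items()}
        transcript.append((question, answers))

    # True if the evaluator picked the human; since labels are shuffled,
    # a fixed guess like this one wins only half the time.
    return assignment[guess(transcript)] is human

# Toy run with canned participants:
won = imitation_game(
    ask=lambda t: "What did you have for breakfast?",
    guess=lambda t: "A",
    human=lambda q: "Toast, same as every day.",
    machine=lambda q: "As a machine, I do not eat breakfast.",
)
print("evaluator identified the human:", won)
```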

39

Fun-Requirement9728 t1_iw31urt wrote

Is it an actual "test" or a theoretical test concept? I was under the impression it was just the idea of a test for AI, not that there is a specific set of questions.

5

Eli-Thail t1_iw204eg wrote

>His test idea was a conversation between an evaluator and two other participants- one a machine and one a human. The evaluators job is to determine the human from the machine. By modern standards, that can be done.

An easy way to tell the difference is to ask the exact same question twice. Particularly one that requires a lengthy answer.

The AI will attempt to answer again, but no matter how convincing or consistent its answers might be, the human will be the one who tells you to fuck off because they're not telling you their life story again.
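
A minimal sketch of that probe (the model name and the `openai` client usage are assumptions about one particular setup; the point is that each call is stateless, so the model happily re-answers):

```python
from openai import OpenAI  # assumed client; any completion API behaves similarly

client = OpenAI()  # reads OPENAI_API_KEY from the environment
QUESTION = "Tell me your life story, in detail."

def ask(question: str) -> str:
    # Each call sends only the single question with no conversation
    # history, so the model has no memory of having answered it before.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model name
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

first = ask(QUESTION)
second = ask(QUESTION)  # the model dutifully starts all over again
print(second[:200])
```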

3

MintyMissterious t1_iwg478j wrote

Using the Turing Test for this was always nonsense: it never had anything to do with intelligence, only with matching human perceptions of what machines can't or won't do. And that critically includes mistakes.

Make the machine make typos, and scores go up.
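
A toy sketch of that trick (the adjacency map and error rate are made-up parameters):

```python
import random

# Keys adjacent on a QWERTY keyboard (tiny illustrative subset).
ADJACENT = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "t": "ry"}

def humanize(text: str, error_rate: float = 0.03) -> str:
    """Inject plausible typos so machine output 'reads human'."""
    out = []
    for ch in text:
        if ch.lower() in ADJACENT and random.random() < error_rate:
            out.append(random.choice(ADJACENT[ch.lower()]))  # fat-finger a neighbor
        else:
            out.append(ch)
    return "".join(out)

print(humanize("I am definitely a human being typing this sentence."))
```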

There's a reason Alan Turing called it the "imitation game" and never claimed it measures intelligence.

In my eyes, it measures human credulity.

0

runswithcoyotes t1_iw17s1j wrote

We need a new Turing test. Now an AI will determine if it’s talking to a human.

28

Reddit_has_booba t1_iw3iaza wrote

Sorry to tell you, but that's a lower standard; it has already existed for a decade in active and passive bot checking.

3

urmomaisjabbathehutt t1_iwlemw3 wrote

I was wondering what would happen if we developed a general intelligence test that most people would fail, and then someone developed an AI that passed it.

1

Ducky181 t1_iw74m0s wrote

Besides just making the neural network larger, what other techniques could they employ to improve the accuracy of GPT-4 compared to its predecessor, GPT-3?

2

sext-scientist t1_iw77ylt wrote

Size is almost certainly the entire problem with these models. Recent research into how human brains process information suggests that current-generation language models have 6-9 orders of magnitude less compute than a human brain.

Hardware-wise, hopefully 3D silicon and smaller process nodes will close some of that gap in the next few years.
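
For a rough sense of the arithmetic behind that gap, a back-of-envelope sketch (every number here is an assumption; published brain-compute estimates span several orders of magnitude):

```python
import math

# All numbers are rough assumptions for a back-of-envelope comparison.
GPT3_PARAMS = 175e9                      # GPT-3 parameter count
FLOPS_PER_TOKEN = 2 * GPT3_PARAMS        # ~2 FLOPs per parameter per token
TOKENS_PER_SEC = 20                      # assumed generation speed
model_ops = FLOPS_PER_TOKEN * TOKENS_PER_SEC   # ~7e12 FLOPs/s

# Published estimates of brain compute range very widely.
for brain_ops in (1e13, 1e16, 1e21):
    gap = math.log10(brain_ops / model_ops)
    print(f"brain at {brain_ops:.0e} ops/s -> gap of ~10^{gap:.0f}")
```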

1

avatarname t1_ix5auxp wrote

I do wonder sometimes if our intelligence is just a question of scaling these things up, with some tweaking. We like to think we're oh so imaginative and inventive, and then on YouTube I discover I've left pretty much the same comment, only worded differently, 13 years ago, 6 years ago, and now, on the same video I forgot I had watched before :D

1

hellschatt t1_iwjtwgq wrote

I wrote a small seminar paper a year or two ago about how to test an AI's intelligence and the paradigm shift in such testing, so I feel the urge to clarify something here.

The Winograd Schema Challenge has been passed with roughly 88% accuracy for a few years now. Earlier AIs could already kinda "pass" that one...

Neither the Turing Test nor the Winograd Schema Challenge is a good way of determining the general, or even just the language-related, intelligence of an AI. They only show whether the AI can solve the particular type of task each test poses. Although impressive, the ability to understand context within language doesn't mean much in terms of "intelligence". The argument of the challenge's inventors was that differentiating context demonstrates more intelligence than merely fooling a person in a Turing Test.

But let's say GPT-4 passes that test with a 100% score: how do you then determine the intelligence of GPT-4, and of all the newer models that pass it too? And is the AI now intelligent just because it passed? If you go by intuition, you already sense that these AIs still feel more like input/output machines than like something "intelligent". It's kinda not "it".

After thinking about that question, the test doesn't make much sense anymore, does it?

I should add, though: once researchers figured out that the Winograd Schema Challenge was no longer difficult enough for AIs, they tried to overcome its failure to properly measure intelligence by simply developing a newer, more difficult version of it, called WinoGrande. Hence the continuous paradigm shift in what counts as an "intelligent" AI...
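
For anyone curious, a minimal sketch of scoring a model on WinoGrande via the Hugging Face `datasets` library (field names follow the public dataset card; the `choose` function is a placeholder for whatever model you'd actually test):

```python
from datasets import load_dataset

# WinoGrande items: a sentence with a blank ("_") and two candidate fillers.
data = load_dataset("winogrande", "winogrande_xl", split="validation")

def choose(sentence: str, option1: str, option2: str) -> str:
    """Placeholder model: a real evaluation would compare
    language-model likelihoods of the two completions."""
    return option1  # trivially guesses the first option

correct = sum(
    choose(ex["sentence"], ex["option1"], ex["option2"])
    == ex[f"option{ex['answer']}"]   # "answer" is "1" or "2"
    for ex in data
)
print(f"accuracy: {correct / len(data):.1%}")  # chance level is ~50%
```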

2

Veedrac t1_iw837xo wrote

> The Winograd Schema Challenge is regarded as a much better test of true intelligence.

Good lord no! Read the paper! The Turing Test is not obsolete, you (and seemingly 99% of the population) just don't know what it is.

1