Submitted by currentscurrents t3_125uxab in MachineLearning
midasp t1_je80uot wrote
Reply to comment by sdmat in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
And exactly what does that prove?
sdmat t1_je83jw4 wrote
Objectively prove? Nothing. But subjectively there is a stark difference in the quality of suggestions and apparent depth of understanding compared with earlier LLMs. E.g. GPT-3.5 suggested using jeans for radiation shielding "because denim is a thick material".
I did try a web search, and I also asked the model directly for references. Unsurprisingly, jeans for Mars colonization doesn't seem to be an existing concept, so it's almost certainly not in the training set.
currentscurrents OP t1_je83z1p wrote
I think these are all ideas from the internet, but it did understand that they would be appropriate for the task of making jeans useful on Mars.
It seems to have understood the instructions and then pulled relevant information out of its associative memory to build the response.
Purplekeyboard t1_je8l78y wrote
The point is that GPT-3 and GPT-4 can synthesize information to produce new information.
One question I like to ask large language models is "If there is a great white shark in my basement, is it safe for me to be upstairs?" This is a question no one has ever asked before, and answering the question requires more than just memorization.
Google Bard answered rather poorly, saying that I should get out of the house or attempt to hide in a closet. It seemed to be under the impression that the house was full of water and that the shark could swim through it.
GPT-3, at least the form of it I used when I asked it, said that I was safe because sharks can't climb stairs. Bing Chat, using GPT-4, was concerned that the shark could burst through the floorboards at me, because great white sharks can weigh as much as 5000 pounds. But all of these models are forced to put together various bits of information on sharks and houses in order to try to answer this entirely novel question.