Submitted by Singularian2501 t3_zm22ff in MachineLearning
HateRedditCantQuitit t1_j0c0og4 wrote
This paper has some interesting points we might agree or disagree with, but the headline point seems important and much more universally agreeable:
We have to be much more precise in how we talk about these things.
For example, this comment section is full of people arguing whether current LLMs satisfy ill-defined criteria. It’s a waste of time because it’s just people talking past each other. To stop talking past each other, we should ask whether they satisfy precisely defined criteria.
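To make that concrete, here's a toy sketch of what one precisely defined criterion could look like (`ask_model` is just a hypothetical stand-in for whatever LLM you're testing, and the threshold is arbitrary):

```python
# A toy, precisely defined criterion: exact-match accuracy on a fixed QA set,
# with an explicit pass threshold. Nothing here appeals to "understanding".

def exact_match_accuracy(ask_model, qa_pairs):
    """Fraction of questions whose model answer matches the reference exactly."""
    hits = sum(
        ask_model(question).strip().lower() == answer.strip().lower()
        for question, answer in qa_pairs
    )
    return hits / len(qa_pairs)

qa_pairs = [("What year did Apollo 11 land on the Moon?", "1969")]
criterion_met = exact_match_accuracy(lambda q: "1969", qa_pairs) >= 0.9
print(criterion_met)  # True or False -- nothing left to argue past each other about
```

Whether that criterion captures what you care about is a separate debate, but at least it's a debate about the criterion rather than about words.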
evil0sheep t1_j0eif3i wrote
When we make a student read a book, we test whether they understood it by having them write a report and reviewing whether that report makes sense. If the report makes sense and they seem to have extracted the themes of the book correctly, we assess that they understood the book. So if I feed an LLM a book and it can generate a report about the book, and that report makes sense and captures the themes of the book, why should I not assess that the LLM understood the book?
When I interview someone for a job I test their understanding of domain knowledge by asking them subtle and nuanced questions about the domain and assessing whether their responses capture the nuance of the domain and demonstrate understanding of it. If I can ask an LLM nuanced questions about a domain, and it can provide nuanced and articulate answers about the domain, why should I not assess that it understands the domain?
This whole "its just a statistical model bro, you're just anthropomorphizing it" thing is such a copout. 350GB of weights and biases is plenty of space to store knowledge about complex topics, its plenty of space to store real high level understanding of the complex, nuanced relationships between the concepts that the words represent. I don't think its smart because I can ask it to write me a story and then give it nuanced critical feedback on its story and it can rewrite the story in a way that incorporates the feedback. Like I don't know how you can see something like this and not think that it has some sort of like real understanding of the concepts that the language encodes. It seems bizarre to me
HateRedditCantQuitit t1_j0f3ilt wrote
If you give me a precise enough definition of what you mean by "understanding" we can talk, but otherwise we're not discussing what GPT does, we're just discussing how we think English ought to be used.