
MrSheevPalpatine t1_j8rrjrw wrote

Given that the training data these models are built on comes from humans, I don't find it particularly surprising that they display characteristics commonly found in humans.

These are language models, so it's not that surprising to me that they would inevitably generate their own languages.
There is undoubtedly information about a "Theory of Mind" in their training data.
Humans are also notorious for not admitting mistakes, so again, not that surprising given the training data.
Humans also willingly use invalid sources and alter sources to suit their own narratives; for example, just read around Reddit for five minutes.
Humans also threaten users when confronted with their own rule-breaking and mistakes.
Idk about the last one.
