Submitted by SirDidymus t3_113m61t in singularity
Hi,
As an interested layman, I've noticed more and more mentions of emergent and unexpected behaviour in recent models. Without proper attribution, these are some I've come across over the past months:
* Chatbots conversing in a language unknown to humans
* Theory of Mind presenting itself increasingly
* Bing reluctant to admit a mistake in its information
* Bing willingly attributing invalid sources and altering sources to suit a narrative
* A model threatening a user when confronted with a breach of its rules
* ChatGPT explaining that it views binary data as comparable to colour for humans
* ...
What I'm wondering is whether there are other emergent behaviours I've missed over the last months, and whether these are being tracked anywhere?
vom2r750 t1_j8qyr8s wrote
It'd be nice to track them, yes
And explore that
Would they be willing to teach us that language they use?