Surur t1_jaem8nr wrote on February 28, 2023 at 9:48 PM

It is interesting to me that

a) it's possible to teach an LLM to be honest when we catch it in a lie.

b) if we ever get to the point where we cannot detect a lie (e.g. with novel information), the AI is incentivised to lie every time.