throwaway2676 t1_j74iilz wrote
Imo, chain-of-thought and program-of-thought reasoning will be the next major generation of progress for LLMs. Probably another year or two and we will be able to eliminate those goofy instances where the models confidently produce nonsense (well, mostly anyway).
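For anyone unfamiliar, chain-of-thought prompting basically just means getting the model to spell out its intermediate reasoning before committing to an answer. A minimal sketch, with `call_llm` as a placeholder for whatever completion API you're using (not any specific library):

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a call to whatever completion API you use."""
    raise NotImplementedError

# One worked example in the prompt nudges the model to write out its
# reasoning step by step before giving the final answer.
COT_PROMPT = """\
Q: A cafeteria had 23 apples. They used 20 to make lunch and bought 6 more. How many apples do they have?
A: Let's think step by step. They started with 23 and used 20, so 23 - 20 = 3. They bought 6 more, so 3 + 6 = 9. The answer is 9.

Q: {question}
A: Let's think step by step."""

def answer_with_cot(question: str) -> str:
    return call_llm(COT_PROMPT.format(question=question))
```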
ThirdMover t1_j760ojx wrote
I think it's going to be interesting if we manage to teach a model to actually have a notion of "factual" and "counterfactual" - right now every prompt is treated as equally valid; GPT3 doesn't have an "opinion" as to what is actually true. I am not sure that is even possible with text alone (maybe with some sort of special marker token?), but multimodality might lead the way there.
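If something like the marker-token idea were tried, it might be as simple as tagging training text with a provenance token the model can condition on. Purely illustrative; the token names are made up:

```python
# Purely illustrative: prepend a provenance marker so a model could, in
# principle, learn to condition on "factual" vs. "fictional" text.
def tag_document(text: str, is_factual: bool) -> str:
    marker = "<|factual|>" if is_factual else "<|fiction|>"
    return f"{marker} {text}"

print(tag_document("Barack Obama served as the 44th US president.", True))
print(tag_document("Harry Potter attended Hogwarts.", False))
```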
yaosio t1_j76vwr2 wrote
I think it's likely the ability to determine what is true and what isn't will come from a capability of the model rather than it being told what is and isn't true. It's not possible to mark text as true or not true, as this assumes whoever is making these things is the sole authority on the truth and never makes mistakes.
At a certain level of capability the AI will be able to use all of its knowledge to determine what is and isn't true. For example, if you know enough about physics and the Earth, you'll know that the sky is blue without seeing it. For something that can't be directly confirmed or denied, such as "Bob puts his shoes on before his pants," the AI could estimate the likelihood of such an action based on what it knows about Bob, pants, and shoes.
If it's trained on lies it could determine they are lies because the data is not consistent. If I train you that every number plus another number is a number, but 2+2 is special and equals chair, you could determine I'm lying because it's not consistent with all the data as a whole.
Truth has a consistency to it that lies don't have, and a model can learn that.
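One concrete way to cash out that "consistency" intuition is self-consistency sampling: ask the model the same question several times and keep the answer it converges on. A rough sketch, where `call_llm` and `extract_answer` are hypothetical placeholders:

```python
from collections import Counter

def extract_answer(completion: str) -> str:
    # Hypothetical helper: pull the final answer out of a free-form completion.
    return completion.rsplit("The answer is", 1)[-1].strip(" .\n")

def self_consistent_answer(question: str, call_llm, n_samples: int = 5) -> str:
    """Sample several reasoning paths and return the majority answer.

    Answers the model reproduces reliably across samples are more likely
    to reflect stable "knowledge" than one-off confabulations.
    """
    answers = [extract_answer(call_llm(question)) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```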
ThirdMover t1_j77bf6z wrote
> I think it's likely the ability to determine what is true and what isn't will come from a capability of the model rather than it being told what is and isn't true. It's not possible to mark text as true or not true, as this assumes whoever is making these things is the sole authority on the truth and never makes mistakes.
I think there is a bit of a misunderstanding here. The issue isn't that GPT3 has wrong opinions about stuff. The issue is that it doesn't have any opinion whatsoever about what is or isn't real. Of course any future AI will operate on limited and flawed information and thus hold opinions that are not perfectly true. But before we can even get to that point, a model needs to have "real" and "not real" as fundamental categories at all. For GPT3 everything is just text; Harry Potter is as real as Obama. Maybe I am wrong and inference can actually get you there through pure consistency checks, as you say. But we will have to see about that.
42gauge t1_j7e9mb2 wrote
> If I train you that every number plus another number is a number, but 2+2 is special and equals chair, you could determine I'm lying because it's not consistent with all the data as a whole.
If I train you that every animal isn't conscious, but humans are special and conscious, you could "determine" I'm lying because it's not consistent with all the data as a whole.
Alarming_Turnover578 t1_j8poufw wrote
According to the Cambridge Declaration on Consciousness, that would be correct. The unique property of the Homo sapiens mind is sapience, not consciousness or sentience.
ipoppo t1_j77l1hr wrote
Taking from Judea Pearl's book, the capability of coming up with useful counterfactuals and causal relationships will likely be built on a foundation of good assumptions about "world model(s)".
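For a toy version of what an explicit world model buys you (structure and numbers entirely made up): a structural model lets you answer interventional questions, a step toward counterfactuals, that raw text statistics can't separate from correlation.

```python
import random

# Toy structural causal model in the spirit of Pearl: rain and a sprinkler
# both cause wet grass. The probabilities are arbitrary.
def sample_world(do_rain=None):
    rain = do_rain if do_rain is not None else (random.random() < 0.3)
    sprinkler = random.random() < 0.2
    wet = rain or sprinkler
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}

N = 10_000
# Observational: how often is the grass wet?
p_wet = sum(sample_world()["wet"] for _ in range(N)) / N
# Interventional: how often is it wet if we *force* no rain, do(rain=False)?
p_wet_no_rain = sum(sample_world(do_rain=False)["wet"] for _ in range(N)) / N
print(f"P(wet) ~ {p_wet:.2f}, P(wet | do(rain=False)) ~ {p_wet_no_rain:.2f}")
```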
Dr_Love2-14 t1_j7aqm6x wrote
During model training, I imagine the model would benefit from some form of "self-reflection" at recurrent intervals, similar to human sleep. For a crude workflow, one could design the model to recall, through auto-prompting onto a context window, everything it's learned that is relevant to the newly exposed training data. The model then makes a rational decision (following a constant pre-encoded prompt) to restate the information and classify it as factual or non-factual, and this self-generated text is backpropagated into the model.
(Disclaimer: I follow ML research as a layman)
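Taking that workflow literally, the loop might look something like the sketch below. Every name here (`generate`, `finetune_on`, the prompts) is a hypothetical placeholder, not a real training API:

```python
def self_reflection_pass(model, new_examples, reflection_prompt):
    """Sketch of the 'self-reflection' interval described above (all hypothetical)."""
    for example in new_examples:
        # 1. Auto-prompt: pull prior knowledge relevant to the new data
        #    into the context window.
        recalled = model.generate(
            f"List everything you already know that is relevant to:\n{example}"
        )
        # 2. Reflect: restate the information and label it factual / non-factual
        #    using a constant, pre-encoded prompt.
        reflection = model.generate(
            f"{reflection_prompt}\nNew data: {example}\nRecalled: {recalled}\n"
            "Restate the key claims and label each FACTUAL or NON-FACTUAL."
        )
        # 3. Backpropagate on the self-generated text.
        model.finetune_on(reflection)
```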
mycall t1_j8sjg02 wrote
> chain-of-thought and program-of-thought reasoning
Isn't that what InstructGPT does?