Dr_Love2-14 t1_j7aqm6x wrote
Reply to comment by ThirdMover in [R] Multimodal Chain-of-Thought Reasoning in Language Models - Amazon Web Services Zhuosheng Zhang et al - Outperforms GPT-3.5 by 16% (75%->91%) and surpasses human performance on ScienceQA while having less than 1B params! by Singularian2501
During model training, I imagine the model would benefit from some form of "self-reflection" at recurrent intervals, similar to human sleep. As a crude workflow: the model auto-prompts itself, recalling into a context window everything it has learned that is relevant to the newly exposed training data; it then makes a rational decision (following a constant pre-encoded prompt) to restate that information and classify it as factual or non-factual; finally, this self-generated text is backpropagated into the model.
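Something like this rough sketch, where the model name, the reflection prompt wording, and the choice to train directly on the self-generated output are all just illustrative assumptions on my part, not anything from the paper:

```python
# Minimal sketch of the proposed "self-reflection" loop using a small
# Hugging Face causal LM. Everything here (prompt text, model choice,
# hyperparameters) is a placeholder for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # hypothetical small stand-in model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Constant "pre-encoded" reflection prompt (wording is made up).
REFLECTION_PROMPT = (
    "Recall what you know that is relevant to the following passage, "
    "restate it, and label each statement as FACTUAL or NON-FACTUAL:\n\n"
)

def reflection_step(new_training_text: str) -> None:
    """One 'self-reflection' update: auto-prompt the model about newly
    seen data, then backpropagate on its own restated, classified output."""
    # 1. Auto-prompt: have the model recall and classify what it knows.
    prompt = REFLECTION_PROMPT + new_training_text
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        generated = model.generate(**inputs, max_new_tokens=128)
    reflection = tokenizer.decode(generated[0], skip_special_tokens=True)

    # 2. Treat the self-generated reflection as a training target and
    #    backpropagate its language-modeling loss into the weights.
    batch = tokenizer(reflection, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```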
(Disclaimer: I follow ML research as a layman)