blazejd OP t1_ix8k4wf wrote
Reply to comment by blazejd in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
>Sure, but the answer remains: what reward function do you use that encompasses understanding and communicating, on top of grammar?
I realize this doesn't directly answer your question. My point is that we don't know the answer, but we should at least try to pursue it.