blazejd OP t1_ix8k4wf wrote on November 21, 2022 at 4:03 PM

>Sure, but the answer remains: what reward function do you use that encompasses understanding and communicating, on top of grammar?

I realize this doesn't directly answer your question. My point is that we don't know the answer, but we should at least try to pursue it.