Submitted by JClub t3_10fh79i in MachineLearning
JClub OP t1_j57rrn6 wrote
Reply to comment by Ouitos in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
Ah yes, you're right. I actually don't know why, but you can check the implementation and ask about it on GitHub.