ParmesanCharmeleon

ParmesanCharmeleon t1_j8pkmgi wrote on February 16, 2023 at 12:58 AM

Reply to Reinforcement Learning based algorithms specifically for NLP[D][P] by Smooth-Stick-5751

There is a paper from UW NLP that introduced the RL4LMs library and the NLPO (Natural Language Policy Optimization) algorithm.