_der_erlkonig_
_der_erlkonig_ t1_j3pahbt wrote
Reply to comment by [deleted] in [R] Diffusion language models by benanne
Yes, it's mentioned in the post
_der_erlkonig_ t1_iz3k920 wrote
Reply to comment by Ulfgardleo in [R] The Forward-Forward Algorithm: Some Preliminary Investigations [Geoffrey Hinton] by shitboots
Out of curiosity, why do you include this as a requirement for an algorithm to be good/interesting/useful/etc.?
_der_erlkonig_ t1_ivb0pya wrote
Reply to [R] Reincarnating Reinforcement Learning (NeurIPS 2022) - Google Brain by smallest_meta_review
Not to be that guy, but it kind of seems like this is just finally acknowledging that distillation is a good idea for RL too. They even use the teacher-student terminology. Distilling a teacher into a student with a different architecture is something they make a big deal about in the paper, but people have been doing this for years in supervised learning. It's neat and important work, but the RRL branding is obnoxious and unnecessary IMO.
From a scientific standpoint, I think this methodology is also less useful than the authors advertise. Unlike supervised learning, RL is infamously sensitive to initial conditions, and adding another huge variable like the exact form of distillation used (which may reduce the compute required) will make it even harder to isolate the source of "gains" in RL research.
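For reference, the teacher-student distillation I'm referring to (standard in supervised learning since Hinton et al., 2015) just trains the student to match the teacher's temperature-softened output distribution. A minimal PyTorch sketch, assuming hypothetical `teacher`, `student`, and `optimizer` objects and an input batch `x` (this is the vanilla supervised-learning version, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, optimizer, x, T=2.0):
    # Vanilla knowledge distillation: student matches the teacher's
    # temperature-softened output distribution via a KL loss.
    with torch.no_grad():
        teacher_logits = teacher(x)  # teacher is frozen, no gradients
    student_logits = student(x)
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard T^2 rescaling so gradient magnitude stays comparable
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The student and teacher only need to agree on the output space, which is why the architectures can differ freely; that part isn't new either.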
_der_erlkonig_ t1_ja74633 wrote
Reply to comment by walk-the-rock in [R] Large language models generate functional protein sequences across diverse families by MysteryInc152
Socher's been gone from Salesforce for years