Submitted by Destiny_Knight t3_11tab5h in singularity
CellWithoutCulture t1_jcjkwy1 wrote
Reply to comment by ThatInternetGuy in Those who know... by Destiny_Knight
Most likely they haven't had time.
They can also use SHP and HF-RLHF... I think those will help a lot, since LLaMA didn't get the privilege of reading Reddit (unlike ChatGPT).
ThatInternetGuy t1_jckmq5s wrote
>HF-RLHF
Probably no need, since this model could piggyback on the responses generated by GPT-4, so it should carry over the traits of the RLHF-trained GPT-4 model, shouldn't it?
CellWithoutCulture t1_jcmsxjq wrote
HF-RLHF is the name of the dataset. As for RLHF itself... what they did to LLaMA is called "knowledge distillation", and iirc it usually isn't quite as good as RLHF. It's an approximation.
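For anyone wondering what "distillation" means mechanically, here is a rough sketch on toy logits over a 4-token vocabulary. The function names and numbers are made up for illustration; this is not the code anyone used on LLaMA. The first loss is the classic soft-label form, where the student matches the teacher's whole output distribution. The second is closer to training on text another model generated, where the student only sees the token the teacher actually emitted.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax with an optional temperature.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions.
    # Hypothetical helper: needs access to the teacher's logits.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def imitation_loss(student_logits, teacher_token):
    # Cross-entropy against the one token the teacher emitted.
    # Hypothetical helper: only needs the teacher's generated text.
    q = softmax(student_logits)
    return -math.log(q[teacher_token])

teacher = [2.0, 1.0, 0.1, -1.0]   # toy teacher logits
student = [0.5, 0.5, 0.5, 0.5]    # toy untrained student (uniform)

print(soft_label_loss(student, teacher))
print(imitation_loss(student, teacher_token=0))
```

Either way the student is only copying what the teacher outputs. There is no reward model and no human preference signal in the loop, which is why I'd call it an approximation of RLHF rather than the same thing.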