prototypist t1_j6ljszc wrote
I just barely got into text NLP when I could run notebooks with a single GPU / Colab and get interesting outputs. I've seen some great community models (such as for the Dhivehi language) built with Mozilla Common Voice data. But if I were going to collect a chunk of isiXhosa transcription data and try to train on a single GPU, that's hours of training just to reach an initial checkpoint that makes some muffled noises. At the end of 2022 it became possible to fine-tune OpenAI Whisper, so if I tried again, I might start there: https://huggingface.co/blog/fine-tune-whisper
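Roughly what the setup from that blog post looks like (the post's own example fine-tunes whisper-small on Hindi Common Voice; for a language outside Whisper's pretraining set, like isiXhosa, you'd have to pick a stand-in language token, so treat these settings as assumptions):

```python
# Minimal sketch of the fine-tuning setup from the linked blog post.
# Dataset/config names follow the post's Hindi example; swap in your
# own data and language setting as needed.
from datasets import load_dataset
from transformers import WhisperProcessor, WhisperForConditionalGeneration

common_voice = load_dataset(
    "mozilla-foundation/common_voice_11_0", "hi", split="train"
)
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="Hindi", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

def prepare(batch):
    # Log-mel features for the encoder, tokenized transcript as labels.
    audio = batch["audio"]
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch

common_voice = common_voice.map(prepare)
```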
Also I never use Siri / OK Google / Alexa. I know it's a real industry but I never think of use cases for it.
prototypist t1_j2uskwt wrote
Reply to comment by matth0x01 in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
It's a metric that compares the probabilities the model assigns to text (its next-token predictions) against the actual text; a better model finds the real text more likely.
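In code it's something like this (assuming the metric in question is perplexity, i.e. exp of the average negative log-likelihood on the real tokens; the model name here is just an example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels set, the model returns the mean cross-entropy
    # loss over its next-token predictions for this text.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

perplexity = torch.exp(loss).item()
print(f"perplexity: {perplexity:.2f}")  # lower = model predicted the text better
```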
prototypist t1_j0c5p2j wrote
Reply to [D] Is "natural" text always maximally likely according to language models ? by Emergency_Apricot_77
There have been attempts this year at building a more human-like decoder for language models and seeing what outputs humans prefer. Transformers supports typical decoding and contrastive search, and there are papers and code out for RankGen, Time Control, and Contrastive Decoding (which is totally different from contrastive search).
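The two built into Transformers are just arguments to `generate()` (the specific values below are arbitrary picks, not recommendations):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("DeepMind Company is", return_tensors="pt")

# Typical decoding: sample from the "locally typical" token subset.
typical = model.generate(
    **inputs, do_sample=True, typical_p=0.9, max_new_tokens=40
)

# Contrastive search: top_k candidates re-scored with a degeneration
# penalty (penalty_alpha) against the existing context.
contrastive = model.generate(
    **inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=40
)

print(tokenizer.decode(typical[0], skip_special_tokens=True))
print(tokenizer.decode(contrastive[0], skip_special_tokens=True))
```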
prototypist t1_j71p3d6 wrote
Reply to [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
You can fine-tune language models on a dataset, and that's essentially how people have typically been doing NLP with transformer models. It's only more recently that research has had success applying RL to these kinds of tasks. So whatever rationales and answers you get here, the main reason is that people were doing supervised learning before, and then the RL approaches started getting better results.
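Note the ranked examples still get used in a supervised step: a reward model is trained on preference pairs, then the policy is tuned with RL (e.g. PPO) against that reward. A rough sketch of the pairwise loss from the InstructGPT paper, -log(sigmoid(r_chosen - r_rejected)) (model name and inputs here are illustrative assumptions):

```python
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
reward_model = AutoModelForSequenceClassification.from_pretrained(
    "gpt2", num_labels=1  # one scalar reward per sequence
)
reward_model.config.pad_token_id = tokenizer.pad_token_id

chosen = tokenizer("A helpful, ranked-higher answer...", return_tensors="pt")
rejected = tokenizer("A ranked-lower answer...", return_tensors="pt")

r_chosen = reward_model(**chosen).logits.squeeze(-1)
r_rejected = reward_model(**rejected).logits.squeeze(-1)

# Push the reward model to score the human-preferred answer higher;
# the RL stage then optimizes the policy against this learned reward.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
```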