Submitted by besabestin t3_10lp3g4 in MachineLearning
gamerx88 t1_j6cqerx wrote
It's not about large data or number of parameters. OpenAI has not actually revealed details regarding ChatGPT's architecture and training. What is special is the fine-tuning procedure -- alignment through RLHF on the underlying LLM (nicknamed GPT3.5) that is extremely good at giving "useful" responses to prompts\instructions.
Prior to this innovation, zero-shot and in-context few-shot learning with LLM was hardly working. Users had to trial and error their way to some obtuse prompt to get the LLM to generate some sensible response to their prompt, if it even worked at all. This is because LLM pre-training is purely about language structure without accounting for intent (what the human wishes to obtain via the prompt). Supervised fine-tuning based on instructions and output pairs helped but not by much. With RLHF however, the process is so effective that a mere 6B parameter model (fine-tuned with RLHF) is able to surpass a 175B parameter model. Check out the InstructGPT paper for details.
Viewing a single comment thread. View all comments