manOnPavementWaving t1_itsz25o wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
Wow, you're seriously questioning DeepMind's scaling laws and going back to the OpenAI ones, which have been demonstrated to be inaccurate?
Chain-of-thought prompting, self-consistency, reinforcement learning from human feedback, and data scaling are what have been driving LLM performance lately, noticeably more than parameter scaling has (while being significantly cheaper).
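To make the self-consistency idea concrete: sample several chain-of-thought completions at nonzero temperature and majority-vote their final answers. A minimal sketch follows; `generate` is a hypothetical stand-in for an actual LLM call (here a stub that returns a mostly-correct answer), and the answer-extraction heuristic is illustrative only.

```python
# Self-consistency decoding, sketched: sample many reasoning paths,
# extract each final answer, and take the majority vote.
from collections import Counter
import random

def generate(prompt, temperature=0.7, seed=None):
    """Hypothetical model call; stubbed to return a reasoning chain
    that ends in the right answer ~80% of the time."""
    rng = random.Random(seed)
    answer = "9" if rng.random() < 0.8 else "8"  # mostly correct
    return f"Step by step: ... so the answer is {answer}"

def extract_answer(completion):
    """Pull the final answer out of a chain-of-thought completion."""
    return completion.rsplit("answer is", 1)[-1].strip()

def self_consistency(prompt, n_samples=20):
    """Majority vote over independently sampled reasoning paths."""
    answers = [extract_answer(generate(prompt, seed=i))
               for i in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("Q: ... Let's think step by step."))
```

The point of the technique is that individual sampled chains are noisy, but the vote over many of them is far more reliable than any single greedy decode.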
Why do you expect such a jump when the industry has been stuck at half a trillion for the past year? All previous jumps were smaller and cost significantly less.
porcenat_k t1_itt4w3g wrote
>Why do you expect such a jump when the industry has been stuck at half a trillion for the past year? All previous jumps were smaller and cost significantly less.
A combination of software and hardware improvements currently being worked on using Nvidia GPUs. https://azure.microsoft.com/en-us/blog/azure-empowers-easytouse-highperformance-and-hyperscale-model-training-using-deepspeed/
With regard to Chinchilla, I don't think they disproved anything. See my comment history if you care enough; I've debated this topic quite extensively.
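For readers following the Chinchilla debate: the DeepMind result is a claim about how to split a fixed compute budget between parameters and training tokens. A minimal sketch of that trade-off, assuming the published parametric loss fit from Hoffmann et al. (2022) and the common C ≈ 6ND FLOP approximation; the grid search below is illustrative, not their fitting procedure, and the constants should be treated as approximate.

```python
# Compute-optimal model size under the Chinchilla parametric fit:
# L(N, D) = E + A / N**alpha + B / D**beta
import numpy as np

# Fitted constants reported by Hoffmann et al. (2022), approximate.
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def loss(n_params, n_tokens):
    """Predicted pretraining loss for N parameters and D tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

def optimal_split(compute_flops, grid=2000):
    """Minimize predicted loss over N, with D = C / (6N) implied
    by the fixed budget via the C ~ 6*N*D approximation."""
    n_grid = np.logspace(8, 13, grid)          # 100M .. 10T params
    d_grid = compute_flops / (6.0 * n_grid)    # tokens the budget allows
    i = int(np.argmin(loss(n_grid, d_grid)))
    return n_grid[i], d_grid[i]

n, d = optimal_split(5.76e23)  # ~Chinchilla-scale budget (6 x 70B x 1.4T)
print(f"optimal N ~ {n:.2e} params, D ~ {d:.2e} tokens")
```

The qualitative takeaway is the contested point: at a fixed budget, the fit says to train a smaller model on far more tokens than the earlier OpenAI (Kaplan et al.) laws recommended.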
manOnPavementWaving t1_itt8bt1 wrote
All I see are comparisons to humans that are, by and large, unfounded.
justowen4 t1_itt5mpf wrote
It’s simply going to be both scenarios in 2023, quantity and quality: synthetic data variations from existing corpora with better training distributions (pseudo-sparsity) on optimized hardware. Maybe even some novel chips, like photonic or analog, later next year. It’s like CPUs 20 years ago: optimizations all around!