
manOnPavementWaving t1_itsz25o wrote

Wowowow, you're seriously questioning DeepMind's scaling laws and going back to the OpenAI ones, which have been demonstrated to be false?
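(For context, a back-of-the-envelope sketch of the disagreement, using rough rules of thumb rather than either paper's exact fitted coefficients: the OpenAI/Kaplan laws put most extra compute into parameters, while DeepMind's Chinchilla laws grow parameters and tokens together, roughly 20 tokens per parameter.)

```python
# Back-of-the-envelope comparison of the two scaling prescriptions.
# Approximations only; not the exact fitted constants from either paper.

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Chinchilla-style allocation: params and tokens scale ~equally with compute.
    Uses C ~= 6 * N * D and the ~20 tokens-per-parameter rule of thumb."""
    n_params = (compute_flops / (6 * 20)) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

def kaplan_style_params(compute_flops: float) -> float:
    """Kaplan-era prescription: parameters grow much faster with compute
    (roughly N ~ C^0.73), so most of the budget goes into model size.
    Anchored very roughly at GPT-3: ~175B params at ~3.1e23 FLOPs."""
    return 175e9 * (compute_flops / 3.1e23) ** 0.73

if __name__ == "__main__":
    budget = 1e24  # roughly 3x GPT-3's training compute
    n, d = chinchilla_optimal(budget)
    print(f"Chinchilla-optimal: {n/1e9:.0f}B params on {d/1e9:.0f}B tokens")
    print(f"Kaplan-style:       {kaplan_style_params(budget)/1e9:.0f}B params")
```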

Chain-of-thought prompting, self-consistency, reinforcement learning from human feedback, and data scaling are what has been driving LLM performance lately, noticeably more than parameter scale has (while being significantly cheaper).
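For concreteness, here's a minimal sketch of the self-consistency idea over chain-of-thought prompts; `sample_completion` and `extract_answer` are hypothetical stand-ins for whatever LLM API and answer parsing you actually use.

```python
# Self-consistency sketch: sample several chain-of-thought completions at
# nonzero temperature, then majority-vote on the final answers.
from collections import Counter

COT_PROMPT = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A: Let's think step by step."
)

def sample_completion(prompt: str, temperature: float = 0.7) -> str:
    """Hypothetical stand-in for an LLM API call; replace with your provider's."""
    raise NotImplementedError

def extract_answer(completion: str) -> str:
    """Naive answer extraction: take the last number in the completion."""
    numbers = [tok for tok in completion.replace(".", " ").split() if tok.isdigit()]
    return numbers[-1] if numbers else ""

def self_consistent_answer(prompt: str, n_samples: int = 10) -> str:
    # Diverse reasoning chains tend to converge on the correct final answer,
    # so the modal answer is usually better than any single greedy decode.
    answers = [extract_answer(sample_completion(prompt)) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```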

Why do you expect such a jump when the industry has been stuck at around half a trillion parameters for the past year? All previous jumps were smaller and cost significantly less.

8

porcenat_k t1_itt4w3g wrote

>Why do you expect such a jump when the industry has been stuck at around half a trillion parameters for the past year? All previous jumps were smaller and cost significantly less.

A combination of software and hardware improvements currently being worked on using Nvidia GPUs: https://azure.microsoft.com/en-us/blog/azure-empowers-easytouse-highperformance-and-hyperscale-model-training-using-deepspeed/
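As a rough illustration of the software side, here's the kind of DeepSpeed/ZeRO setup the linked post is about; the config values are illustrative placeholders, not a tuned recipe, and it's meant to be launched with the `deepspeed` launcher rather than run directly.

```python
# Sketch: DeepSpeed ZeRO stage 3 partitions parameters, gradients, and
# optimizer state across GPUs (optionally offloading to CPU memory), so much
# larger models fit on the same hardware. Values below are placeholders.
import deepspeed
import torch

ds_config = {
    "train_batch_size": 2048,
    "fp16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,                            # partition params/grads/optimizer state
        "offload_param": {"device": "cpu"},     # optionally spill params to CPU RAM
        "offload_optimizer": {"device": "cpu"},
    },
}

model = torch.nn.Linear(4096, 4096)  # stand-in for a real transformer

# Launch with: deepspeed train.py  (the launcher sets up the distributed env)
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```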

With regard to Chinchilla, I don't think they disproved anything. See my comment history if you care enough. I've debated quite extensively on this topic.

7

justowen4 t1_itt5mpf wrote

It’s simply going to be both scenarios in 2023, quantity and quality: synthetic data variations from existing corpora with better training distributions (pseudo-sparsity) on optimized hardware. Maybe even some novel chips, like photonic or analog ones, later next year. It’s like CPUs 20 years ago: optimizations all around!

6