Comments


Dr_Singularity OP t1_iwj0ct4 wrote

It Delivers Near-Perfect Linear Scaling for Large Language Models

25
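For context, "near-perfect linear scaling" means doubling the number of systems roughly doubles training throughput. A minimal Python sketch of the idea, using hypothetical efficiency numbers (not measurements from the article):

```python
# A minimal sketch of what "near-perfect linear scaling" means: as you add
# systems, throughput should grow almost proportionally. The measured
# speedups below are hypothetical, not figures from the article.

IDEAL = {1: 1.0, 2: 2.0, 4: 4.0, 8: 8.0, 16: 16.0}       # perfect scaling
MEASURED = {1: 1.0, 2: 1.96, 4: 3.9, 8: 7.8, 16: 15.4}   # hypothetical

for n in IDEAL:
    eff = MEASURED[n] / IDEAL[n]   # scaling efficiency vs. the ideal
    print(f"{n:>2} systems: {MEASURED[n]:>5.2f}x speedup "
          f"({eff:.0%} scaling efficiency)")
```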

Rakshear t1_iwjgzs8 wrote

Wtf? This is freaking awesome; we might actually see a 2030 date for the beginning.

21

ihateshadylandlords t1_iwjh7dr wrote

So what are the implications of this? From what I could tell, the article says it trains LLMs faster.

22

visarga t1_iwkbncq wrote

One Cerebras chip is about as fast as 100 top GPUs, but its memory only holds around 20B weights (they mention GPT-NeoX 20B). They'd need to stack about 10 of these to train GPT-3.

8
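That "stack 10" figure checks out as back-of-the-envelope arithmetic. A quick Python sketch, assuming GPT-3's publicly known 175B parameter count and the ~20B-weights-per-chip figure from the comment above:

```python
# Back-of-the-envelope check of the parent comment's numbers. GPT-3 has
# 175B parameters; the ~20B-weights-per-chip figure comes from the
# GPT-NeoX 20B mention above.

import math

gpt3_params = 175e9       # GPT-3 parameter count (public figure)
weights_per_chip = 20e9   # roughly what one chip holds, per the comment

chips_needed = math.ceil(gpt3_params / weights_per_chip)
print(f"~{chips_needed} chips to hold GPT-3's weights")  # ~9, i.e. about 10
```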