SoylentRox t1_j2f9pw8 wrote

I think the issue is that the Cerebras has only 40 gigabytes of SRAM.

PaLM is 540 billion parameters - at fp32 (4 bytes each) that's 2.16 terabytes in weights alone.

To train it you need more memory than that - I think I read it's a factor of ~3x for optimizer state and activations. So you need roughly 6 terabytes of memory.

That would be either ~75 A100 80 GB GPUs, or I don't know how you'd do it with a Cerebras - presumably you'd need about 150 of them.
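A quick sanity check of those numbers (assumptions from the estimate above: fp32 weights at 4 bytes per parameter, and the ~3x training overhead rounded to 6 TB):

```python
import math

params = 540e9                  # PaLM parameter count
weight_bytes = params * 4       # fp32: 2.16e12 bytes = 2.16 TB
train_bytes = 6e12              # ~3x weights for training, rounded to 6 TB

a100_hbm = 80e9                 # A100 80 GB HBM, in bytes
cerebras_sram = 40e9            # Cerebras on-chip SRAM, in bytes

print(math.ceil(train_bytes / a100_hbm))       # 75 A100s
print(math.ceil(train_bytes / cerebras_sram))  # 150 Cerebras chips
```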

Still, it might train the whole model in hours - the Cerebras has the advantage of being much faster.

Speed matters; once the AI wars get really serious, this might be worth every penny.