Submitted by starstruckmon t3_1027geh in MachineLearning
artsybashev t1_j2suada wrote
Reply to comment by yahma in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
An A100 can run about 75B parameters in 8-bit. With pruning that is doable, but it won't be quite the same perplexity.
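The back-of-the-envelope arithmetic behind that estimate can be sketched as follows (assuming the 80 GB A100 variant, and counting only weight storage, not activations or KV cache):

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed just to hold the model weights, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

# 75B parameters at 8-bit quantization -> 75 GB, just under an 80 GB A100.
print(weight_memory_gb(75e9, 8))   # 75.0
# The same model in fp16 would need 150 GB, i.e. two A100s.
print(weight_memory_gb(75e9, 16))  # 150.0
```

Pruning shrinks the weight count (or, with sparse storage, the effective bytes per parameter) at some cost in perplexity, which is the trade-off the comment refers to.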
currentscurrents t1_j2trd40 wrote
If only it could run on a card that doesn't cost as much as a car.
I wonder if we will eventually hit a wall where further improvement requires more compute, and we can only wait for GPU manufacturers. It's similar to how they could never have created these language models in the 80s, no matter how clever their algorithms: they just didn't have enough compute power, memory, or the internet to use as a dataset.
artsybashev t1_j2v9lx2 wrote
If you believe in the singularity, at some point we reach an infinite loop where "AI" creates better methods to run the calculations it uses to build better "AI". In a way that is already happening, but once that loop gets faster and more autonomous, it can settle into a state where development is "optimally" fast.
visarga t1_j2yv5hp wrote
I hope 2023 will be the year of AI-generated training data: Evolution through Large Models, https://arxiv.org/abs/2206.08896