Submitted by starstruckmon t3_1027geh in MachineLearning
artsybashev t1_j2suada wrote
Reply to comment by yahma in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
An A100 can run about 75B parameters in 8-bit. With pruning that is doable, but it won't be quite the same perplexity.
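The back-of-the-envelope arithmetic behind that estimate can be sketched as follows (assuming the 80 GB A100 variant, and counting only weight storage, not activations or KV cache):

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed just to hold the model weights, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

# 75B parameters at 8-bit quantization -> 75 GB, just under an 80 GB A100.
print(weight_memory_gb(75e9, 8))   # 75.0
# The same model in fp16 would need 150 GB, i.e. two A100s.
print(weight_memory_gb(75e9, 16))  # 150.0
```

Pruning shrinks the weight count (or, with sparse storage, the effective bytes per parameter) at some cost in perplexity, which is the trade-off the comment refers to.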
currentscurrents t1_j2trd40 wrote
If only it could run on a card that doesn't cost as much as a car.
I wonder if we will eventually hit a wall where further improvement requires more compute, and we can only wait for GPU manufacturers. It's similar to how they could never have created these language models in the 80s, no matter how clever their algorithms: they just didn't have enough compute power, memory, or the internet to use as a dataset.
artsybashev t1_j2v9lx2 wrote
If you believe in the singularity, at some point we reach an infinite loop where "AI" creates better methods to run the calculations it uses to build better "AI". In a way that is already happening, but once that loop gets faster and more autonomous, it can settle into a state where development is "optimally" fast.
visarga t1_j2yv5hp wrote
I hope 2023 will be the year of AI-generated training data: Evolution through Large Models, https://arxiv.org/abs/2206.08896