Submitted by faker10101891 t3_10cxuo2 in MachineLearning
sayoonarachu t1_j4n2w5j wrote
Quite a bit, and even more if you use optimized frameworks and packages like VoltaML, PyTorch Lightning, ColossalAI, bitsandbytes, xformers, etc. Those are just the ones I'm familiar with.
Some libraries let you balance the load between GPU VRAM, CPU, and system memory, though obviously that comes at a cost in speed. A minimal sketch of that kind of offloading is below.
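For example, here's one way to do it with Hugging Face's `device_map="auto"` (which uses accelerate under the hood to spill layers from GPU to CPU to disk). The checkpoint name is just a placeholder, swap in whatever you actually want:

```python
# Minimal sketch: let accelerate split a model across GPU VRAM,
# CPU RAM, and disk automatically. Requires `accelerate` installed;
# the model name below is only an example.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",      # example checkpoint, not a recommendation
    device_map="auto",          # fill GPU first, then spill to CPU/disk
    offload_folder="offload",   # where disk-offloaded weights land
)
```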
General rule: the more parameters a model has, the more memory it costs. So, unless you're planning to train from scratch or fine-tune something with billions of parameters, you'll be fine.
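To make that concrete, here's a quick back-of-the-envelope for weight memory alone (training needs roughly another 3-4x on top of this for gradients and optimizer state):

```python
# Rough weight-only memory estimate per parameter count and precision.
def weight_gb(n_params: float, bytes_per_param: int) -> float:
    return n_params * bytes_per_param / 1024**3

for name, n in [("1B", 1e9), ("7B", 7e9)]:
    print(name, {bits: round(weight_gb(n, bits // 8), 1) for bits in (32, 16, 8)})
# 1B -> {32: 3.7, 16: 1.9, 8: 0.9} GB
# 7B -> {32: 26.1, 16: 13.0, 8: 6.5} GB
```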
It's gonna take playing around with hyperparameters, switching between 32-, 16-, and 8-bit precision with PyTorch or other Python packages, testing offloading weights between GPU and CPU, etc., to get a feel for what you can and can't do.
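As one example of that kind of experimenting, PyTorch's autocast lets you run the forward pass in fp16 instead of fp32 with one context manager (shapes here are made up); for 8-bit, bitsandbytes plugs into transformers via `load_in_8bit=True`, at least in the versions I've used:

```python
# Sketch: run matmuls in fp16 via automatic mixed precision.
# Model and input sizes are placeholders; assumes a CUDA GPU.
import torch

model = torch.nn.Linear(4096, 4096).cuda()
x = torch.randn(8, 4096, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = model(x)    # this matmul runs in fp16, saving memory/bandwidth
print(y.dtype)      # torch.float16
```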
Also, if I remember correctly, PyTorch 2.0 will benefit the consumer Nvidia 40 series to some extent once it's more mature.
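The headline feature there is `torch.compile`, which is a one-line opt-in (sketch assumes a 2.0+ build and a CUDA GPU):

```python
# Sketch: torch.compile JIT-compiles the model on first call;
# later calls reuse the compiled graph. Toy model for illustration.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
).cuda()

compiled = torch.compile(model)   # compiles lazily on first forward pass
out = compiled(torch.randn(16, 1024, device="cuda"))
```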
Edit: p.s. supposedly the new Forward-Forward algorithm can be "helpful" for large models since there's no backpropagation.
faker10101891 OP t1_j4n6a3b wrote
Thanks, I'll check that out!