Submitted by faker10101891 t3_10cxuo2 in MachineLearning
junetwentyfirst2020 t1_j4jgu4t wrote
I’m not sure why you think that’s such a crummy graphics card. I’ve trained a lot of interesting things for grad school, and even in the workplace, on 4 GB less. If you’re fine-tuning, it’s not really going to take that long to get decent results, and 16 GB is not bad.
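To make that concrete, here's a minimal fine-tuning sketch in PyTorch. It freezes a pretrained backbone and trains only a new classification head, which is why the memory and time costs stay modest on a ~16 GB card. The dataset loader and the class count are placeholders you'd swap in for your own task:

```python
import torch
import torch.nn as nn
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a pretrained backbone and freeze it: no gradients, no optimizer
# state, far less memory than training end to end.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False

# Replace the head. num_classes=10 is an assumption for illustration.
model.fc = nn.Linear(model.fc.in_features, 10)
model = model.to(device)

# Only the new head's parameters are optimized.
optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_one_epoch(train_loader):
    # train_loader is assumed to yield (images, labels) batches.
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Unfreezing the last block or two once the head converges usually buys a bit more accuracy without blowing the memory budget.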
currentscurrents t1_j4jj1l6 wrote
It's a little discouraging when every interesting paper lists a cluster of 64 A100s in its methods section.
junetwentyfirst2020 t1_j4jkejb wrote
The original Vision Transformer paper is pretty clear that it works better at scale. You might not need a transformer for interesting work, though.
You can do so much with that GPU. Transformers are heavier models, but my background is in CNNs, and those work fine on your GPU.
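If you're unsure whether a given model fits, it's easy to measure directly rather than guess. Here's a rough sanity check, again a sketch: the batch size, input resolution, and model are assumptions you'd replace with your own setup.

```python
import torch
from torchvision import models

device = "cuda"
model = models.resnet50().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One dummy batch at the size you actually plan to train with.
x = torch.randn(32, 3, 224, 224, device=device)
y = torch.randint(0, 1000, (32,), device=device)

# Run one full forward/backward/step and read the peak allocation.
torch.cuda.reset_peak_memory_stats()
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
print(f"peak memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```

If the peak comes in over budget, halving the batch size or switching to mixed precision usually closes the gap.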