
currentscurrents t1_j4jj1l6 wrote

It's a little discouraging when every interesting paper has a cluster of 64 A100s in their methods section.

6

junetwentyfirst2020 t1_j4jkejb wrote

The first image transformer paper is pretty clear that it works better at scale. You might not need a transformer for interesting work, though.

You can do so much with that GPU. I think transformers are heavier models, but my background is in CNNs, and those work fine on your GPU.

2