zbyte64 t1_j74y5o9 wrote
What kind of hardware do I need to train this?
__lawless t1_j76wlwb wrote
They did it on 4 V100 GPUs with 32GB of memory each
Balance- t1_j8copxz wrote
Damn, imagine what happens when you throw an A100 or H100 datacenter at it for a few months
dancingnightly t1_j76t0gh wrote
In theory, training T5 alongside the image embedding models they use (primarily DETR?) shouldn't take much more than a 3090 or a Colab Pro GPU. You could already train T5s on high-end consumer GPUs in 2020, for example, but the DETR image model probably needs to be run for each image at the same time, and the two together might take up quite a bit of GPU memory. The `main.py` script looks like a nice, fairly short, typical training script. You should be able to run it quickly: download their repo, pull the ScienceQA dataset, and pass the training args to see if it crashes.
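For a rough sanity check before renting anything, here is a back-of-the-envelope sketch of how much GPU memory the optimizer state alone would need. Everything in it is an assumption on my part, not something taken from their repo: the parameter counts are the approximate published sizes for T5-base and T5-large, the 16 bytes per parameter assumes plain fp32 training with Adam (weights, gradients, and two moment buffers), and activations, the image features, and batch size are ignored entirely, so treat the result as a lower bound.

```python
# Rough lower bound on training memory: weights + gradients + Adam moments.
# Assumption: fp32 everywhere, 4 bytes each for weight, grad, Adam m, Adam v.
BYTES_PER_PARAM = 4 + 4 + 4 + 4

# Approximate published parameter counts (assumed, not read from the repo).
MODEL_PARAMS = {
    "t5-base": 220_000_000,
    "t5-large": 770_000_000,
}

# Memory per card in GB for the GPUs mentioned in this thread.
GPU_MEMORY_GB = {
    "RTX 3090": 24,
    "V100 32GB": 32,
}


def training_state_gb(num_params: int) -> float:
    """Memory in GB for weights, gradients, and Adam state (no activations)."""
    return num_params * BYTES_PER_PARAM / 1e9


def fits(num_params: int, gpu_gb: float) -> bool:
    """True if the training state alone fits on a single card."""
    return training_state_gb(num_params) < gpu_gb


if __name__ == "__main__":
    for model, params in MODEL_PARAMS.items():
        need = training_state_gb(params)
        for gpu, have in GPU_MEMORY_GB.items():
            verdict = "fits" if fits(params, have) else "does not fit"
            print(f"{model}: ~{need:.2f} GB of state, {verdict} on {gpu} ({have} GB)")
```

Under those assumptions the state for either size fits on a single 24 GB card with room to spare, so whether it actually trains comes down to activations and batch size, which is what running the script would tell you.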