Submitted by MyActualUserName99 t3_10mmniu in MachineLearning
MadScientist-1214 t1_j6433qc wrote
At my institute, nobody had trained on ImageNet, so I had to figure it out myself too. If you train architectures like VGG, it does not take long: under 2 days on a single A100, and at most 5 days on a worse GPU. The most important thing is to use an SSD; that alone cuts roughly 2 days off the training time. A good learning rate scheduler is really important. Most researchers ignore the test set and use only the validation set. Also important: use mixed precision. You should really tune the training speed if you need to run a lot of experiments.
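For reference, here is a minimal PyTorch sketch of the kind of loop the comment describes, combining mixed precision with an epoch-level learning rate scheduler. The VGG-16 model, SGD hyperparameters, and 90-epoch cosine schedule are illustrative assumptions, not details from the comment.

```python
# Minimal sketch: mixed-precision ImageNet-style training with an LR scheduler.
# Model choice and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

device = torch.device("cuda")
model = models.vgg16(num_classes=1000).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=90)  # assumed 90 epochs
scaler = torch.cuda.amp.GradScaler()  # gradient scaling for mixed precision

def train_one_epoch(loader):
    model.train()
    for images, targets in loader:  # loader should read from fast local SSD storage
        images = images.to(device, non_blocking=True)
        targets = targets.to(device, non_blocking=True)
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():          # FP16 forward pass and loss
            loss = criterion(model(images), targets)
        scaler.scale(loss).backward()            # scaled backward to avoid FP16 underflow
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()                             # step the LR schedule once per epoch
```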