Submitted by NinjaUnlikely6343 t3_10kecyc in deeplearning
Hello all!
Absolute noob here. I'm trying to optimize an image classifier built with transfer learning from InceptionV3 (cut at the 'mixed7' layer) and fine-tuned with a small convolutional network on top. So far, changing hyperparameters yields modest changes in performance, if any, and each attempt takes a prohibitive amount of time. Is there a way to systematically test multiple hyperparameter changes, rather than manually changing one at a time in incremental fashion?
suflaj t1_j5qb32y wrote
There is this: https://www.microsoft.com/en-us/research/blog/%C2%B5transfer-a-technique-for-hyperparameter-tuning-of-enormous-neural-networks/
However, it's unlikely to help in your case. The best thing you can do is grid search if you know something about the problem, or otherwise random search. I prefer random search even when I'm an expert on the problem, ESPECIALLY with ML models.
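For instance, here's a minimal random-search sketch in Keras (the `build_model` helper, the hyperparameter ranges, and the binary-classification head are all assumptions, not your setup; `x_small`/`y_small` and `x_val`/`y_val` stand in for the small tuning subsets sketched in the next paragraph):

```python
import random
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

def build_model(lr, dense_units, dropout):
    # Frozen InceptionV3 base cut at mixed7, small trainable head on top
    base = InceptionV3(weights="imagenet", include_top=False,
                       input_shape=(150, 150, 3))
    base.trainable = False
    x = layers.GlobalAveragePooling2D()(base.get_layer("mixed7").output)
    x = layers.Dense(dense_units, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    out = layers.Dense(1, activation="sigmoid")(x)  # assumes binary classes
    model = models.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

best_hp, best_acc = None, 0.0
for trial in range(20):  # 20 random draws instead of an exhaustive grid
    hp = {
        "lr": 10 ** random.uniform(-5, -2),           # log-uniform learning rate
        "dense_units": random.choice([64, 128, 256]),
        "dropout": random.uniform(0.2, 0.5),
    }
    model = build_model(**hp)
    # x_small/y_small and x_val/y_val are the small tuning subsets
    # (hypothetical arrays; see the split sketched below)
    hist = model.fit(x_small, y_small, validation_data=(x_val, y_val),
                     epochs=3, verbose=0)
    acc = max(hist.history["val_accuracy"])
    if acc > best_acc:
        best_hp, best_acc = hp, acc

print("best hyperparameters:", best_hp, "val accuracy:", best_acc)
```

The point of sampling the learning rate on a log scale is that its useful values span orders of magnitude; random search covers that range far more efficiently than a grid with the same budget.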
But I'm curious why it takes so long. You don't have to train on the whole dataset: take 10% for training and 10% for validation, or less if the dataset is huge. You just need enough data for the model to learn something; the hyperparameters found on the subset are then a good enough approximation of the optimal ones.
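A rough sketch of that split, using sklearn's `train_test_split` with hypothetical `x_all`/`y_all` arrays holding your full dataset:

```python
from sklearn.model_selection import train_test_split

# Carve off 20% of the full data, stratified so class proportions
# match the full set...
_, x_tune, _, y_tune = train_test_split(
    x_all, y_all, test_size=0.20, stratify=y_all, random_state=0)

# ...then split that into the 10% train / 10% validation subsets
x_small, x_val, y_small, y_val = train_test_split(
    x_tune, y_tune, test_size=0.50, stratify=y_tune, random_state=0)
```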
Also, it might help to not tune redundant hyperparameters at all. Layer sizes usually fall into that category, as does almost every hyperparameter in the Adam family of optimizers besides the learning rate and, to a lesser extent, the first momentum. Which ones are you optimizing?
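For Adam, that leaves a search space roughly like this (a sketch with illustrative ranges, not values tuned for your problem):

```python
import random
import tensorflow as tf

# Tune only what tends to matter for Adam: the learning rate (on a log
# scale) and optionally beta_1; leave beta_2 and epsilon at their defaults.
lr = 10 ** random.uniform(-5, -2)
beta_1 = random.choice([0.85, 0.9, 0.95])
optimizer = tf.keras.optimizers.Adam(learning_rate=lr, beta_1=beta_1)
```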