Submitted by NinjaUnlikely6343 t3_10kecyc in deeplearning
suflaj t1_j5r5bfw wrote
Reply to comment by NinjaUnlikely6343 in Efficient way to tune a network by changing hyperparameters? by NinjaUnlikely6343
For the learning rate, you should just pick a good starting point based on the batch size and architecture and relegate everything else to the scheduler and optimizer. I don't think there's any point in messing with the learning rate once you find one that doesn't blow up your model; just use warmup or plateau schedulers to manage it for you after that.
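A minimal sketch of what warmup can look like in PyTorch, assuming you manage it with the built-in `LinearLR` scheduler (my own illustration; the comment doesn't name a specific warmup implementation, and the model here is a placeholder):

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model, stands in for your network
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# LinearLR ramps the LR from 10% of its base value up to 100% over the
# first 5 epochs, so an aggressive starting LR can't blow up early training.
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.1, total_iters=5
)

for epoch in range(5):
    # ... one epoch of training would go here ...
    warmup.step()  # advance the warmup schedule once per epoch
```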
Since you mentioned Inception, I believe that unless you are using quite big batch sizes, your starting LR should be the magical 3e-4 for Adam or 1e-2 for SGD. Then just use a ReduceLROnPlateau scheduler with, e.g., a patience of 3 epochs, a cooldown of 2, and a factor of 0.1, and probably employ early stopping if the metric doesn't improve after 6 epochs.
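A minimal PyTorch sketch of that recipe, assuming you monitor validation loss (PyTorch has no built-in EarlyStopping, so the counter below is a hand-rolled stand-in, and the model and the random "validation loss" are placeholders):

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Cut the LR by 10x (factor=0.1) when the monitored metric hasn't improved
# for 3 epochs, then wait 2 epochs (cooldown) before counting again.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=3, cooldown=2
)

# Hand-rolled early stopping: quit if the validation loss hasn't
# improved in 6 consecutive epochs.
best_loss, epochs_without_improvement = float("inf"), 0

for epoch in range(100):
    # ... one epoch of training would go here ...
    val_loss = float(torch.rand(1))  # stand-in for a real validation pass
    scheduler.step(val_loss)  # the scheduler watches the metric, not epochs

    if val_loss < best_loss:
        best_loss, epochs_without_improvement = val_loss, 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= 6:
            break  # early stop
```

Note that `scheduler.step()` takes the metric as an argument here, unlike epoch-based schedulers that are stepped without one.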
NinjaUnlikely6343 OP t1_j5rjhvi wrote
Thanks a lot! I'll try that and keep you posted