Submitted by NinjaUnlikely6343 t3_10kecyc in deeplearning
suflaj t1_j5r5bfw wrote
Reply to comment by NinjaUnlikely6343 in Efficient way to tune a network by changing hyperparameters? by NinjaUnlikely6343
For the learning rate, you should just pick a good starting point based on the batch size and architecture and relegate everything else to the scheduler and optimizer. I don't think there's any point in messing with the learning rate once you find one that doesn't blow up your model; just use warmup or plateau schedulers to manage it for you after that.
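A minimal sketch of what warmup can look like in PyTorch, assuming you manage it with the built-in `LinearLR` scheduler (my own illustration; the comment doesn't name a specific warmup implementation, and the model here is a placeholder):

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model, stands in for your network
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# LinearLR ramps the LR from 10% of its base value up to 100% over the
# first 5 epochs, so an aggressive starting LR can't blow up early training.
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.1, total_iters=5
)

for epoch in range(5):
    # ... one epoch of training would go here ...
    warmup.step()  # advance the warmup schedule once per epoch
```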
Since you mentioned Inception, I believe that unless you are using quite big batch sizes, your starting LR should be the magical 3e-4 for Adam or 1e-2 for SGD. Then just use a ReduceLROnPlateau scheduler with, e.g., a patience of 3 epochs, a cooldown of 2, and a factor of 0.1, and probably employ early stopping if the metric doesn't improve after 6 epochs.
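A minimal PyTorch sketch of that recipe, assuming you monitor validation loss (PyTorch has no built-in EarlyStopping, so the counter below is a hand-rolled stand-in, and the model and the random "validation loss" are placeholders):

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Cut the LR by 10x (factor=0.1) when the monitored metric hasn't improved
# for 3 epochs, then wait 2 epochs (cooldown) before counting again.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=3, cooldown=2
)

# Hand-rolled early stopping: quit if the validation loss hasn't
# improved in 6 consecutive epochs.
best_loss, epochs_without_improvement = float("inf"), 0

for epoch in range(100):
    # ... one epoch of training would go here ...
    val_loss = float(torch.rand(1))  # stand-in for a real validation pass
    scheduler.step(val_loss)  # the scheduler watches the metric, not epochs

    if val_loss < best_loss:
        best_loss, epochs_without_improvement = val_loss, 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= 6:
            break  # early stop
```

Note that `scheduler.step()` takes the metric as an argument here, unlike epoch-based schedulers that are stepped without one.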
NinjaUnlikely6343 OP t1_j5rjhvi wrote
Thanks a lot! I'll try that and keep you posted