Comments


Sundar1583 t1_ivpjqg3 wrote

In general you need to search for one: fit multiple models with learning rates spaced on a log scale and compare their performance.

If you just want to mess with the model, the default learning rate of 10^-3 usually works very well.
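A minimal sketch of that log-scale sweep in PyTorch (here `build_model`, `train_one_epoch`, and `evaluate` are placeholders for your own code, not a real API):

```python
import torch

# Try learning rates spaced on a log scale: 1e-1, 1e-2, ..., 1e-5.
learning_rates = [10 ** -e for e in range(1, 6)]

results = {}
for lr in learning_rates:
    model = build_model()                                    # placeholder
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(3):  # a few epochs is usually enough to compare
        train_one_epoch(model, optimizer)                    # placeholder
    results[lr] = evaluate(model)                            # placeholder: val loss

# Keep whichever learning rate gave the lowest validation loss.
best_lr = min(results, key=results.get)
print(f"best lr: {best_lr:.0e} (val loss {results[best_lr]:.4f})")
```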

3

Sundar1583 t1_ivpwl4d wrote

10^-3 would be the highest learning rate I'd recommend; it's just a starting point. Ideally, if you want to search for the best model performance, start at either 10^-3 or 10^-4 and go from there.

1

_Arsenie_Boca_ t1_ivqr1k6 wrote

It depends on the model and the task, so there is no general answer. But you don't have to search randomly. Plot your loss over time: if the lr is too high, the loss will behave almost randomly, and it's almost constant if the lr is too low.
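A minimal sketch of that diagnostic, assuming a standard PyTorch loop (`train_loader` and `training_step` are placeholders for your own data loader and step function):

```python
import matplotlib.pyplot as plt

# Record the loss at every training step, then eyeball the curve:
# erratic/noisy -> lr likely too high; nearly flat -> lr likely too low.
losses = []
for batch in train_loader:          # placeholder loader
    loss = training_step(batch)     # placeholder: returns a float loss
    losses.append(loss)

plt.plot(losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.show()
```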

1

ab_11nav t1_ivrjn4y wrote

You can try a decay rate. Usually, the starting learning rate divided by the number of epochs works well for me.
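One way to read that, as a sketch: a Keras-style time-based decay, lr_t = lr0 / (1 + decay * epoch), with the decay constant set to the starting lr divided by the epoch count (`model` and `train_one_epoch` are placeholders):

```python
import torch

initial_lr, num_epochs = 1e-3, 50
decay = initial_lr / num_epochs  # the "starting lr / number of epochs" rule

optimizer = torch.optim.SGD(model.parameters(), lr=initial_lr)  # model: placeholder
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 1.0 / (1.0 + decay * epoch)
)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)  # placeholder
    scheduler.step()                   # shrinks the lr a little each epoch
```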

1

emad_eldeen t1_ivw9rd5 wrote

There's no rule of thumb, but usually you use a lower learning rate for fine-tuning than the one used in pretraining.
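A sketch of what that can look like in PyTorch, using parameter groups so the pretrained weights train at a lower rate than a newly added head (`pretrained_backbone` and `new_head` are hypothetical module names, and the specific rates are just illustrative):

```python
import torch

# Fine-tune the pretrained part gently, let the fresh head move faster.
optimizer = torch.optim.Adam([
    {"params": pretrained_backbone.parameters(), "lr": 1e-5},  # placeholder module
    {"params": new_head.parameters(), "lr": 1e-4},             # placeholder module
])
```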

1

BrohammerOK t1_ivwkn3i wrote

Adam with lr=1e-4 for fine-tuning, with decay or decrease on plateau, always works pretty well for me with convnets.
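A sketch of that setup with PyTorch's built-in ReduceLROnPlateau, which cuts the lr when the validation loss stops improving (`model`, `train_one_epoch`, and `validate` are placeholders):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # model: placeholder
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)  # placeholder
    val_loss = validate(model)         # placeholder: returns a float
    scheduler.step(val_loss)  # lr *= 0.1 after 5 epochs without improvement
```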

2