v2thegreat t1_j2lpumb wrote
Reply to comment by i_likebrains in [D] Simple Questions Thread by AutoModerator
These come under hyperparameter optimization, so you will definitely need to play around with them, but here are my rules of thumb (take them with a grain of salt!):
Learning rate: start with a large learning rate (e.g. 1e-3), and if the model overfits, reduce it step by step, down to around 1e-6. There's a Stack Overflow answer that explains this quite well.
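A rough sketch of what that sweep can look like (Keras here just as an example; the toy data, tiny model, and exact learning rates are placeholders, not anything from a real project):

```python
import numpy as np
import tensorflow as tf

# Toy data purely for illustration -- swap in your own dataset.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

def build_model(learning_rate):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
    )
    return model

# Try a few learning rates, largest first, and compare validation loss.
for lr in [1e-3, 1e-4, 1e-5, 1e-6]:
    model = build_model(lr)
    history = model.fit(X, y, epochs=10, validation_split=0.2, verbose=0)
    print(f"lr={lr:.0e}  final val_loss={history.history['val_loss'][-1]:.4f}")
```

Whichever learning rate gives the lowest validation loss (without the loss blowing up or stalling) is usually a good starting point.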
Number of epochs: the right number is just before your training loss starts diverging from the validation loss. Plot both curves, and the point where they diverge is where the overfitting happens (see the sketch below).
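Something like this, again with made-up toy data and a tiny Keras model just to show the mechanics:

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Toy data again -- replace with your own dataset.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy")

# Deliberately train for "too many" epochs, holding out 20% for validation.
history = model.fit(X, y, epochs=100, validation_split=0.2, verbose=0)

# The epoch where val_loss stops tracking the training loss is roughly where
# overfitting starts, so that's about how many epochs you actually want.
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```

In practice you can also let the framework pick the stopping point for you, e.g. with `tf.keras.callbacks.EarlyStopping(monitor="val_loss", restore_best_weights=True)` passed to `model.fit`.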
Batch size: as large as will still fit in memory, which generally speeds up training.
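A quick way to see the speed difference is to time a few batch sizes (same kind of toy setup as above; the sizes here are just examples):

```python
import time
import numpy as np
import tensorflow as tf

X = np.random.rand(5000, 20).astype("float32")
y = np.random.randint(0, 2, size=(5000,)).astype("float32")

def build_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="binary_crossentropy")
    return model

# Larger batches mean fewer (and usually faster) steps per epoch, as long as
# a batch still fits in GPU/CPU memory. If you hit an out-of-memory error,
# halve the batch size and try again.
for batch_size in [32, 128, 512]:
    model = build_model()
    start = time.time()
    model.fit(X, y, epochs=3, batch_size=batch_size, verbose=0)
    print(f"batch_size={batch_size:>4}  {time.time() - start:.1f}s for 3 epochs")
```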