v2thegreat t1_j2lpumb wrote
Reply to comment by i_likebrains in [D] Simple Questions Thread by AutoModerator
These come under hyperparameter optimization, so you will definitely need to play around with them, but here are my rules of thumb (take them with a grain of salt!):
Learning rate: start with a large learning rate (e.g. 1e-3), and if the model overfits, reduce it step by step, down to around 1e-6. There's a Stack Overflow answer that explains this quite well.
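A rough sketch of what that sweep can look like (Keras here just as an example; the toy data, tiny model, and exact learning rates are placeholders, not anything from a real project):

```python
import numpy as np
import tensorflow as tf

# Toy data purely for illustration -- swap in your own dataset.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

def build_model(learning_rate):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
    )
    return model

# Try a few learning rates, largest first, and compare validation loss.
for lr in [1e-3, 1e-4, 1e-5, 1e-6]:
    model = build_model(lr)
    history = model.fit(X, y, epochs=10, validation_split=0.2, verbose=0)
    print(f"lr={lr:.0e}  final val_loss={history.history['val_loss'][-1]:.4f}")
```

Whichever learning rate gives the lowest validation loss (without the loss blowing up or stalling) is usually a good starting point.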
Number of epochs: the right number is just before your training loss starts diverging from the validation loss. Plot both curves, and the point where they diverge is where the overfitting happens (see the sketch below).
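Something like this, again with made-up toy data and a tiny Keras model just to show the mechanics:

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Toy data again -- replace with your own dataset.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy")

# Deliberately train for "too many" epochs, holding out 20% for validation.
history = model.fit(X, y, epochs=100, validation_split=0.2, verbose=0)

# The epoch where val_loss stops tracking the training loss is roughly where
# overfitting starts, so that's about how many epochs you actually want.
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```

In practice you can also let the framework pick the stopping point for you, e.g. with `tf.keras.callbacks.EarlyStopping(monitor="val_loss", restore_best_weights=True)` passed to `model.fit`.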
Batch size: as large as will still fit in memory, which generally speeds up training.
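A quick way to see the speed difference is to time a few batch sizes (same kind of toy setup as above; the sizes here are just examples):

```python
import time
import numpy as np
import tensorflow as tf

X = np.random.rand(5000, 20).astype("float32")
y = np.random.randint(0, 2, size=(5000,)).astype("float32")

def build_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="binary_crossentropy")
    return model

# Larger batches mean fewer (and usually faster) steps per epoch, as long as
# a batch still fits in GPU/CPU memory. If you hit an out-of-memory error,
# halve the batch size and try again.
for batch_size in [32, 128, 512]:
    model = build_model()
    start = time.time()
    model.fit(X, y, epochs=3, batch_size=batch_size, verbose=0)
    print(f"batch_size={batch_size:>4}  {time.time() - start:.1f}s for 3 epochs")
```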