techlos t1_ir131zp wrote on October 4, 2022 at 3:52 PM

Two things you can do are early stopping and training on a subset of your dataset.

In my experience, the hyperparams that converge best within 3 to 5 epochs usually also converge pretty well on a full training run. That won't guarantee the best final performance, but if you're on a budget it's a great compromise.
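
To make that concrete, here's a rough sketch of the idea in plain Python. Everything in it is made up for illustration: the toy linear-regression data, the `train` and `short_budget_search` helpers, the 20% subset, and the candidate learning rates are all assumptions, not anything from a real library. Stopping every trial at a fixed 3 epochs stands in for early stopping here; a real setup would more likely watch validation loss.

```python
import random

def make_data(n, seed=0):
    # Hypothetical toy dataset: y = 3x + 0.5 plus a little Gaussian noise.
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 0.5 + rng.gauss(0.0, 0.05)) for x in xs]

def mse(w, b, data):
    # Mean squared error of the linear model w*x + b on `data`.
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(data, lr, epochs, seed=0):
    # Plain per-sample SGD on squared error, starting from w = b = 0.
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)  # copy so shuffling does not touch the caller's list
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            err = w * x + b - y
            w -= lr * 2 * err * x
            b -= lr * 2 * err
    return w, b

def short_budget_search(train_set, val_set, candidates,
                        subset_frac=0.2, epochs=3, seed=0):
    # Cheap trials: train each candidate on a random subset for a few
    # epochs only, then score it on the validation set.
    rng = random.Random(seed)
    k = max(1, int(len(train_set) * subset_frac))
    subset = rng.sample(train_set, k)
    scores = {}
    for lr in candidates:
        w, b = train(subset, lr, epochs, seed)
        scores[lr] = mse(w, b, val_set)
    best = min(scores, key=scores.get)
    return best, scores

data = make_data(1000)
train_set, val_set = data[:800], data[800:]
candidates = [0.0001, 0.001, 0.01, 0.1]

# Step 1: cheap search (20% of the data, 3 epochs per candidate).
best_lr, scores = short_budget_search(train_set, val_set, candidates)

# Step 2: one full run with the winner only.
w, b = train(train_set, best_lr, epochs=30)
final_val_mse = mse(w, b, val_set)
print(best_lr, final_val_mse)
```

The point is that the expensive 30-epoch run happens once, and the search itself costs a small fraction of that per candidate. The short runs only rank the candidates, so the winner isn't guaranteed to be the best setting for the full run.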