Submitted by hollow_sets t3_101a9gd in MachineLearning
hollow_sets OP t1_j2mh9yh wrote
Reply to comment by vidret in [D] What do you do while you wait for training? by hollow_sets
Well, well, guess what: I just got a kill signal at evaluation, hohoho. (I am evaluating the model after every 1000 steps, and it takes approximately 5 hours to get through each 1000 steps.) This was the first eval check, so fuck.
Competitive-Rub-1958 t1_j2mxgnz wrote
Use `%` (modulo) on the step counter so an eval check also runs before you start training (i.e., at the 0th step). It saves a ton of debugging time, because something always goes wrong.
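A minimal sketch of what that could look like, assuming a plain step-based loop. `train_step` and `evaluate` are hypothetical callbacks standing in for your own code, not any particular framework's API:

```python
from typing import Callable, List


def train(
    num_steps: int,
    eval_every: int,
    train_step: Callable[[int], None],  # hypothetical: runs one optimizer step
    evaluate: Callable[[int], None],    # hypothetical: runs the eval loop
) -> List[int]:
    """Run training and return the steps at which evaluation ran."""
    eval_steps = []
    for step in range(num_steps + 1):
        # 0 % eval_every == 0, so evaluation runs once before any training.
        # A crash in the eval code shows up right away, not hours later.
        if step % eval_every == 0:
            evaluate(step)
            eval_steps.append(step)
        # The last iteration only evaluates the final model.
        if step < num_steps:
            train_step(step)
    return eval_steps
```

With `num_steps=3000` and `eval_every=1000`, evaluation runs at steps 0, 1000, 2000 and 3000. The step-0 numbers are useless as metrics, but they prove the eval path works end to end.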
hollow_sets OP t1_j2mzo2p wrote
Yeah, thanks for the advice :D (I was going to wait like an idiot.) Fixed it now, and it seems to be running properly.
JustOneAvailableName t1_j2nhjks wrote
Personally, I also like to eval way more often than every 5 hours. Perhaps use a smaller eval subset and run it every hour?
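Roughly what I mean, as a sketch only. It assumes the eval set can be indexed by integer position; the function names and the `quick_every` / `full_every` parameters are made up for illustration:

```python
import random
from typing import List, Optional


def make_eval_subset(dataset_size: int, subset_size: int, seed: int = 0) -> List[int]:
    """Pick a fixed random subset of eval indices.

    The seed is fixed so every quick eval scores the same examples,
    which keeps the numbers comparable across checkpoints.
    """
    rng = random.Random(seed)
    k = min(subset_size, dataset_size)
    return sorted(rng.sample(range(dataset_size), k))


def pick_eval_indices(
    step: int,
    dataset_size: int,
    subset: List[int],
    quick_every: int,
    full_every: int,
) -> Optional[List[int]]:
    """Return the eval indices to use at this step, or None for no eval."""
    if step % full_every == 0:
        return list(range(dataset_size))  # occasional full evaluation
    if step % quick_every == 0:
        return subset                     # frequent, cheap evaluation
    return None
```

Size the subset so the quick eval takes a few minutes, and keep the full eval for the less frequent checkpoints.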
hollow_sets OP t1_j2nirn7 wrote
Sounds fair enough. The current evaluation takes about 1.5 hours, so I didn't go ahead with an hourly evaluation plan.