
hollow_sets OP t1_j2mh9yh wrote

Well, well, guess what: I just got a signal kill at evaluation, hohoho. (I am evaluating the model after every 1000 steps, and it takes approximately 5 hours to go through each 1000 steps.) This was the first eval check, so fuck.

4

Competitive-Rub-1958 t1_j2mxgnz wrote

Use the `%` (modulo) check so that an eval also runs before you start training (i.e. at the 0th step). It saves a ton of debugging time, because something always goes wrong.
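
Roughly what I mean, as a minimal sketch. The names `train`, `train_step`, `evaluate` and `eval_every` are made up for illustration and not from any particular library; plug in your own training and eval functions.

```python
def train(total_steps, eval_every, train_step, evaluate):
    """Run training and evaluate whenever step % eval_every == 0.

    Because 0 % eval_every == 0, the first evaluation happens before
    any training step, so a broken eval pipeline fails right away
    instead of after the first eval_every steps.
    """
    history = []
    for step in range(total_steps + 1):
        if step % eval_every == 0:
            # Fires at step 0, eval_every, 2 * eval_every, ...
            history.append((step, evaluate()))
        if step < total_steps:
            train_step()
    return history
```

With `eval_every=1000`, this evaluates at steps 0, 1000, 2000 and so on, so a crash in the eval code shows up within minutes rather than after 5 hours.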

9

hollow_sets OP t1_j2mzo2p wrote

Yeah, thanks for the advice :D (I was going to wait like an idiot.) Fixed it now, and it seems to be running properly.

3

JustOneAvailableName t1_j2nhjks wrote

Personally, I also like to eval way more often than every 5 hours. Perhaps use a smaller eval subset every hour?
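
Something like this would do for picking the subset, as a rough sketch. `make_eval_subset` and its parameters are hypothetical names, and it assumes your eval set can be indexed by integer position.

```python
import random


def make_eval_subset(num_examples, subset_size, seed=0):
    """Pick a fixed random subset of eval example indices.

    A fixed seed gives the same indices on every call, so scores from
    successive quick evals are comparable with each other.
    """
    rng = random.Random(seed)
    k = min(subset_size, num_examples)
    return sorted(rng.sample(range(num_examples), k))
```

You would run the quick eval on these indices every hour and keep the full eval for the less frequent checkpoints.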

4

hollow_sets OP t1_j2nirn7 wrote

Sounds fair enough. The current evaluation takes about 1.5 hours, so I didn't go ahead with an hourly evaluation plan.

3