Comments

VirtualHat t1_izczlg3 wrote

I have a system where I can go from idea to initial results in two hours, and to full results by the next day. I've found a short loop like this critical for testing the hundreds of ideas that come to mind.

4

SeucheAchat9115 t1_izdbkkz wrote

Try using smaller subsets of your data. It's very likely that the performance will scale with the amount of data afterwards, so results on the subset are a good early signal.
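
For example, a minimal sketch using PyTorch's Subset wrapper (CIFAR-10 and the 10% fraction are purely illustrative, since the original dataset isn't specified):

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms  # illustrative dataset only

# Load the full training set (CIFAR-10 here purely as an example).
full_train = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Take a random 10% subset for fast iteration; the fraction is a knob.
fraction = 0.10
n_small = int(len(full_train) * fraction)
indices = torch.randperm(len(full_train))[:n_small]
small_train = Subset(full_train, indices.tolist())

loader = DataLoader(small_train, batch_size=128, shuffle=True)
```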

11

VirtualHat t1_izfu724 wrote

I use three scripts:

train.py (which trains my model)

worker.py (which picks up the next job and runs it using train.py)

runner.py (which is basically a list of jobs, plus code to display what's happening)

I then have multiple machines running multiple instances of worker.py. When a new job is created, the workers see it and start processing it. Work is broken into 5-epoch blocks, and at the end of each block, a new job from the priority queue is selected.

This way I can simply add a new job, and within 30 minutes or so one of the workers will finish its current block and pick it up. Also, because of the chunking, I get early results on all the jobs rather than having to wait for them to finish. This is important, as I often know early on whether a run is worth finishing.

I evaluate the results in a Jupyter notebook using the logs that each job creates.
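
A minimal sketch of what worker.py's loop might look like, assuming jobs are JSON files in a shared directory and train.py exposes a train(config, epochs) function; both the file layout and that interface are assumptions, since the actual scripts aren't shown (file locking between concurrent workers is also omitted):

```python
import json
import time
from pathlib import Path

import train  # the train.py above; train.train(config, epochs) is an assumed interface

JOBS_DIR = Path("jobs")  # shared directory of pending jobs (assumed layout)
BLOCK_EPOCHS = 5         # work is chunked into 5-epoch blocks

def next_job():
    """Return the path of the highest-priority pending job, or None."""
    pending = sorted(
        JOBS_DIR.glob("*.json"),
        key=lambda p: json.loads(p.read_text()).get("priority", 0),
        reverse=True,
    )
    return pending[0] if pending else None

while True:
    job_path = next_job()
    if job_path is None:
        time.sleep(60)  # queue is empty; poll again in a minute
        continue
    job = json.loads(job_path.read_text())

    # Run one block, then return to the queue, so a newly added
    # high-priority job gets picked up between blocks.
    train.train(job["config"], epochs=BLOCK_EPOCHS)

    job["epochs_done"] = job.get("epochs_done", 0) + BLOCK_EPOCHS
    if job["epochs_done"] >= job["config"]["total_epochs"]:
        job_path.unlink()  # finished; drop the job from the queue
    else:
        job_path.write_text(json.dumps(job))  # requeue with progress saved
```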

edit: fixed links.

5

iamr0b0tx t1_izgdmkj wrote

Check out Weights & Biases. I believe it can help you manage multiple experiments. As for speed, you may be able to run them concurrently once you have them all set up separately. And as someone already mentioned, you can use a smaller dataset to make the process faster.
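
For instance, a minimal sketch of logging one experiment to W&B (the project name, config values, and stand-in metrics are all placeholders):

```python
import math
import random

import wandb

# One run per experiment; the config dict is logged and searchable in the UI.
run = wandb.init(
    project="idea-testing",  # placeholder project name
    config={"lr": 3e-4, "batch_size": 128, "subset_fraction": 0.1},
)

for epoch in range(10):
    # Stand-in metrics; replace with your real training/eval loop.
    train_loss = math.exp(-0.3 * epoch) + random.uniform(0, 0.05)
    val_acc = 1.0 - train_loss / 2
    wandb.log({"epoch": epoch, "train_loss": train_loss, "val_acc": val_acc})

run.finish()
```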

2

thundergolfer t1_izgiaa4 wrote

I'm sorry to shill, but Modal.com is easily the best thing for this. Here's a demo video showing how fast you can edit code, run it in the cloud, and then edit it some more, all in a handful of seconds.

I was the ML Platform lead at Canva and quick iteration was the #1 pain point of our data scientists and MLEs. I left Canva to join Modal because it can do heavy serverless compute and keep your inner dev loop tight.

Again, sorry to shill, but I've been in this sub for like 8 years and think tools like Modal and Metaflow are finally getting us to a place where ML development isn't a painful mess.

1

farmingvillein t1_izi021q wrote

True, but no one has really come up with a better methodology.

The best you can do is train on smaller data and make sure you can tell yourself a story about how the new technique will still help when the data is scaled up (and then hope that you are right).

(The latter is certainly an argument for staying at least semi-current with the literature, as it will help you build an intuition for what might scale up and what probably won't.)

2

mlisnifty t1_izk4hvw wrote

Yeah, I'd keep the data lineage for each project stored in something like CometML. I'd probably create a separate project for each idea, so multiple training runs live within each project. Then you've got all the charts you need to compare models within a project, along with hyperparameters, code, dependencies, and data, ready for you if you decide to come back to one of the projects after chasing something else for a month.
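
A minimal sketch of that per-idea layout with Comet's Python SDK (the project name and metrics are placeholders; the API key is read from the COMET_API_KEY environment variable):

```python
from comet_ml import Experiment

# One Comet project per idea; each run of this script becomes one
# experiment inside it, so runs are easy to compare side by side.
experiment = Experiment(
    project_name="idea-attention-pooling",  # placeholder per-idea project
)

experiment.log_parameters({"lr": 3e-4, "batch_size": 128})

for epoch in range(10):
    train_loss = 1.0 / (epoch + 1)  # stand-in metric; use your real loop
    experiment.log_metric("train_loss", train_loss, step=epoch)

experiment.end()
```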

2