Comments

VirtualHat t1_izczlg3 wrote

I have a system where I can go from idea to initial results in two hours, and to full results by the next day. I've found a short loop like this critical for testing the hundreds of ideas that come to mind.

4

SeucheAchat9115 t1_izdbkkz wrote

Try using smaller subsets of your data. It's very likely that the performance will scale with the amount of data afterwards, so results on the subset are a good early signal.
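
For example, a minimal sketch using PyTorch's Subset wrapper (CIFAR-10 and the 10% fraction are purely illustrative, since the original dataset isn't specified):

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms  # illustrative dataset only

# Load the full training set (CIFAR-10 here purely as an example).
full_train = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Take a random 10% subset for fast iteration; the fraction is a knob.
fraction = 0.10
n_small = int(len(full_train) * fraction)
indices = torch.randperm(len(full_train))[:n_small]
small_train = Subset(full_train, indices.tolist())

loader = DataLoader(small_train, batch_size=128, shuffle=True)
```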

11

VirtualHat t1_izfu724 wrote

I use three scripts:

train.py (which trains my model)

worker.py (which picks up the next job and runs it using train.py)

runner.py (which is basically a list of jobs, plus code to display what's happening)

I then have multiple machines running multiple instances of worker.py. When a new job is created, the workers see it and start processing it. Work is broken into 5-epoch blocks, and at the end of each block, a new job from the priority queue is selected.

This way I can simply add a new job, and within 30 minutes or so one of the workers will finish its current block and pick it up. Also, because of the chunking, I get early results on all the jobs rather than having to wait for them to finish. This is important, as I often know early on whether a run is worth finishing.

I evaluate the results in a Jupyter notebook using the logs that each job creates.
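
A minimal sketch of what worker.py's loop might look like, assuming jobs are JSON files in a shared directory and train.py exposes a train(config, epochs) function; both the file layout and that interface are assumptions, since the actual scripts aren't shown (file locking between concurrent workers is also omitted):

```python
import json
import time
from pathlib import Path

import train  # the train.py above; train.train(config, epochs) is an assumed interface

JOBS_DIR = Path("jobs")  # shared directory of pending jobs (assumed layout)
BLOCK_EPOCHS = 5         # work is chunked into 5-epoch blocks

def next_job():
    """Return the path of the highest-priority pending job, or None."""
    pending = sorted(
        JOBS_DIR.glob("*.json"),
        key=lambda p: json.loads(p.read_text()).get("priority", 0),
        reverse=True,
    )
    return pending[0] if pending else None

while True:
    job_path = next_job()
    if job_path is None:
        time.sleep(60)  # queue is empty; poll again in a minute
        continue
    job = json.loads(job_path.read_text())

    # Run one block, then return to the queue, so a newly added
    # high-priority job gets picked up between blocks.
    train.train(job["config"], epochs=BLOCK_EPOCHS)

    job["epochs_done"] = job.get("epochs_done", 0) + BLOCK_EPOCHS
    if job["epochs_done"] >= job["config"]["total_epochs"]:
        job_path.unlink()  # finished; drop the job from the queue
    else:
        job_path.write_text(json.dumps(job))  # requeue with progress saved
```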

edit: fixed links.

5

iamr0b0tx t1_izgdmkj wrote

Check out Weights & Biases. I believe it can help you manage multiple experiments. As for speed, you may be able to run them concurrently once you have them all set up separately. And as someone already mentioned, you can use a smaller dataset to make the process faster.
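
For instance, a minimal sketch of logging one experiment to W&B (the project name, config values, and stand-in metrics are all placeholders):

```python
import math
import random

import wandb

# One run per experiment; the config dict is logged and searchable in the UI.
run = wandb.init(
    project="idea-testing",  # placeholder project name
    config={"lr": 3e-4, "batch_size": 128, "subset_fraction": 0.1},
)

for epoch in range(10):
    # Stand-in metrics; replace with your real training/eval loop.
    train_loss = math.exp(-0.3 * epoch) + random.uniform(0, 0.05)
    val_acc = 1.0 - train_loss / 2
    wandb.log({"epoch": epoch, "train_loss": train_loss, "val_acc": val_acc})

run.finish()
```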

2

thundergolfer t1_izgiaa4 wrote

I'm sorry to shill, but Modal.com is easily the best thing for this. Here's a demo video showing how fast you can edit code, run it in the cloud, and then edit it some more, all in a handful of seconds.

I was the ML Platform lead at Canva and quick iteration was the #1 pain point of our data scientists and MLEs. I left Canva to join Modal because it can do heavy serverless compute and keep your inner dev loop tight.

Again, sorry to shill, but I've been in this sub for like 8 years and think tools like Modal and Metaflow are finally getting us to a place where ML development isn't a painful mess.

1

farmingvillein t1_izi021q wrote

True, but no one has really come up with a better methodology.

The best you can do is train on smaller data and make sure you can tell yourself a story about how the new technique will still help when the data is scaled up (and then hope that you are right).

(The latter is certainly an argument for staying at least semi-current with the literature, as it will help you build an intuition for what might scale up and what probably won't.)

2

mlisnifty t1_izk4hvw wrote

Yeah, I'd keep the data lineage for each project stored in something like CometML. I'd probably create a separate project for each idea, so multiple training runs live within each project. Then you've got all the charts you need to compare models within a project, along with hyperparameters, code, dependencies, and data, ready for you if you decide to come back to one of the projects after chasing something else for a month.
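
A minimal sketch of that per-idea layout with Comet's Python SDK (the project name and metrics are placeholders; the API key is read from the COMET_API_KEY environment variable):

```python
from comet_ml import Experiment

# One Comet project per idea; each run of this script becomes one
# experiment inside it, so runs are easy to compare side by side.
experiment = Experiment(
    project_name="idea-attention-pooling",  # placeholder per-idea project
)

experiment.log_parameters({"lr": 3e-4, "batch_size": 128})

for epoch in range(10):
    train_loss = 1.0 / (epoch + 1)  # stand-in metric; use your real loop
    experiment.log_metric("train_loss", train_loss, step=epoch)

experiment.end()
```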

2