hollow_sets OP t1_j2n6nmg wrote on January 2, 2023 at 4:07 PM

Reply to comment by Tgs91 in [D] What do you do while you wait for training? by hollow_sets

Academia for now
I'm a student (bachelor's), and no one wants someone with just a bachelor's, so I can't really enter the industry properly even if I want to.

Tgs91 t1_j2na3mm wrote on January 2, 2023 at 4:32 PM

As a student, you should take the time to work on code cleanup. I usually see students use one big training script that has a lot going on. For my projects I typically build out a pip-installable module with submodules for preprocessing and structuring raw data, model building with lots of kwargs so it can be customized, dataset objects with transformations or randomness for efficient batch loading, and so on. My actual training scripts are only a few lines of code: hyperparameters in all caps at the top, import functions from my module, and call the functions.

My modules are also written so that employees of various skill levels can contribute to the project. Another colleague and I do all of the more advanced AI work, but any member of the team can be a USER of the module, and we have more general data scientists who can contribute to preprocessing code, containerization, post-processing tools, etc.
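
A minimal sketch of that layout, not the commenter's actual code. Every name here (`preprocess`, `BatchDataset`, `build_model`, `train`, and the package name `myproject` mentioned in the comments) is hypothetical, and the "model" is a toy 1-D linear fit in pure Python so the example runs without any ML library. In a real project the helper definitions would sit in the installable package and the script would only import them.

```python
import random

# --- Would normally live in a pip-installable package (hypothetical name:
# --- myproject) and be imported; defined inline so this sketch runs alone.

def preprocess(raw_rows):
    """Turn raw (x, y) string pairs into float tuples."""
    return [(float(x), float(y)) for x, y in raw_rows]


class BatchDataset:
    """Yield shuffled mini-batches from a list of (x, y) pairs."""

    def __init__(self, rows, batch_size, seed=0):
        self.rows = list(rows)
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __iter__(self):
        rows = self.rows[:]
        self.rng.shuffle(rows)  # new order on every pass over the data
        for i in range(0, len(rows), self.batch_size):
            yield rows[i:i + self.batch_size]


def build_model(**kwargs):
    """Return parameters for a 1-D linear model y = w * x + b."""
    return {"w": kwargs.get("init_w", 0.0), "b": kwargs.get("init_b", 0.0)}


def train(model, dataset, epochs, lr):
    """Fit the linear model with mini-batch gradient descent on squared error."""
    for _ in range(epochs):
        for batch in dataset:
            grad_w = grad_b = 0.0
            for x, y in batch:
                err = model["w"] * x + model["b"] - y
                grad_w += 2 * err * x / len(batch)
                grad_b += 2 * err / len(batch)
            model["w"] -= lr * grad_w
            model["b"] -= lr * grad_b
    return model


# --- The training script itself: hyperparameters on top, then a few calls ---
EPOCHS = 200
BATCH_SIZE = 4
LEARNING_RATE = 0.02

# Toy raw data following y = 2x + 1, stored as strings to mimic a raw file.
RAW = [(str(x), str(2 * x + 1)) for x in range(-4, 5)]


def main():
    data = BatchDataset(preprocess(RAW), batch_size=BATCH_SIZE)
    model = build_model(init_w=0.0, init_b=0.0)
    return train(model, data, epochs=EPOCHS, lr=LEARNING_RATE)


if __name__ == "__main__":
    print(main())
```

With the helpers moved into a package, everything above the hyperparameters would shrink to a single import line, which is what keeps the script readable for teammates who only need to use the module.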

Even if you don't do a full module, make a utils.py file to pull out any long pieces of code and write them as importable functions. Use docstrings for every function, following Google's docstring style guide (or use the autoDocstring extension for VS Code, it's great). Use a linter like flake8 and a formatter like black to make sure your code looks clean and professional. This all seems like minor, tedious stuff, but if you have to go back and edit or maintain code you wrote a year ago, it's a lifesaver. It also means that in an industry environment, a coworker can step in and easily understand and edit your code. It might not make a functional difference to you right now, but good, clean, professional code is great on a resume.
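
As a rough illustration of what one entry in such a utils.py could look like, here is a single helper with a Google-style docstring. The function `moving_average` is a made-up example (say, for smoothing a logged loss curve), not something from the original comment.

```python
def moving_average(values, window):
    """Compute the simple moving average of a sequence.

    Args:
        values: Sequence of numbers, e.g. per-step training losses.
        window: Number of consecutive values to average; must be >= 1.

    Returns:
        A list of floats with max(len(values) - window + 1, 0) entries.

    Raises:
        ValueError: If window is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    # Slide a fixed-size window over the input and average each slice.
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

A training script in the same folder could then pull it in with `from utils import moving_average`, and the docstring shows up in editor tooltips and in `help(moving_average)`.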

hollow_sets OP t1_j2ndrb9 wrote on January 2, 2023 at 4:56 PM

This sounds like a good plan to do while I wait for the model to train.

I'll start tomorrow (since it's 10:30 pm and I feel like I've burnt myself out for the day fixing errors). Hope no more errors pop up while I sleep.