Submitted by jrmylee t3_1056uhp in MachineLearning
Hi r/machinelearning!
​
A few months ago I quit my job to join my partners to make training open-source models much faster and easier for engineers.
​
We're building Rubbrband. It's a web app that takes any ML repo off of GitHub, and gives you a Terminal and Jupyter Notebook in browser with dependencies and GPUs automatically set up.
​
Why did we build this?
My co-founders and I have been working on this because, as researchers, we found the dependency setup process super tedious and draining.
​
What's included?
- Automatic dependency setup for any GitHub Python repo
- Integrated Terminal and Notebooks
- A server with an Nvidia GPU
- Code explanations for functions
- Our pricing is simple at $75/month for 3 repos running at a time. First week is free.
​
I'd love to get your feedback on:
- Does the value we provide resonate with you? Would you try it out?
- Does dependency and environment setup take up a large chunk of your time?
We're currently working on acquiring more GPUs to onboard more users, but if you'd like access to the product please let me know.
​
Thank you very much in advance!
JackBlemming t1_j3997nh wrote
Couple thoughts:
Setting up an environment is typically harder than cloning the repo and running pip install on the requirements.txt file. Many Python packages require Linux system packages to be installed beforehand. Your service should ideally take care of this for me. Some obvious examples are OpenCV, CUDA/GPU drivers, mysqlclient, etc.
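To make the point about system packages concrete, here is a rough sketch of a pre-flight check that could run before pip install. It only uses the Python standard library (`ctypes.util.find_library` and `shutil.which`). The `SYSTEM_REQUIREMENTS` mapping and the `missing_system_deps` helper are hypothetical names made up for illustration, and the specific entries (libGL for OpenCV, libmysqlclient for mysqlclient, `nvidia-smi` for the GPU driver) are assumptions about a typical Linux box, not anything Rubbrband is known to do.

```python
import shutil
from ctypes.util import find_library

# Hypothetical list of system-level prerequisites (illustration only).
# "libraries" are shared libraries looked up by their short name;
# "executables" are commands expected on PATH.
SYSTEM_REQUIREMENTS = {
    "libraries": ["GL", "mysqlclient"],  # assumed needs of opencv-python / mysqlclient
    "executables": ["nvidia-smi"],       # assumed to ship with the NVIDIA driver
}


def missing_system_deps(requirements, find_lib=find_library, which=shutil.which):
    """Return the entries in `requirements` that cannot be found on this machine.

    `find_lib` and `which` are injectable so the check can be tested
    without depending on what is actually installed.
    """
    missing = []
    for name in requirements.get("libraries", []):
        if find_lib(name) is None:
            missing.append(f"lib:{name}")
    for name in requirements.get("executables", []):
        if which(name) is None:
            missing.append(f"bin:{name}")
    return missing


if __name__ == "__main__":
    missing = missing_system_deps(SYSTEM_REQUIREMENTS)
    if missing:
        print("Install these system packages before pip install:", ", ".join(missing))
    else:
        print("System prerequisites found; pip install should have a fair chance.")
```

A check like this only detects what is missing. Actually installing the packages still depends on the distro's package manager, which is the part a hosted service would need to automate.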
Dataset management is the most annoying part of machine learning for me, not setting up environments, which is typically a Dockerfile or docker-compose file and maybe one shell script to bootstrap everything. By dataset management I mean letting my models access the dataset quickly, updating the dataset, etc. Ideally your service should make it easy to upload data to a dataset and then make it accessible to the training code. This is assuming you want to allow people to train models on the service.