junetwentyfirst2020 t1_j9rghm4 wrote
Reply to comment by suflaj in Why bigger transformer models are better learners? by begooboi
You’re being very loose with the word “noise” here.
junetwentyfirst2020 t1_j9bn2hn wrote
Reply to comment by No-Celebration6994 in Entry to a career in deep learning by No-Celebration6994
Look into computer vision courses on university websites and you should see the range.
junetwentyfirst2020 t1_j7y3cro wrote
Reply to Entry to a career in deep learning by No-Celebration6994
What do you want to be doing exactly at this job? It’s a fairly broad field, even with the specification of computer vision. I’ve usually seen computer vision broken down into: Capture, Perception, and 3D Reconstruction.
Deep learning usually happens in the Capture and Perception parts of the pipeline, because 3D Reconstruction is mostly geometry and linear algebra.
Is this what you want?
junetwentyfirst2020 t1_j7l7ctn wrote
Reply to [D] Should I focus on python or C++? by NoSleep19
It depends what you want to do. Deep Learning is pretty much all Python, but 3D Reconstruction is almost exclusively C++.
If you want to do robotics, do you want to do Deep Learning for robotics, or do you want to do 3DR? Same question for medical imaging.
Also, will what you want to work on run in a cloud service like GCP, or will it run on device? If it’s run on device, there’s pretty much a 100% chance it’s C++.
My plan is to know both.
junetwentyfirst2020 t1_j7l55ym wrote
Reply to comment by visarga in Does the high dimensionality of AI systems that model the real world tell us something about the abstract space of ideas? [D] by Frumpagumpus
That is so true
junetwentyfirst2020 t1_j7ivs75 wrote
Reply to comment by Sharchimedes in Does the high dimensionality of AI systems that model the real world tell us something about the abstract space of ideas? [D] by Frumpagumpus
Why the word “guessing”?
junetwentyfirst2020 t1_j7ivn59 wrote
Reply to comment by Ok_Listen_2336 in Does the high dimensionality of AI systems that model the real world tell us something about the abstract space of ideas? [D] by Frumpagumpus
I agree. It’s also important to remember that the brain is just the architecture definition and the mind is the model. The ML models and the mind model are unrelated, however.
junetwentyfirst2020 t1_j7c7gpj wrote
You should consider diving into the topic a little deeper. What you’re talking about is distributing the computation, which is already being done at some scale or another whenever there is more than one GPU or multiple machines. An everyday example is SETI@home, where you can donate your computer’s compute.
Your question about whether it can beat an existing implementation of GPT is the most complicated question ever posed in the history of humanity. It sounds like you’re assuming that this will have more compute than a dedicated system, but there’s a little more to getting something that performs better than just compute. Compute is a bottleneck, but only one of many.
junetwentyfirst2020 t1_j6g911v wrote
CS231n on YouTube. It’s a little bit older, but it has just about everything I’d ask if I were giving you a job interview.
junetwentyfirst2020 t1_j51r600 wrote
Reply to comment by tennismlandguitar in [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
I refuse to answer on the grounds that I may perjure myself.
junetwentyfirst2020 t1_j4z0oyt wrote
Reply to [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
🤫 they do. But there tend to be licensing issues, so they don’t.
junetwentyfirst2020 t1_j4wrzut wrote
Reply to Why a pretrained model returns better accuracy than the implementation from scratch by tsgiannis
The way I like to think about this is that the algorithm has to model many things. If you’re trying to learn whether the image contains a dog or not, first you have to model natural imagery, correlations between features, and maybe even a little 2D-to-3D to simplify invariances. I’m speaking hypothetically here, because the underlying model is quite latent and hard to inspect.
If you train from scratch, you need to do all of these tasks on a dataset that is likely much smaller than what’s required to do them without overfitting. If you use a pretrained model, instead of learning all of those tasks, you have a model that only has to learn one additional thing on the same amount of data.
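To make that concrete, here’s a rough sketch of what that looks like in PyTorch (assuming torchvision; the backbone choice, class count, and learning rate are just placeholders): freeze the pretrained weights that already model natural imagery, and train only a small new head for your one additional task.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on ImageNet, so it already "models natural imagery".
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze everything: the pretrained features are kept as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head: the one additional thing left to learn.
num_classes = 2  # placeholder, e.g. dog vs. not-dog
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters get optimized.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```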
junetwentyfirst2020 t1_j4wcwxz wrote
Reply to comment by Acceptable-Cress-374 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
It’s important to remember that these models are statistically robust. So while you may get the occasional false positive or false negative, that doesn’t reflect on the overall robustness of the model.
junetwentyfirst2020 t1_j4ntud9 wrote
Reply to [P] Looking for a CV/ML freelancer by bluebamboo3
Sure. I’ll need an iOS engineer as well, and God knows what models are supported on device currently, so it’ll be 250k and I’ll need 6 months.
junetwentyfirst2020 t1_j4n6amt wrote
Reply to [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
👍 very cool
junetwentyfirst2020 t1_j4jkejb wrote
Reply to comment by currentscurrents in [D] What kinds of interesting models can I train with just an RTX 4080? by faker10101891
The first vision transformer paper is pretty clear that it works better at scale. You might not need a transformer for interesting work, though.
You can do so much with that GPU. I think transformers are heavier models, but my background is in CNNs, and those work fine on your GPU.
junetwentyfirst2020 t1_j4jgu4t wrote
I’m not sure why you think that’s such a crummy graphics card. I’ve trained a lot of interesting things for grad school, and even in the workplace, on 4 GB less. If you’re fine-tuning, it’s not really going to take that long to get decent results, and 16 GB is not bad.
junetwentyfirst2020 t1_j4hqfvl wrote
Reply to comment by blacksnowboader in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
A Stanford University course taught by Andrej Karpathy. It’s a little older now, but I do think it covers important material. You can find it on YouTube.
junetwentyfirst2020 t1_j4h8p07 wrote
Reply to comment by 400Volts in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
If you want a job with the title Research in it, then you are 99% going to need top-tier conference publications from your master’s. Even one ICCV, ECCV, or CVPR paper should be enough, but those venues are very competitive. I wish I had known that a master’s is different from undergrad, because I was completely unprepared.
I’d suggest reading some research papers to gauge your math especially. Computer science contributions in ML/DL are basically applied math. Look up the papers noted in the course CS231n, and if you can’t get through them, you need to improve your math skills. I wish someone had told me this before my master’s, because my math sucked and it held me back significantly. It’s hard to do a master’s and play catch-up on math at the same time, because the master’s itself is a lot of work.
I have an undergrad and master’s in CS, a thesis on DL, and 3.5 years of industry experience as a Machine Learning/Computer Vision Engineer, and I don’t even bother applying for jobs that say Research in the title, because everyone in the world with a pub is applying for those same jobs.
You can do it if your math is solid (linear algebra, calculus, and probability). Knowing how to code is needed, but it’s not the most important thing, as you can tell by the horrible research code out there, so don’t rely solely on your software engineering skills.
junetwentyfirst2020 t1_j4h4p4s wrote
Reply to Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Do you already have a masters or not?
junetwentyfirst2020 t1_j3rocq5 wrote
I wish my intuitions were so good that I could find research papers where someone did it and it kicked butt. You should take some time to appreciate your brain.
junetwentyfirst2020 t1_j3kqzbm wrote
Reply to comment by vagartha in Building an NBA game prediction model - failing to improve between epochs by vagartha
Every time! Training can take a long time, so I’d hate to walk away and come back the next day to see it stuck 😭 This will work even if your labels are incorrect.
junetwentyfirst2020 t1_j3kmct1 wrote
Have you tried to overfit on a single piece of data to ensure that your model can actually learn? You should be able to get effectively 100% accuracy when overfitting. If you can’t do this, then you have a problem.
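If it helps, here’s a minimal sketch of that sanity check in PyTorch (the model, batch shapes, and label count are placeholders, swap in your own): train repeatedly on one fixed batch and make sure the loss collapses toward zero.

```python
import torch
import torch.nn as nn

# Placeholder model and a single fixed batch; swap in your own.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.randn(8, 1, 28, 28)    # one small batch, reused every step
y = torch.randint(0, 10, (8,))   # labels (don't even need to be correct)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Repeatedly train on the same batch; a healthy model should drive
# the loss toward zero and hit ~100% accuracy on this one batch.
for step in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    acc = (model(x).argmax(dim=1) == y).float().mean()
print(f"final loss {loss.item():.4f}, batch accuracy {acc.item():.2%}")
```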
junetwentyfirst2020 t1_j3hntqw wrote
Reply to comment by Yo_Soy_Jalapeno in [Discussion] Is there any alternative of deep learning ? by sidney_lumet
This should work, thank you :)
junetwentyfirst2020 t1_ja0wjzs wrote
Reply to comment by Brunt__ in [D] Looking for someone to do a small coding job by Brunt__
Most people who are able to get jobs in this field have an undergrad in computer science and a master’s degree. It’s applied math plus computer science, which is different from being a web developer. There are no people with these degrees struggling to find work currently, and they command relatively high salaries (>150k USD guaranteed).
You might be able to find a regular dev who could put this together, but if something doesn’t work out of the box, the chances that they’ll know how to address the problem are pretty much zero, because it’s not just a coding issue. We don’t even look at resumes that don’t have a master’s degree, because it really is important that the candidate can do all kinds of math, knows the family of algorithms, knows how to train DL models well, can explain why something did or didn’t work, can analyze data and results, and can also write efficient code. LOL it’s a stressful field 😝