Submitted by itsstylepoint t3_z1vh52 in MachineLearning

Hi folks,

stylepoint here.

I am almost done implementing traditional ML models and approaches and, as promised, will be moving on to more advanced models and techniques. Not that I have implemented every single traditional ML model, but I think this should be enough for the time being (implemented Gaussian Naive Bayes, K-Nearest Neighbors, Linear Regression, Logistic Regression, and K-Means Clustering using NumPy).
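
To give a sense of what "from scratch with NumPy" means here, a minimal k-Nearest Neighbors sketch (the function name and interface are illustrative, not lifted from the repo):

    import numpy as np

    def knn_predict(X_train, y_train, X_test, k=3):
        # Euclidean distance between every test point and every training point
        dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
        # Indices of the k nearest training points for each test point
        nearest = np.argsort(dists, axis=1)[:, :k]
        # Majority vote over the neighbors' integer class labels
        return np.array([np.bincount(y_train[idx]).argmax() for idx in nearest])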

The list I currently have in mind:

  1. VGG models (image/signal classification)
  2. Two-Tower Models (recommender systems)
  3. Autoencoders (compression and embedding generation; see the sketch after this list)
  4. Siamese Neural Network (similarity and few-shot learning)
  5. Prototypical Networks (few-shot learning)
  6. Enc-Dec, Enc-Enc, Dec-Dec Transformers (translation, generation, etc.)
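
For item 3, here is a minimal PyTorch autoencoder sketch (the layer sizes are illustrative, not from the planned implementation):

    import torch.nn as nn

    class Autoencoder(nn.Module):
        def __init__(self, in_dim=784, latent_dim=32):
            super().__init__()
            # Encoder compresses the input into a low-dimensional embedding
            self.encoder = nn.Sequential(
                nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
            # Decoder reconstructs the input from the embedding
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, in_dim))

        def forward(self, x):
            z = self.encoder(x)     # embedding, reusable downstream
            return self.decoder(z)  # reconstruction, trained with e.g. MSE loss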

Let me know what you folks think would be helpful (is my list good enough?). More exotic models are also welcome. It does not have to be a model either - it can be a neat technique, for example.

All of the videos are and will be available on my YouTube channel. Implementations are and will be in this GitHub repo.

NOTE: "from scratch" here means using NumPy or PyTorch. Using tools provided by these libraries is okay for basic constructs that are not too difficult to implement or for those I have already made a video about.

26

Comments

MUSEy69 t1_ixd12bw wrote

Great work! Why don't you try Stable Diffusion? I think the topic has enough momentum to boost your channel.

8

airelfacil t1_ixe63tk wrote

How about Gradient Boosting models such as XGBoost? On that note, I'm seeing a lack of decision tree/random forest models lol

3

xl0 t1_ixegyz5 wrote

Don't just "implement" the models - implement the training loop in "pure" PyTorch, including mixed precision, gradient accumulation and metrics. It's not super hard but gives much-needed insight into why higher-level frameworks (like fastai or lightning) do things the way they do them.
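
Something along these lines - a pure PyTorch loop with mixed precision and gradient accumulation (toy model and data, assumes a CUDA device):

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2).cuda()                  # toy model
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    scaler = torch.cuda.amp.GradScaler()
    accum_steps = 4                                  # accumulate over 4 mini-batches

    for step in range(100):
        x = torch.randn(32, 10, device="cuda")       # stand-in mini-batch
        y = torch.randint(0, 2, (32,), device="cuda")
        with torch.cuda.amp.autocast():              # forward pass in mixed precision
            loss = nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss / accum_steps).backward()  # scaled to avoid fp16 underflow
        if (step + 1) % accum_steps == 0:
            scaler.step(opt)                         # unscales grads, then steps
            scaler.update()                          # adjusts the loss scale
            opt.zero_grad()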

And then actually get the models to train and see if you can replicate the results in the papers, at least some of them. You can train on smaller datasets like Imagenette instead of ImageNet if you don't have the resources. If you can spend some money, vast.ai is good for relatively long-running tasks.

7

itsstylepoint OP t1_ixeisda wrote

Yes, that is how it usually works with my impls! (check out a few vids)

As for mixed precision and metrics - I will be making separate vids for both and of course, for every implemented model, will try to find a dataset to demo train/eval.

It is cool that you mentioned mixed precision as I already have the materials ready for this vid - will be discussing mixed precision, quantization (post-training and quantization aware training), pruning, etc. Improving perf!
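
As a taste of the quantization vid, post-training dynamic quantization is only a few lines in PyTorch (the model here is a toy stand-in):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
    quantized = torch.quantization.quantize_dynamic(
        model,              # trained float32 model
        {nn.Linear},        # layer types to quantize
        dtype=torch.qint8,  # 8-bit integer weights
    )
    print(quantized)        # Linear layers now use dynamic int8 kernels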

4

xl0 t1_ixekt7a wrote

Cool, had a glance at a couple of your videos. They are pretty good, the production quality is good enough, and the explanations are clear.

One suggestion - maybe you could use notebooks? Can't overstate the importance of being able to interact with the code and visualize the data bit by bit as you write it. It makes it much easier to follow and understand what's going on.

2

itsstylepoint OP t1_ixenran wrote

Hey thanks.

I am not a big fan of notebooks and rarely use them. When I do, I prefer using VS Code notebooks. So maybe I will make a few vids with notebooks in the future, but will likely stick to Neovim.

P.S. As for loss plots, monitoring performance, and those kinds of things, I prefer using tools like WandB, TensorBoard, etc. Will be covering those as well.
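
For instance, logging a loss curve to TensorBoard takes only a few lines (the values here are placeholders):

    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter("runs/demo")  # view with `tensorboard --logdir runs`
    for step in range(100):
        loss = 1.0 / (step + 1)          # stand-in for a real training loss
        writer.add_scalar("train/loss", loss, step)
    writer.close()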

2

broadenandbuild t1_ixfq17i wrote

Multimodal models that take in tabular + text + image data

2

biophysninja t1_ixfx9yv wrote

Andrej Karpathy has been creating amazing videos on his channel implementing language models from scratch. If you can create videos at that level of accessibility while presenting fundamental concepts, you will make a difference.

1

PrimaCora t1_ixg4a8p wrote

Popped in to say something similar. Giving it a dataset that isn't half improperly cropped images, with proper tags (like from boorus), could help, but the initial cost is massive.

1

KingsmanVince t1_ixh5snh wrote

Faster R-CNN with a Feature Pyramid Network

1