Submitted by LightGreenSquash t3_yrsqcz in MachineLearning
banmeyoucoward t1_iw6r361 wrote
You have to learn by doing, but you can do a surprising amount with small data, which means you can implement a paper and learn a whole lot faster since you aren't waiting on training. For example, if all you have is MNIST:
Supervised MLP classifier
Supervised convolutional classifier
Supervised transformer classifier
MLP GAN
Convolutional GAN
GAN regularizers (WGAN, WGAN-GP, etc.; https://avg.is.mpg.de/publications/meschedericml2018 is mandatory reading, and replicate its experiments if you want to work on GANs)
Variational Autoencoder
Vector-quantized variational autoencoder (VQ-VAE)
Diffusion model
Represent MNIST digits using an MLP that maps pixel (x, y) -> brightness (Kmart NeRF; a rough sketch is right after this list)
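To make that last item concrete, here's a minimal sketch of what a "Kmart NeRF" could look like in PyTorch. It fits a single digit, and all the specifics (layer sizes, step count, the sigmoid output) are just guesses to illustrate the idea:

```python
import torch
import torch.nn as nn

# Sketch: fit one MNIST digit with an MLP that maps normalized (x, y) -> brightness.

def fit_single_image(img: torch.Tensor, steps: int = 2000):
    """img: (28, 28) float tensor with values in [0, 1]."""
    h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(0, 1, h), torch.linspace(0, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)   # (784, 2) pixel coordinates
    targets = img.reshape(-1, 1)                            # (784, 1) brightness values

    mlp = nn.Sequential(
        nn.Linear(2, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, 1), nn.Sigmoid(),                     # brightness in [0, 1]
    )
    opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)

    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(mlp(coords), targets)
        loss.backward()
        opt.step()
    return mlp
```

A fun sanity check is sampling the trained MLP on a finer grid than 28x28, since nothing ties it to the original resolution.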
I've done most of these projects (still need to do diffusion, and my VQ-VAE implementation doesn't work), and each takes about two days to grok the paper, translate it to code, and get it running on MNIST (~6 hours of coding?) using PyTorch and its documentation, plus reading the relevant papers. Very educational!
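To put the "~6 hours of coding" in perspective, the simplest item on the list, the supervised MLP classifier, is roughly this much PyTorch. This is only a sketch: hyperparameters are guesses and it assumes torchvision is available for the data loading:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Sketch of the "supervised MLP classifier" project: two linear layers
# trained with cross-entropy on MNIST.
train_ds = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
test_ds = datasets.MNIST("data", train=False, download=True, transform=transforms.ToTensor())
train_dl = DataLoader(train_ds, batch_size=128, shuffle=True)
test_dl = DataLoader(test_ds, batch_size=512)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(3):
    for x, y in train_dl:
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()

with torch.no_grad():
    correct = sum((model(x).argmax(dim=1) == y).sum().item() for x, y in test_dl)
print(f"test accuracy: {correct / len(test_ds):.3f}")
```

The other projects on the list are mostly this same loop with a different model and loss, which is why they stay so quick to implement.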
LightGreenSquash OP t1_iwi9q1g wrote
Yep, that's roughly along the lines I'm thinking as well. The only possible drawback I can see is that on such small datasets even "basic" architectures like MLPs can do well enough that you might not be able to see the benefit that, say, a ResNet brings.
It's still very much a solid approach though, and I've used it in the past to deepen my knowledge of stuff I already knew, e.g. coding a very basic computational graph framework and then using it to train an MLP on MNIST. It was really cool to see my "hand-made" graph, with its topological sort and fprop/bprop methods written for the different functions, actually reach 90%+ accuracy.
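For anyone curious, the core of that kind of framework is surprisingly small: nodes, a topological sort, and one local backward rule per op. Here's a stripped-down, scalar-only sketch of the general idea (nothing like a full framework, just enough to show the graph + topological sort + backward pass):

```python
# Scalar autodiff sketch: each op records its parents and a local backward rule;
# backward() topologically sorts the graph and applies the chain rule in reverse.

class Node:
    def __init__(self, value, parents=(), backward_fn=lambda: None):
        self.value = value
        self.grad = 0.0
        self.parents = parents
        self._backward = backward_fn

    def __add__(self, other):
        out = Node(self.value + other.value, (self, other))
        def bwd():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = bwd
        return out

    def __mul__(self, other):
        out = Node(self.value * other.value, (self, other))
        def bwd():
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        out._backward = bwd
        return out

    def backward(self):
        # topological sort, then apply each node's local backward rule in reverse
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for p in node.parents:
                    visit(p)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

# d(x*y + x)/dx = y + 1 = 4 at (x, y) = (2, 3)
x, y = Node(2.0), Node(3.0)
z = x * y + x
z.backward()
print(x.grad)  # 4.0
```

Doing this with tensor-valued nodes instead of scalars is basically the same structure, just with matrix fprop/bprop rules per op, and it's enough to train an MLP on MNIST.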