Zestyclose-Debt-4712 t1_iwu1y9w wrote

I am not sure if „Deep Learning“ would be very interesting for you. While it is a great scholarly introduction to the field, it does not cover the mathematics that deeply. I would also suggest you check out „Understanding Machine Learning“. It’s afaik the most thorough mathematical introduction. If you are more specifically interested in neural networks, I recently started reading „Neural Network Learning: Theoretical Foundations“ by Anthony and Bartlett, and it looks promising as a mathematical introduction to the topic, although it is really old.

But tbh, the age doesn’t really matter too much in my opinion. Maybe I am wrong and I missed some important publications in recent years, but to me it just doesn’t seem like there has been much development on the theoretical side of ML. Especially in the field of Deep Learning … it is still an open question why networks generalize better than the generalization bounds offered by theory would suggest. And afaik, there just haven’t been too many changes to our theoretical understanding in the last 10-20 years. Of course, many interesting phenomena have been observed empirically. But to my knowledge, they haven’t had much impact on existing theory.

But I am working more on the applications side of things, so take everything I say with a grain of salt.

3

Nanoputian8128 OP t1_iwu49sc wrote

Thanks for the reply! It is interesting to hear that there hasn't been much development in the theory in the past few years. I have always been under the impression that there is a massive amount of new research being done in ML. But I guess that is more on the application side rather than the actual underlying theory.

1

Zestyclose-Debt-4712 t1_iwu67xm wrote

Well, there is a lot of research going on in every direction, although most of the breakthroughs are on the applied side. And as you can imagine, the current development of new applications, architectures, loss functions, optimizers, etc. is way too fast for rigorous theory to keep up with. So don’t worry if you read a ten-year-old book, because it should still give you a good idea of the foundations. After learning those, you can pick what is most interesting to you and read papers to see how it is applied to modern models and problems.

1