SimonJDPrince t1_j784yjf wrote
Reply to comment by SAbdusSamad in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
ViT is at the end of the transformers chapter. Perhaps I forgot to put it in the index?
SimonJDPrince t1_j72bw7l wrote
Explained in my forthcoming book:
https://udlbook.github.io/udlbook/
Should be a good place to start, and if it isn't then I'm really interested to know where you struggled so I can improve the explanation.
SimonJDPrince OP t1_j648umm wrote
Reply to comment by NeoKov in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Thanks! Definitely a mistake. If you send your real name to the e-mail address on the website, I'll add you to the acknowledgements in the book.
Let me know if you find any more.
SimonJDPrince OP t1_j648ce9 wrote
Reply to comment by NeoKov in [P] New textbook: Understanding Deep Learning by SimonJDPrince
GitHub or e-mail are better. Only occasionally on Reddit.
SimonJDPrince OP t1_j5yc4n2 wrote
Reply to comment by NeoKov in [P] New textbook: Understanding Deep Learning by SimonJDPrince
You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.
(That said, people sometimes do examine curves like this using validation data, so they can see when the best time to stop training is.)
The test loss goes back up because the model classifies some of the test examples wrong. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, this makes the true labels more likely and decreases the loss. For the cases in the test data that are classified wrong, it makes the true labels less likely, and so the loss starts to go back up.
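To make that concrete, here is a minimal sketch in Python (not from the book; the 0.9/0.99/0.999 probabilities are just illustrative) showing how the cross-entropy loss behaves as the model grows more confident:

    import numpy as np

    def cross_entropy(p_true_class):
        # negative log-likelihood of the true class
        return -np.log(p_true_class)

    for p in [0.9, 0.99, 0.999]:
        # correctly classified example: probability assigned to the true class grows,
        # so the loss keeps shrinking
        print(f"correct, p(true)={p:.3f}: loss = {cross_entropy(p):.4f}")
        # misclassified example: confidence in the wrong class grows, so the
        # probability of the true class shrinks and the loss grows
        print(f"wrong,   p(true)={1 - p:.3f}: loss = {cross_entropy(1 - p):.4f}")

The training loss is an average over terms like the "correct" rows, so it keeps falling; the test loss also contains "wrong" rows, and those terms eventually dominate and drag it back up.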
Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.
SimonJDPrince OP t1_j5taba8 wrote
Reply to comment by arsenyinfo in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Thanks. This is useful.
SimonJDPrince OP t1_j5q2nbo wrote
Reply to comment by AdFew4357 in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Agreed -- in some cases. It depends on the level of the student, whether they are studying in a class, etc. My goal was to write the first thing you should read about each area.
SimonJDPrince OP t1_j5pdwwr wrote
Reply to comment by bythenumbers10 in [P] New textbook: Understanding Deep Learning by SimonJDPrince
That's not a bad idea actually!
SimonJDPrince OP t1_j5olz4s wrote
Reply to comment by TheMachineTookShape in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Thanks! If you send your real name to the e-mail on the front page of the book, then I'll add you to the acknowledgements.
SimonJDPrince OP t1_j5olux3 wrote
Reply to comment by new_name_who_dis_ in [P] New textbook: Understanding Deep Learning by SimonJDPrince
That was kind of my impression. And I do discuss this in the chapters on transformers and regularization. Was wondering if there is more to it.
SimonJDPrince OP t1_j5ocrdo wrote
Reply to comment by [deleted] in [P] New textbook: Understanding Deep Learning by SimonJDPrince
I'd say that mine is more internally consistent -- all the notation is consistent across all equations and figures. I have made 275 new figures, whereas he has curated existing figures from papers. Mine is more in depth on the topics that it covers (only deep learning), but his has much greater breadth. His is more of a reference work, whereas mine is intended mainly for people learning this for the first time.
Full credit to Kevin Murphy -- writing a book is much more work than people think, and so completing that monster is quite an achievement.
Thanks for tip about Hacker News -- that's a good idea.
SimonJDPrince OP t1_j5ocgd0 wrote
Reply to comment by bacocololo in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Yes! Any tiny errors (even punctuation) are super useful! Couldn't find this though. Can you give me more info about which sentence?
SimonJDPrince OP t1_j5o0orv wrote
Reply to comment by arsenyinfo in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Can you give me an example of a review article or chapter in another book that covers roughly what you expect to see?
SimonJDPrince OP t1_j5o0n0t wrote
Reply to comment by _harias_ in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Yup -- some of it is a bit out of date now, but the stuff on probabilistic/graphical models is all still good and so is the geometry.
SimonJDPrince OP t1_j5o0jd9 wrote
Reply to comment by K_is_for_Karma in [P] New textbook: Understanding Deep Learning by SimonJDPrince
There are five chapters and around 100 pages. I think it would be a good start.
SimonJDPrince OP t1_j5o0gbn wrote
Reply to comment by aamir23 in [P] New textbook: Understanding Deep Learning by SimonJDPrince
Yeah -- I feel a bit bad about that, but as someone else pointed out, the title is not actually the same. I should put a link to that book on my website though, so anyone looking for it can find it.
SimonJDPrince OP t1_j5o0cj7 wrote
Reply to comment by NihonNoRyu in [P] New textbook: Understanding Deep Learning by SimonJDPrince
I'm planning to add extra material online for things like this where it's still unclear how important they are. If they get widely adopted, I'll incorporate them into the next edition.
SimonJDPrince OP t1_j5o09dv wrote
Reply to comment by like_a_tensor in [P] New textbook: Understanding Deep Learning by SimonJDPrince
I will release solutions to about half of them. Have to keep the rest back for professors. You can always message me if you want to know the solution to a particular problem.
SimonJDPrince OP t1_j5mfyr1 wrote
Reply to comment by Nhabls in [P] New textbook: Understanding Deep Learning by SimonJDPrince
I'll give people the choice in the end...
Submitted by SimonJDPrince t3_10jlq1q in MachineLearning
SimonJDPrince OP t1_ir04cnn wrote
Reply to comment by MuffinB0y in [P] New Book: Understanding Deep Learning by SimonJDPrince
Late 2023... it takes them a while to print it unfortunately.
SimonJDPrince OP t1_ir048nc wrote
Reply to comment by SeucheAchat9115 in [P] New Book: Understanding Deep Learning by SimonJDPrince
I have to keep some solutions back so that it can be used by instructors, but I'm going to make about half of them available and might add other problems to the website that aren't in the book and have answers. I haven't written out any of the answers yet, so it's possible that one or two of them aren't well-formulated. If you struggle with any of them, you can always email me.
Submitted by SimonJDPrince t3_xurvaq in MachineLearning
SimonJDPrince t1_j7htrs8 wrote
Reply to comment by 42gauge in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
Pretty much nothing to get through the first half. High school calculus and a basic grasp of probability. Should be accessible to almost everyone. Second half needs more knowledge of probability, but I'm filling out appendices with this info.