Submitted by Light991 t3_xvol6v in MachineLearning
fromnighttilldawn t1_ir549x1 wrote
Reply to comment by [deleted] in [Discussion] Best performing PhD students you know by Light991
But the Adam paper was wrong, so there's that. It is no better than cooking up an equation, which I guess is impressive, but if you know the right people then the overall contribution is very low. Adam was literally one or two steps away from whatever Hinton was doing, and Hinton was the co-author's (I forget his name) supervisor or something.
badabummbadabing t1_ir9bv9x wrote
The paper did have an error in its convergence proof (which was later rectified by other people). But
- this was only applicable to convex cost functions anyway (general convergence proofs in this sense are impossible for general nonconvex problems like neural net training; the setting is sketched below)
- Adam is by far the most widely used optimiser for neural network training; it would be crazy to deny its significance because of a technical error in a proof for a regime that is irrelevant to this application
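For context on what "convergence proof in this sense" means: the contested result in the Adam paper is a regret bound in the online convex optimization setting. The sketch below states it with constants omitted; it is the standard formulation, not a quote from the paper.

```latex
% Online convex optimization: at each step t the learner picks parameters
% \theta_t, then incurs a convex loss f_t(\theta_t). The Adam paper claims
\[
  R(T) \;=\; \sum_{t=1}^{T} \bigl( f_t(\theta_t) - f_t(\theta^{\ast}) \bigr)
  \;=\; O\!\left(\sqrt{T}\right),
  \qquad
  \theta^{\ast} \;=\; \arg\min_{\theta} \sum_{t=1}^{T} f_t(\theta),
\]
% so the average regret R(T)/T = O(1/\sqrt{T}) vanishes as T grows. The later
% correction (the AMSGrad paper) fixes a step in this convex-case argument;
% it says nothing about nonconvex objectives such as neural network training.
```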
Regarding "whatever Hinton was doing": Are you talking about RMSprop? Sure, it's another momentum optimizer. There are many of them.