badabummbadabing t1_ir9bv9x wrote
Reply to comment by fromnighttilldawn in [Discussion] Best performing PhD students you know by Light991
The Adam paper did have an error in its convergence proof (which was later rectified by other people). But:
- the proof was only applicable to convex cost functions anyway (general convergence proofs in this sense are impossible for general nonconvex problems like neural net training)
- Adam is literally the most used optimiser for neural network training; it would be crazy to deny its significance because of a technical error in a proof, in a regime that's irrelevant for this application anyway (the update rule itself is sketched below)
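For reference, the update rule under discussion fits in a few lines. Here's a minimal NumPy sketch of one Adam step as given in Kingma & Ba's paper, with the AMSGrad correction (Reddi et al., 2018; presumably the rectification mentioned above) noted in a comment. Function and variable names are my own:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step; returns updated parameters and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad       # 1st moment: momentum on the gradient
    v = beta2 * v + (1 - beta2) * grad**2    # 2nd moment: per-coordinate scaling
    m_hat = m / (1 - beta1**t)               # bias correction (moments start at zero)
    v_hat = v / (1 - beta2**t)
    # AMSGrad fix (Reddi et al., 2018): keep a running max instead, i.e.
    # v_hat = np.maximum(previous_v_hat, v_hat), which is what makes the
    # (convex-case) convergence proof go through.
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy usage: minimise the convex quadratic f(theta) = ||theta - target||^2
target = np.array([1.0, -2.0, 0.5])
theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 501):
    grad = 2 * (theta - target)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=1e-2)
print(theta)  # close to target
```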
Regarding "whatever Hinton was doing": Are you talking about RMSprop? Sure, it's another momentum optimizer. There are many of them.