gdahl t1_iqpf8j8 wrote

Adam is more likely to outperform steepest descent (full-batch GD) in the full-batch setting than it is to outperform SGD at batch size 1.
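If you want to poke at the full-batch half of this yourself, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and none of it comes from the comment: the tiny badly scaled regression dataset (`XS`, `YS`), the helper names, the learning rates, and the step count. It only shows what "Adam vs. steepest descent on the exact full-batch gradient" means mechanically; a toy like this is not evidence for or against the claim.

```python
import math

# Hypothetical toy data: two features on very different scales,
# with targets generated exactly by y = 2*x1 + 50*x2.
XS = [(1.0, 0.01), (2.0, -0.02), (3.0, 0.03), (4.0, -0.01)]
YS = [2.0 * x1 + 50.0 * x2 for x1, x2 in XS]


def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(w, xs, ys):
    """Half mean squared error over the whole dataset."""
    return 0.5 * sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def full_batch_grad(w, xs, ys):
    """Exact gradient of `loss`, computed from every example (no sampling noise)."""
    n = len(xs)
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / n
    return g


def run_gd(xs, ys, lr=0.1, steps=200):
    """Steepest descent: step along the negative full-batch gradient."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        g = full_batch_grad(w, xs, ys)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def run_adam(xs, ys, lr=0.01, steps=200, b1=0.9, b2=0.999, eps=1e-8):
    """Adam fed the same full-batch gradient, with bias-corrected moments."""
    d = len(xs[0])
    w, m, v = [0.0] * d, [0.0] * d, [0.0] * d
    for t in range(1, steps + 1):
        g = full_batch_grad(w, xs, ys)
        m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
        v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - b1 ** t) for mi in m]
        v_hat = [vi / (1 - b2 ** t) for vi in v]
        w = [wi - lr * mh / (math.sqrt(vh) + eps)
             for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w


if __name__ == "__main__":
    print("initial loss:", loss([0.0, 0.0], XS, YS))
    print("GD loss:     ", loss(run_gd(XS, YS), XS, YS))
    print("Adam loss:   ", loss(run_adam(XS, YS), XS, YS))
```

To look at the other half of the comparison you would replace `full_batch_grad` with the gradient of a single randomly drawn example (batch size 1) and tune each optimizer's learning rate separately, since a comparison at arbitrary learning rates says very little.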
