CPOOCPOS OP t1_ivp4z7j wrote
Reply to comment by jnez71 in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Thanks!! Wish you a good day!!
jnez71 t1_ivp6ril wrote
Oh, I should add that from a nonconvex optimization perspective, the volume-averaging could provide heuristic benefits akin to those of GD+momentum-type optimizers. (Edited my first comment to reflect this).
Try playing around with your idea in low dimensions on a classical computer to get a feel for it first. Might help you think of new ways to research it.
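A minimal sketch of what such a low-dimensional experiment could look like, in plain Python. Everything here is an assumption made for illustration and not something from the thread: the 1-D test function `f` (a quadratic bowl with sinusoidal ripples), the helper names `averaged_grad` and `descend`, the Monte Carlo sampling used to approximate the volume average, and the step size, radius, and starting point.

```python
import math
import random

RIPPLE_FREQ = 5.0  # hypothetical ripple frequency of the toy landscape


def f(x):
    # Toy nonconvex objective: quadratic bowl plus ripples (many local minima).
    return x**2 + 2.0 * math.sin(RIPPLE_FREQ * x)


def grad_f(x):
    # Exact derivative of f at a single point.
    return 2.0 * x + 2.0 * RIPPLE_FREQ * math.cos(RIPPLE_FREQ * x)


def averaged_grad(x, radius, rng, n_samples=64):
    # Monte Carlo estimate of the gradient averaged over [x - radius, x + radius].
    total = 0.0
    for _ in range(n_samples):
        total += grad_f(x + rng.uniform(-radius, radius))
    return total / n_samples


def descend(x0, grad_fn, lr=0.01, steps=500):
    # Plain gradient descent using whichever gradient estimate is passed in.
    x = float(x0)
    for _ in range(steps):
        x -= lr * grad_fn(x)
    return x


rng = random.Random(0)
# Assumption: window width (2 * radius) equals one ripple period, so the
# ripples cancel in expectation. This is a tuned choice, not a general rule.
radius = math.pi / RIPPLE_FREQ

x_point = descend(3.5, grad_f)
x_avg = descend(3.5, lambda x: averaged_grad(x, radius, rng))

print(f"point gradient:    x = {x_point:.3f}, f = {f(x_point):.3f}")
print(f"averaged gradient: x = {x_avg:.3f}, f = {f(x_avg):.3f}")
```

With these particular settings the point-gradient run should settle in a ripple-induced local minimum near its starting point, while the averaged run follows the underlying bowl toward the origin. The radius was picked so that this happens. Other radii leave some of the ripple in the averaged gradient, and the averaged run stops near the bottom of the bowl, not at the exact minimizer of `f`, so varying `radius`, `lr`, and the starting point is part of getting a feel for it.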