Difficult_Ferret2838 t1_ivrnegq wrote
Reply to comment by make3333 in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
That doesn't mean anything.
make3333 t1_ivroe1x wrote
Gradient descent takes the direction of the minimum at the step size according to the Taylor series of degree n at that point. In neural nets we use a first-degree approximation, as if the surface were a plane. In a lot of other optimization settings they use a second-order approximation to find the optimal direction.
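For concreteness, here is a minimal NumPy sketch of that first-order vs. second-order distinction on a toy quadratic (the matrix, vector, starting point, and step size are all made up for illustration):

```python
import numpy as np

# Toy objective: f(x) = 0.5 * x^T A x - b^T x (a simple quadratic).
# A, b, and the starting point x are hypothetical, chosen for illustration.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
x = np.array([0.0, 0.0])

grad = A @ x - b   # first-order information: the gradient
hess = A           # second-order information: the Hessian

# First-order (gradient descent) step: move along the negative gradient,
# implicitly treating f as a plane near x. The step size is a fixed guess.
lr = 0.1
x_gd = x - lr * grad

# Second-order (Newton) step: use the Hessian to account for curvature,
# jumping to the minimizer of the local quadratic approximation.
x_newton = x - np.linalg.solve(hess, grad)
```

For a quadratic, the Newton step lands on the exact minimizer in one move; the gradient step only travels a fixed distance downhill.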
Difficult_Ferret2838 t1_ivrom17 wrote
>Gradient descent takes the direction of the minimum at the step size according to the Taylor series of degree n at that point.
No. Gradient descent is first order by definition.
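The standard updates make the distinction concrete: gradient descent is $x_{k+1} = x_k - \alpha\,\nabla f(x_k)$, which uses only first-derivative information. Newton's method, $x_{k+1} = x_k - [\nabla^2 f(x_k)]^{-1}\nabla f(x_k)$, uses the Hessian and is a different algorithm, not a higher-degree version of gradient descent.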
>in a lot of other optimization settings they do second order approx to find the optimal direction
It still isn't an "optimal" direction.
kksnicoh t1_ivtla47 wrote
It is optimal in first order :)
Difficult_Ferret2838 t1_ivtprrn wrote
Exactly, that is a meaningless phrase.