make3333 t1_ivroe1x wrote
Reply to comment by Difficult_Ferret2838 in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Gradient descent takes the direction of the minimum at the step size according to the Taylor series of degree n at that point. In neural nets we do first degree, as if it were a plane. In a lot of other optimization settings they do a second-order approximation to find the optimal direction.
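A minimal sketch of the first-order vs. second-order distinction, assuming a made-up quadratic objective (the matrix A, vector b, and learning rate here are illustrative, not from the thread):

```python
import numpy as np

# Toy quadratic f(x) = 0.5 x^T A x - b^T x,
# so grad f(x) = A x - b and the Hessian is A (illustrative values).
A = np.array([[3.0, 0.2],
              [0.2, 1.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

x = np.zeros(2)
lr = 0.1

# First-order step (gradient descent): follow the negative gradient,
# the best direction under the degree-1 Taylor model.
x_gd = x - lr * grad(x)

# Second-order step (Newton): minimize the degree-2 Taylor model by
# solving H d = -grad; on an exact quadratic this lands on the minimum.
x_newton = x - np.linalg.solve(A, grad(x))

print("gradient descent step:", x_gd)
print("Newton step:          ", x_newton)
```

Note how the Newton step rescales the gradient by the inverse Hessian, so its direction generally differs from the raw gradient direction.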
Difficult_Ferret2838 t1_ivrom17 wrote
>Gradient descent takes the direction of the minimum at the step size according to the Taylor series of degree n at that point.
No. Gradient descent is first order by definition.
>In a lot of other optimization settings they do a second-order approximation to find the optimal direction.
It still isn't an "optimal" direction.
kksnicoh t1_ivtla47 wrote
It is optimal to first order :)
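One standard way to make "optimal to first order" precise, assuming the Euclidean norm: among unit-length directions d, steepest descent minimizes the degree-1 Taylor model:

```latex
\min_{\|d\|_2 = 1} \; f(x) + \nabla f(x)^\top d
\quad\Longrightarrow\quad
d^\star = -\frac{\nabla f(x)}{\|\nabla f(x)\|_2},
```

since by Cauchy–Schwarz, $\nabla f(x)^\top d \ge -\|\nabla f(x)\|_2$, with equality exactly at $d^\star$. Under a different norm (or a second-order model), the "optimal" direction changes, which is the point of contention above.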
Difficult_Ferret2838 t1_ivtprrn wrote
Exactly, that is a meaningless phrase.