IndecisivePhysicist t1_j20lxts wrote on December 28, 2022 at 8:10 PM

Reply to comment by derpderp3200 in [D] Has any research been done to counteract the fact that each training datapoint "pulls the model in a different direction", partly undoing learning until shared features emerge? by derpderp3200

Converge to what though? My whole point was that you don't want to converge to the actual global minimum on the test set, you want one of the many local minima and you want one that is flat.