derpderp3200 OP t1_j1vgi23 wrote
Reply to comment by ResponsibilityNo7189 in [D] Has any research been done to counteract the fact that each training datapoint "pulls the model in a different direction", partly undoing learning until shared features emerge? by derpderp3200
I assume this is the case early in training, but eventually the training process needs to "compress" information so that a given parameter handles more than one very specific case, at which point it's subject to this phenomenon again: any dog example will push "not dog" neurons toward inactivity, and will likewise push neurons that contribute to classifying other classes toward inactivity.
Sure, statistically you're still descending the slope toward a network that's good at each class, but that only holds when your classes (and thus the "pull effects") are balanced; it's not an intrinsic ability of the network to extract differentiating features.
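
For what it's worth, you can observe this interference directly. Here's a minimal sketch (not from the thread, everything is illustrative: the toy model, the made-up shapes, the class labels) that compares the gradient directions produced by batches from two different classes:

```python
# Measure the "pull in different directions" effect: compute the loss gradient
# for a batch of one class vs. a batch of another class, then check the cosine
# similarity of the two gradient vectors. All shapes/classes are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

def flat_grad(inputs, targets):
    """Return all parameter gradients for one batch, flattened into a vector."""
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

# Two synthetic batches standing in for "dog" (class 0) and "cat" (class 1).
x_dog, y_dog = torch.randn(16, 32), torch.zeros(16, dtype=torch.long)
x_cat, y_cat = torch.randn(16, 32), torch.ones(16, dtype=torch.long)

g_dog = flat_grad(x_dog, y_dog)
g_cat = flat_grad(x_cat, y_cat)

# Cosine similarity near -1 means the two classes' updates largely undo each
# other; near 0 they are mostly orthogonal and interfere less.
cos = torch.nn.functional.cosine_similarity(g_dog, g_cat, dim=0)
print(f"gradient cosine similarity: {cos.item():.3f}")
```

If the classes were imbalanced, the averaged gradient would be dominated by the majority class's "pull", which is the failure mode I'm describing.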