Oceanboi t1_iz2tn86 wrote on December 6, 2022 at 1:24 AM

Reply to comment by VirtualHat in [D] Determining the right time to quit training (CNN) by thanderrine

Is this natural error rate purely theoretical or is there some effort to quantify a ceiling?

If I’m understanding correctly, you’re saying some problems will always have a natural ceiling on accuracy, because the X data either doesn’t hold enough information to perfectly predict Y or simply isn’t related to Y in the first place?

VirtualHat t1_iz3li4d wrote on December 6, 2022 at 5:13 AM

https://en.wikipedia.org/wiki/Mutual_information
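
A rough sketch of how that ceiling could be quantified, assuming the joint distribution of X and Y is fully known (which it rarely is for real data). The `joint` table and the function names below are made up for illustration. The sketch computes the mutual information I(X; Y) in bits and the best accuracy any classifier could reach by always picking the most likely Y for each X.

```python
import math

# Hypothetical joint distribution p(x, y) for a binary feature X and binary label Y.
joint = {
    (0, 0): 0.40, (0, 1): 0.10,
    (1, 0): 0.15, (1, 1): 0.35,
}

def marginals(joint):
    """Return the marginal distributions p(x) and p(y) as dicts."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def mutual_information(joint):
    """I(X; Y) in bits: sum of p(x,y) * log2(p(x,y) / (p(x) p(y)))."""
    px, py = marginals(joint)
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

def bayes_accuracy(joint):
    """Best achievable accuracy: for each x, predict the y with the largest p(x, y)."""
    best = {}
    for (x, y), p in joint.items():
        best[x] = max(best.get(x, 0.0), p)
    return sum(best.values())

print(f"I(X; Y) = {mutual_information(joint):.3f} bits")
print(f"accuracy ceiling = {bayes_accuracy(joint):.2f}")  # 0.40 + 0.35 for this table
```

For this toy table no model can beat roughly 75% accuracy, no matter how long it trains. If X determined Y exactly, the ceiling would be 100%; if X and Y were independent, the mutual information would be zero and the ceiling would fall to the frequency of the most common class.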

eigenlaplace t1_iz7dq71 wrote on December 7, 2022 at 12:37 AM

there are problems where the target is not the ideal ground truth but a noisy label, because the rater who produced it is imperfect

so if you get 100% accuracy on the test set, you might just be predicting the wrong things, because another, more experienced rater might judge the ground truth to be different from what the first rater said

this kind of label noise is in fact present in most, if not all, real data, except for toy/procedural datasets where you actually create the input-output pairs deterministically
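
A minimal simulation of that point, under the simplifying assumption that the rater flips each binary label independently with a fixed probability (real rater errors are usually not this uniform). The function name and the `flip_prob` parameter are hypothetical. It scores a "perfect" model, one that always predicts the true label, against the rater's noisy labels.

```python
import random

def noisy_label_accuracy(n=100_000, flip_prob=0.1, seed=0):
    """Accuracy of a perfect predictor when scored against noisy rater labels."""
    rng = random.Random(seed)
    # The real (unobserved) ground truth.
    true_labels = [rng.randint(0, 1) for _ in range(n)]
    # Assumed noise model: the rater flips each label with probability flip_prob.
    rater_labels = [
        1 - y if rng.random() < flip_prob else y
        for y in true_labels
    ]
    # A perfect model predicts true_labels, but it is graded on rater_labels.
    correct = sum(t == r for t, r in zip(true_labels, rater_labels))
    return correct / n

print(noisy_label_accuracy(flip_prob=0.1))
```

With a 10% flip rate the perfect model scores close to 90% on the measured test set. A model that scores 100% against those same labels would have to be reproducing the rater's mistakes.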