Submitted by HFSeven t3_10a8a14 in MachineLearning
Hi! I am looking into the literature on determining the usefulness of samples/datasets used for training an ML model. Let's say a DNN was trained with datasets A, B, and C. After training, is there a way to quantify which of the partial training datasets contributed most to the useful learning by the model? A brute-force strategy would be to remove samples, retrain, and see how the model performs, but of course that will not be viable!
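For reference, the brute-force strategy described above can be sketched as a leave-one-dataset-out loop. This is only an illustration under stated assumptions: a small NumPy logistic regression stands in for the DNN, the three datasets are synthetic (with C given random labels on purpose), and all names (`fit_logreg`, `leave_one_dataset_out`, `make_data`) are hypothetical.

```python
import numpy as np

def fit_logreg(X, y, lr=0.1, steps=500):
    """Fit logistic regression by gradient descent (a cheap stand-in for the DNN)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient of log loss w.r.t. logits
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(model, X, y):
    w, b = model
    return float((((X @ w + b) > 0).astype(int) == y).mean())

def leave_one_dataset_out(datasets, X_val, y_val):
    """Map each dataset name to the validation accuracy lost when it is removed."""
    def train_on(names):
        X = np.vstack([datasets[n][0] for n in names])
        y = np.concatenate([datasets[n][1] for n in names])
        return fit_logreg(X, y)

    names = list(datasets)
    full = accuracy(train_on(names), X_val, y_val)
    return {
        n: full - accuracy(train_on([m for m in names if m != n]), X_val, y_val)
        for n in names
    }

# Synthetic data (assumption): labels follow a fixed linear rule, except in C.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -1.0])

def make_data(n, random_labels=False):
    X = rng.normal(size=(n, 2))
    y = rng.integers(0, 2, size=n) if random_labels else (X @ true_w > 0).astype(int)
    return X, y

datasets = {"A": make_data(200), "B": make_data(200), "C": make_data(200, random_labels=True)}
X_val, y_val = make_data(500)

# Positive score: removing the dataset hurt; negative: removing it helped.
scores = leave_one_dataset_out(datasets, X_val, y_val)
print(scores)
```

The cost is one full retraining per dataset (and one per sample if done at sample level), which is why this does not scale to a real DNN.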
HateRedditCantQuitit t1_j42ogtm wrote
Not exactly what you’re asking, but active learning has a lot to say on data point usefulness.