Submitted by Tiny-Mud6713 t3_yuxamo in MachineLearning
ItalianPizza91 t1_iwblluq wrote
If the training loss decreases and validation loss stays the same, this is usually a sign of overfitting. The usual steps I take to avoid this:
- use a dropout layer (see the sketch after this list)
- add data augmentations
- get more data
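A minimal Keras sketch of the dropout suggestion, assuming an image-classification setup like the OP's; the architecture, input size, and class count are placeholders, not the OP's actual model:

```python
import tensorflow as tf
from tensorflow.keras import layers

num_classes = 8  # placeholder: 8 classes are mentioned later in the thread

model = tf.keras.Sequential([
    layers.Input(shape=(224, 224, 3)),      # placeholder input size
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                    # randomly zeroes 50% of activations during training
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Note that dropout only fires during training; at evaluation time the layer is a pass-through, so it doesn't distort your validation numbers.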
Tiny-Mud6713 OP t1_iwc1vq9 wrote
Yeah, the problem is that this is a challenge and the data is limited; I've tried data augmentation but haven't had much luck.
However, I must ask: when using data augmentation, is it better to augment both the training and the validation sets, or just the training set? I've seen conflicting opinions online.
Nhabls t1_iwc3rap wrote
You don't augment validation data; you'd be corrupting your validation scores. You'd only augment it at the end, when/if you're training with all the data.
Speaking of which, look at your class representation percentages; accuracy might be completely misleading if you have 1 or 2 overwhelmingly represented classes.
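A quick way to check those percentages; `y_train` here is a hypothetical name for your integer label array, so adapt it to however your labels are stored:

```python
import numpy as np

def class_distribution(labels):
    """Print the sample count and share of each class."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    for c, n in zip(classes, counts):
        print(f"class {c}: {n} samples ({100 * n / len(labels):.1f}%)")

# class_distribution(y_train)  # y_train = your integer labels (hypothetical name)
```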
Tiny-Mud6713 OP t1_iwc919l wrote
7 classes are equally distributed (500 images each); only 1 has about 25% of the others' share (150-ish). It is a problem, but I'm not sure how to solve it, considering that it's a challenge and I can't add data, and augmentation will keep the imbalance since it augments everything equally.
Nhabls t1_iwcdek4 wrote
The data doesn't seem that imbalanced, certainly not enough to cause the issues you're having. And I don't know what you're using for augmentation, but you can definitely augment specific classes to address the imbalance (I don't like doing that personally). My next guess would be looking at how you're splitting the data for train/val, and/or freezing the vast majority of the pretrained model and maybe even training only the last layer or two that you add on top.
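A rough sketch of the freezing idea, assuming a Keras pretrained backbone (EfficientNetB0 is just an example choice, not necessarily what the OP is using):

```python
import tensorflow as tf
from tensorflow.keras import layers

# pretrained backbone with its own classifier head removed
base = tf.keras.applications.EfficientNetB0(include_top=False,
                                            weights="imagenet",
                                            input_shape=(224, 224, 3),
                                            pooling="avg")
base.trainable = False  # freeze all pretrained weights

num_classes = 8  # placeholder
model = tf.keras.Sequential([
    base,
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(num_classes, activation="softmax"),  # only these top layers get trained
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

For the split itself, something like sklearn's train_test_split with stratify=labels keeps the class proportions the same in train and validation.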
Regardless, it's something that's useful to know (it's very frequent in real-world datasets). Here's a link that goes over how to weigh classes for such cases; it's written with TensorFlow in mind, but it's the same concept regardless of framework.
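The idea boils down to passing a class_weight dict to model.fit so the rare class counts for more in the loss. A sketch using the counts mentioned above (7 classes of ~500 plus one of ~150); the dataset names in the commented call are hypothetical:

```python
import numpy as np

counts = np.array([500, 500, 500, 500, 500, 500, 500, 150])  # per-class sample counts

total, n_classes = counts.sum(), len(counts)
# "balanced" weighting: each class is weighted inversely to its frequency
class_weight = {i: total / (n_classes * c) for i, c in enumerate(counts)}

# model.fit(train_ds, validation_data=val_ds, epochs=20, class_weight=class_weight)
```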
GullibleBrick7669 t1_iwc3fyj wrote
From my understanding and the performance on a recent work of mine (similar problem), augmenting just the training data is beneficial for interpreting the validation accuracy, in the sense that the validation data then quite literally functions as test data with no alterations. So, when you plot the loss on training and validation, that should give you an understanding of how well the model will perform on the test data. For my problem, I augmented just the training data and left the validation and test data as is.
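One way to keep that separation explicit, assuming tf.data pipelines (train_ds and val_ds are hypothetical names): map the augmentation over the training set only and leave validation untouched.

```python
import tensorflow as tf

# augmentation layers applied only to the training pipeline
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

def make_train_pipeline(train_ds):
    # augmentation is mapped over the training set only
    return (train_ds
            .map(lambda x, y: (augment(x, training=True), y),
                 num_parallel_calls=tf.data.AUTOTUNE)
            .prefetch(tf.data.AUTOTUNE))

def make_val_pipeline(val_ds):
    # validation stays untouched so it behaves like held-out test data
    return val_ds.prefetch(tf.data.AUTOTUNE)
```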
Also, looking at your plots, it could be a sign of an unrepresentative validation data set. Ensure that there are enough data samples for each class; if you find that there are not, try performing the same augmentations that you do on the training data on the validation data as well to generate more samples.