newperson77777777 OP t1_j3z0ep3 wrote
Reply to comment by cosentiyes in [D] Is making a dataset publicly accessible necessary for acceptance at top-tier conferences in ML? by newperson77777777
Thanks for the suggestion!
newperson77777777 OP t1_j3z0dot wrote
Reply to comment by [deleted] in [D] Is making a dataset publicly accessible necessary for acceptance at top-tier conferences in ML? by newperson77777777
Thanks for the suggestion!
newperson77777777 t1_j1tbh2m wrote
Reply to comment by skn133229 in [D] Normalized images in UNET by skn133229
Weird, I haven't encountered that issue before. One note: if you do the feature-wise normalization I described earlier (mean/std computed over the whole training set), then for most individual images the per-image channel means will not be exactly 0, if that helps.
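As a rough illustration (a minimal numpy sketch; the array shapes and variable names are just placeholders, not your actual pipeline):

```python
import numpy as np

# Hypothetical stack of training images: (num_images, height, width, channels)
train_images = np.random.rand(100, 128, 128, 3).astype(np.float32)

# Feature-wise (channel-wise) statistics computed over the whole training set
channel_mean = train_images.mean(axis=(0, 1, 2))   # shape: (channels,)
channel_std = train_images.std(axis=(0, 1, 2))     # shape: (channels,)

normalized = (train_images - channel_mean) / channel_std

# The dataset-level channel means are ~0, but any individual image's
# channel means generally are not:
print(normalized.mean(axis=(0, 1, 2)))   # close to [0, 0, 0]
print(normalized[0].mean(axis=(0, 1)))   # usually not [0, 0, 0]
```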
newperson77777777 t1_j1r9oni wrote
Reply to comment by skn133229 in [D] Normalized images in UNET by skn133229
So based on my understanding, training fails when you try to normalize per image, right? When you don't normalize per-image, training is fine but you just get suboptimal validation performance - which, in this case, you are saying is because there may be more error during certain years, especially years that are not considered during training. Is that an accurate summary?
If this is the case, I would test two things independently (not at the same time): 1. standard normalization over the entire dataset and 2. standard normalization per year. I would also explore the data by year to see what the differences are and how you might adjust your modeling. Additionally, I would do a more thorough error analysis to understand patterns in the errors the model is making.
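To make the two options concrete, here is a minimal sketch, assuming you have a `years` array aligned with your image stack (all names and shapes here are placeholders):

```python
import numpy as np

# Hypothetical data: images stacked as (num_images, H, W, C), with the
# acquisition year of each image in a parallel array.
images = np.random.rand(200, 64, 64, 3).astype(np.float32)
years = np.random.choice([2018, 2019, 2020], size=len(images))

# 1. Standard normalization over the entire training dataset
global_mean = images.mean(axis=(0, 1, 2))
global_std = images.std(axis=(0, 1, 2))
norm_global = (images - global_mean) / global_std

# 2. Standard normalization per year
norm_per_year = np.empty_like(images)
for year in np.unique(years):
    mask = years == year
    year_mean = images[mask].mean(axis=(0, 1, 2))
    year_std = images[mask].std(axis=(0, 1, 2))
    norm_per_year[mask] = (images[mask] - year_mean) / year_std
```

One caveat with option 2: for years that never appear in training, you would still need to decide which statistics to apply at test time (e.g., that year's own statistics or the global ones).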
newperson77777777 t1_j1qv2cz wrote
Reply to [D] Normalized images in UNET by skn133229
You may also want to check your normalization process for bugs. Generally, if you're doing standard normalization for images, you do it over the training set rather than per image: you compute the mean and standard deviation over your training set and use those to normalize every image. However, you seem to suggest that there are differences in the images depending on the year and other qualities, so you may want to standard-normalize by year or by something else.
I'm assuming this is a segmentation problem? How do you know the model is memorizing the input range and not learning the spatial patterns? If there is a memorization issue, the model may be overfitting, so you may want to consider data augmentation, more data, or simpler models.
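If overfitting does turn out to be the issue, one very simple form of augmentation for segmentation is to apply the same random flips/rotations to the image and its mask. A minimal numpy-only sketch (in practice an augmentation library would handle this; the function name is just illustrative):

```python
import numpy as np

def augment_pair(image, mask, rng=np.random):
    """Apply the same random flips and 90-degree rotation to an image and its mask."""
    if rng.rand() < 0.5:                      # random horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.rand() < 0.5:                      # random vertical flip
        image, mask = image[::-1, :], mask[::-1, :]
    k = rng.randint(4)                        # random 0/90/180/270 rotation
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    return image.copy(), mask.copy()
```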
newperson77777777 t1_itcycr1 wrote
I feel like, in practice, in data science you can rarely justify things theoretically and mostly rely on best practices to be confident in your conclusions.
newperson77777777 OP t1_j437zvq wrote
Reply to comment by chatterbox272 in [D] Is making a dataset publicly accessible necessary for acceptance at top-tier conferences in ML? by newperson77777777
Thanks for your perspective.