Submitted by skn133229 t3_zvraje in MachineLearning
I am working on a U-Net model that takes 64x64 Landsat imagery as input and outputs various classes of agricultural features. Training works okay when I scale the surface reflectance (SR) values to 0-1 (i.e. divide raw SR by 2^16 = 65536). What I've noticed is that the model seems to be memorizing the range of values in each image rather than learning the shapes and spatial patterns. The result is that predictions vary a bit too much from year to year, and years not appearing in the training dataset get suboptimal predictions. Batch normalization does not seem to change anything: the model converges faster, but the problem remains.
What I've tried is normalizing each image individually by subtracting each channel's mean and dividing by its standard deviation. This preserves the relative spatial patterns and shapes but brings every image to a mean of 0 and standard deviation of 1. Feeding these normalized images to the model does not work: I get precision and recall of 0, and pretty much all predictions are 0. Is there a reason why this would happen? Am I missing something about the way U-Net works? Any insight would be appreciated.
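For clarity, the per-image normalization described above can be sketched like this (a minimal NumPy version, assuming images are float arrays of shape (H, W, C); the `eps` guard is an addition to avoid dividing by zero on flat channels):

```python
import numpy as np

def standardize_per_image(img, eps=1e-8):
    # Per-channel mean and std computed over this single image only.
    mean = img.mean(axis=(0, 1), keepdims=True)
    std = img.std(axis=(0, 1), keepdims=True)
    # Every channel of every image ends up with mean ~0 and std ~1.
    return (img - mean) / (std + eps)

img = np.random.rand(64, 64, 6).astype(np.float32)
out = standardize_per_image(img)
```

Note that this discards the absolute reflectance level of each image, which is exactly why identical spatial patterns at different brightness levels become indistinguishable.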
Update: This may be useful to others. I was able to resolve the problem. Apparently, having each image channel average to 0 does not allow the model to train properly and converge. What I've done is randomly shift the mean of each image and add some random jitter. The model can then train properly. Thanks everyone for your insights. Next, I will evaluate the new model for accuracy of year-over-year predictions.
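A rough sketch of that fix, assuming it is applied after per-image standardization; the shift and jitter scales here are illustrative guesses, not the values actually used:

```python
import numpy as np

def shift_and_jitter(img, rng, shift_scale=0.5, jitter_scale=0.05):
    # Random per-channel mean shift so channels no longer average to 0.
    shift = rng.normal(0.0, shift_scale, size=(1, 1, img.shape[-1]))
    # Small random pixel-level noise on top.
    jitter = rng.normal(0.0, jitter_scale, size=img.shape)
    return img + shift + jitter

rng = np.random.default_rng(0)
img = np.random.rand(64, 64, 6)
aug = shift_and_jitter(img, rng)
```

Because the shift is resampled every time the image is seen, the absolute level carries no stable signal, which should push the model toward the spatial patterns instead.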
newperson77777777 t1_j1qv2cz wrote
You may want to check for bugs in your normalization process as well. Also, standard normalization for images is generally done over the training set rather than per image: you compute the mean and standard deviation over your training set and use those to normalize every image. However, you seem to suggest there are differences between images depending on the year and other qualities, so you may want to standard-normalize by year or by some other grouping.
I'm assuming this is a segmentation problem? How do you know the model is memorizing the input range rather than learning the spatial patterns? If there's a memorization issue, the model may be overfitting, so you may want to consider data augmentation, more data, or simpler models.
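For segmentation, any geometric augmentation has to be applied identically to the image and its label mask so they stay aligned. A minimal sketch with NumPy (function and parameter names are illustrative, not from the thread):

```python
import numpy as np

def augment_pair(img, mask, rng):
    # Same random 90-degree rotation for image (H, W, C) and mask (H, W).
    k = int(rng.integers(0, 4))
    img, mask = np.rot90(img, k), np.rot90(mask, k)
    # Same random horizontal flip for both.
    if rng.random() < 0.5:
        img, mask = np.fliplr(img), np.fliplr(mask)
    return img.copy(), mask.copy()

rng = np.random.default_rng(42)
img = np.random.rand(64, 64, 6)
mask = np.random.randint(0, 5, size=(64, 64))
aug_img, aug_mask = augment_pair(img, mask, rng)
```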