QuadmasterXLII t1_ja7yo0f wrote
Reply to comment by QuadmasterXLII in [D] Training a UNet-like architecture for semantic segmentation with 200 outcome classes. by Scared_Employer6992
Your problem is the U-Net backbone, not the loss function. Assuming that you're married to a batch size of 4, the final convolution to get to 4 x 200 x 500 x 500, the cross-entropy, and the backpropagation should only take maybe 10 GB, so cram your architecture into the remaining 30 GB.
for example, takes 7.5 GB.
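
As a rough sanity check on the 10 GB budget for the output head, here is a back-of-envelope sketch in plain Python. It assumes float32 tensors, and the number of logit-shaped tensors kept alive (`ASSUMED_LIVE_COPIES`) is a made-up placeholder rather than a measured value, so treat the result as an order-of-magnitude estimate and profile your own model for real numbers.

```python
# Back-of-envelope memory estimate for the 4 x 200 x 500 x 500 output.
# Assumptions (not from the original comment): float32 storage, and three
# logit-shaped tensors alive at once (e.g. logits, log-softmax, gradient).
BATCH, CLASSES, HEIGHT, WIDTH = 4, 200, 500, 500
BYTES_PER_FLOAT32 = 4
ASSUMED_LIVE_COPIES = 3  # hypothetical count; profile to get the real one


def logits_bytes(batch, classes, height, width, bytes_per_elem=BYTES_PER_FLOAT32):
    """Bytes needed to store one tensor of shape (batch, classes, height, width)."""
    return batch * classes * height * width * bytes_per_elem


one_copy = logits_bytes(BATCH, CLASSES, HEIGHT, WIDTH)
total = one_copy * ASSUMED_LIVE_COPIES

print(f"one logits-shaped tensor: {one_copy / 1e9:.2f} GB")
print(f"{ASSUMED_LIVE_COPIES} live copies: {total / 1e9:.2f} GB")
```

Each logits-shaped tensor is 0.8 GB at float32, so even with several copies alive during the backward pass the head stays under the 10 GB figure, which is why the backbone is the place to cut.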