suflaj t1_jc6n8v1 wrote on March 14, 2023 at 1:03 PM

Just apply an aggregation function along the 0th axis. This can be sum, mean, min, max, whatever. I'd go with sum, since your loss function will naturally regularise the weights to be smaller and it's the easiest to differentiate. That holds if you know you will always have 18 images; if the number of images can vary, use mean, so the scale of the output doesn't depend on how many images you feed in. Min and max are only piecewise differentiable: the gradient flows to the single selected element at each position and is zero for all the others, which might give you problems.
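To make the "aggregate over the 0th axis" idea concrete, here is a minimal sketch in plain Python. Lists of floats stand in for per-image feature tensors, and the `aggregate` helper and the toy numbers are made up for illustration. In a tensor library this would normally be a single reduction call (for example `x.sum(dim=0)` or `x.mean(dim=0)` in PyTorch, assuming that's what you're using).

```python
def aggregate(features, how="sum"):
    """Reduce a list of per-image feature vectors along axis 0.

    `features` is a list of N equal-length lists (one per image).
    The result is a single list with one value per feature dimension.
    """
    if not features:
        raise ValueError("need at least one feature vector")
    # Transpose so each tuple holds one feature dimension across all images.
    columns = list(zip(*features))
    if how == "sum":
        return [sum(col) for col in columns]
    if how == "mean":
        return [sum(col) / len(col) for col in columns]
    if how == "max":
        return [max(col) for col in columns]
    if how == "min":
        return [min(col) for col in columns]
    raise ValueError(f"unknown aggregation: {how!r}")


# Three hypothetical images, two features each.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]

summed = aggregate(feats, "sum")
averaged = aggregate(feats, "mean")

# Doubling the number of images doubles the sum but leaves the mean alone,
# which is why mean is the safer choice when the image count varies.
summed_double = aggregate(feats + feats, "sum")
averaged_double = aggregate(feats + feats, "mean")
```

Note how the sum scales with the number of inputs while the mean does not; with a fixed 18 images that scaling is just a constant factor.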

If you use sum, make sure you apply gradient clipping so the gradients don't explode early in training.
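For reference, a rough sketch of what clipping by global norm does, again in plain Python with a flat list of floats standing in for the gradients. The `clip_by_global_norm` helper and the `max_norm` value are illustrative assumptions, not a recommended setting. If you're on PyTorch, `torch.nn.utils.clip_grad_norm_` does this for you on the model parameters.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale `grads` so that their L2 norm is at most `max_norm`.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already small enough (this also covers the all-zero case).
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# Hypothetical gradient with L2 norm 5, clipped down to norm 1.
grads = [3.0, 4.0]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
```

The direction of the gradient is preserved; only its length is capped, so an oversized early update from the summed features can't blow up the weights.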