Submitted by Ananth_A_007 t3_zgpmtn in MachineLearning
IntelArtiGen t1_izqc26r wrote
Reply to comment by Ananth_A_007 in [D] When to use 1x1 convolution by Ananth_A_007
The information you have before a layer is conditioned by how it goes into that layer. At first, what goes into the layer is noise; the weights then change according to the loss, so that the information entering the layer reduces the loss and becomes something meaningful.
So the question would be: is it better for information processing in the neural network to compare the 2x2 values and take the max, or to train the network so that it puts the correct information into one fixed position of the 2x2 window and always keeps that one?
I think the answer depends on the dataset, the model, and the training process.
And I think the point of that layer isn't necessarily to look at everything but just to shrink dimensions without losing too much information. Perhaps looking at everything isn't required to keep enough information.
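To make the comparison concrete, here's a minimal NumPy sketch (my own illustration, not from the thread) of the two downsampling options: 2x2 max pooling, which compares all four values in each window, versus a fixed strided pick, which always keeps the same position of each window and relies on training to route useful information there.

```python
import numpy as np

# Toy 4x4 feature map
x = np.arange(16, dtype=float).reshape(4, 4)

def max_pool_2x2(a):
    """2x2 max pooling: compare the 4 values in each window, keep the max."""
    h, w = a.shape
    return a.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def strided_pick_2x2(a, i=0, j=0):
    """Fixed-position downsampling: always keep position (i, j) of each
    2x2 window. The network would have to learn to put the correct
    information there, rather than the pooling op selecting it."""
    return a[i::2, j::2]

pooled = max_pool_2x2(x)        # [[ 5.,  7.], [13., 15.]]
picked = strided_pick_2x2(x)    # [[ 0.,  2.], [ 8., 10.]]
```

Both halve each spatial dimension; they differ only in whether the selection is data-dependent (max) or fixed (stride), which is exactly the trade-off in question.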