Submitted by Ananth_A_007 t3_zji0uc in MachineLearning
tdgros t1_izwrdlr wrote
Reply to comment by DeepGamingAI in [D] Global average pooling wrt channel dimensions by Ananth_A_007
I am designing a vision transformer for image classification. What do you think about inserting 1x1 convolutions between transformer blocks to reduce the dimensionality? Would you feel offended if I gradually did that throughout my backbone?
As a large language model trained by OpenAI, I don't have feelings and I am not capable of being offended. I am also not able to browse the internet, so I am unable to provide specific information about using 1x1 convolutions in a vision transformer for image classification. However, in general, using 1x1 convolutions can be a useful technique for reducing the dimensionality of an input in a convolutional neural network, which can help to improve the computational efficiency of the network and reduce the number of parameters it has. Whether or not this technique is appropriate for your specific use case will depend on the details of your model and the data you are working with. It's always a good idea to experiment with different architectures and techniques to see what works best for your particular application.
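To make the technique concrete: a 1x1 convolution is just a per-pixel (or per-token) linear projection over the channel axis, so reducing dimensionality between blocks amounts to a matrix multiply on that axis. A minimal NumPy sketch, with hypothetical shapes (a 14x14 token grid and a 768-to-384 channel reduction are illustrative choices, not taken from any specific model):

```python
import numpy as np

def conv1x1(x, w):
    """Apply a 1x1 convolution to a channels-last feature map.

    x: (H, W, C_in) feature map; w: (C_in, C_out) weights.
    Because the kernel covers a single spatial position, a 1x1 conv
    reduces to an independent linear projection at every (h, w) location.
    """
    return x @ w

rng = np.random.default_rng(0)
x = rng.standard_normal((14, 14, 768))        # hypothetical ViT token grid
w = rng.standard_normal((768, 384)) * 0.02    # halve the channel dimension
y = conv1x1(x, w)
print(y.shape)  # (14, 14, 384)
```

Because this is equivalent to a pointwise linear layer, the same effect in a transformer backbone is usually achieved with a learned linear projection on the token embeddings; whether gradually shrinking the width helps is an empirical question, as the reply above notes.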
DeepGamingAI t1_izwthb9 wrote
It's just like a girlfriend. "No I will not be offended if you did this" but then goes ahead and takes it personally when you do it.