jacobgorm
jacobgorm t1_izkpyw6 wrote
Reply to [D] When to use 1x1 convolution by Ananth_A_007
A 1x1 conv lets you connect a set of input activations to a set of outputs: at each pixel, every output channel is a weighted sum over all input channels. In MobileNet v1/v2 this is necessary because the 3x3 convs are depthwise, i.e. done separately for each channel with no cross-channel information flow, unlike a normal full 3x3 conv, where information flows freely across all channels.
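To make the cross-channel point concrete, here is a minimal pure-Python sketch (not MobileNet's actual implementation). The helper names `depthwise3x3`, `pointwise1x1`, and `changed_channels`, as well as the toy shapes and weights, are assumptions made up for illustration. It perturbs a single input channel and checks which output channels notice.

```python
# Illustrative sketch only: nested lists stand in for tensors shaped [channel][row][col].
# All helper names and the toy weights below are hypothetical.

def depthwise3x3(x, kernels):
    """Per-channel 3x3 conv with zero padding: output channel c reads only input channel c."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        for i in range(H):
            for j in range(W):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < H and 0 <= jj < W:
                            acc += kernels[c][di + 1][dj + 1] * x[c][ii][jj]
                out[c][i][j] = acc
    return out

def pointwise1x1(x, weights):
    """1x1 conv: each output channel is a weighted sum over ALL input channels at one pixel."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weights[o][c] * x[c][i][j] for c in range(C))
              for j in range(W)] for i in range(H)]
            for o in range(len(weights))]

def changed_channels(a, b):
    """Indices of channels that differ between two tensors."""
    return [c for c in range(len(a)) if a[c] != b[c]]

# Toy input: 3 channels of 4x4.
x = [[[float(c + i + j) for j in range(4)] for i in range(4)] for c in range(3)]
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]  # one box filter per channel
weights = [[1.0, 1.0, 1.0], [1.0, -1.0, 0.5]]                # 3 input -> 2 output channels

# Perturb a single value in channel 0 only.
x2 = [[row[:] for row in ch] for ch in x]
x2[0][1][1] += 10.0

dw_a, dw_b = depthwise3x3(x, kernels), depthwise3x3(x2, kernels)
pw_a, pw_b = pointwise1x1(dw_a, weights), pointwise1x1(dw_b, weights)

print(changed_channels(dw_a, dw_b))  # [0]     depthwise keeps the change inside channel 0
print(changed_channels(pw_a, pw_b))  # [0, 1]  the 1x1 spreads it to every output channel
```

The depthwise step leaves channels 1 and 2 untouched, so without the 1x1 the channels would never exchange information.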
In this way, you can view the depthwise 3x3 as a simple spatial gathering step whose main purpose is to grow the receptive field, and the 1x1 as the place where most of the work happens. It has been shown that you can leave out the 3x3 convolution ENTIRELY and do everything in the 1x1, as long as you gather the data in a way that grows the receptive field; see, e.g., https://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_Shift_A_Zero_CVPR_2018_paper.pdf .
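A rough sketch of that idea, loosely in the spirit of the linked Shift paper but not taken from it: each channel is moved by a fixed spatial offset (no multiplies, no parameters), and a 1x1 conv then mixes the shifted channels. The helper names `shift`, `mix1x1`, and `nonzero`, plus the offsets and weights, are hypothetical choices for illustration.

```python
# Illustrative sketch only: nested lists stand in for tensors shaped [channel][row][col].
# Helper names, offsets, and weights are hypothetical.

def shift(x, offsets):
    """Move channel c by offsets[c] = (dy, dx), zero-filling the border. No arithmetic involved."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c, (dy, dx) in enumerate(offsets):
        for i in range(H):
            for j in range(W):
                si, sj = i - dy, j - dx
                if 0 <= si < H and 0 <= sj < W:
                    out[c][i][j] = x[c][si][sj]
    return out

def mix1x1(x, weights):
    """1x1 conv: each output channel is a weighted sum over all input channels at one pixel."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weights[o][c] * x[c][i][j] for c in range(C))
              for j in range(W)] for i in range(H)]
            for o in range(len(weights))]

def nonzero(channel):
    """Sorted (row, col) positions with a nonzero value."""
    return sorted((i, j) for i, row in enumerate(channel)
                  for j, v in enumerate(row) if v != 0.0)

# 4 channels of 5x5, each with a single impulse at the centre pixel (2, 2).
x = [[[1.0 if (i, j) == (2, 2) else 0.0 for j in range(5)] for i in range(5)]
     for _ in range(4)]
offsets = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
weights = [[1.0, 1.0, 1.0, 1.0]]              # 4 input -> 1 output channel

plain = mix1x1(x, weights)[0]                    # 1x1 alone: stays at the centre pixel
shifted = mix1x1(shift(x, offsets), weights)[0]  # shift + 1x1: spreads to the neighbours

print(nonzero(plain))    # [(2, 2)]
print(nonzero(shifted))  # [(1, 2), (2, 1), (2, 3), (3, 2)]
```

On its own the 1x1 never looks beyond a single pixel; with the shift in front, the same 1x1 combines information from neighbouring pixels, which is the receptive-field growth the 3x3 would otherwise provide.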
However, the MobileNet approach just makes more sense in practice: if you are going to read the data anyway, you may as well compute on it and apply bias/BN and the activation while it is loaded in CPU or GPU registers.
jacobgorm t1_j3nigl3 wrote
Reply to comment by suflaj in [D] Why is Vulkan as a backend not used in ML over some offshoot GPU specification? by I_will_delete_myself
Being cross-platform and not tied to a single vendor's hardware would be a great plus. Vulkan Compute is for general-purpose compute, not graphics.