Submitted by shingekichan1996 t3_10ky2oh in MachineLearning
melgor89 t1_j5u766t wrote
There is a great paper analyzing the correlation between batch size and accuracy. They propose a loss function that lets SimCLR train with a batch size of 256 instead of 4k. So there is some research in this area. https://arxiv.org/abs/2110.06848
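A rough sketch of the idea, assuming the core change in that paper is dropping the positive pair from the denominator of the usual InfoNCE loss. The function names, the single-anchor form, and the default temperature are illustrative assumptions, not the paper's code.

```python
import math

def info_nce(pos_sim, neg_sims, temperature=0.5):
    """Standard InfoNCE for one anchor: the positive pair is in the denominator."""
    pos = math.exp(pos_sim / temperature)
    negs = sum(math.exp(s / temperature) for s in neg_sims)
    return -math.log(pos / (pos + negs))

def decoupled(pos_sim, neg_sims, temperature=0.5):
    """Decoupled variant (assumed form): the positive pair is removed from the denominator."""
    negs = sum(math.exp(s / temperature) for s in neg_sims)
    return -pos_sim / temperature + math.log(negs)
```

With the same similarities, `decoupled` is always lower than `info_nce`, because its denominator lacks the positive term. This toy version only shows the change to the loss; it says nothing about how accuracy behaves at different batch sizes.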
shingekichan1996 OP t1_j5wlavz wrote
I saw an implementation of that paper here: https://github.com/raminnakhli/Decoupled-Contrastive-Learning
I also saw that the same paper was rejected at NeurIPS'21 because its impact is similar to that of other methods like Barlow Twins, SimSiam, BYOL, etc.
However, at first glance, the re-implemented results do show it working well with small batch sizes.