melgor89 t1_j5u766t wrote on January 25, 2023 at 4:30 PM

There is a great paper that analyzes how batch size correlates with accuracy. The authors propose a loss function that lets SimCLR train with bs=256 instead of 4k. So, there is some research in this area. https://arxiv.org/abs/2110.06848
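
For anyone curious what the change looks like, here is a rough sketch for a single anchor, assuming (as I understand the paper) that the decoupled loss simply drops the positive pair from the InfoNCE denominator. The function name `contrastive_losses`, the plain-list vectors, and the default `tau` are made up for illustration; this is not the authors' code or the linked repo's API.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def contrastive_losses(anchor, positive, negatives, tau=0.1):
    """Return (info_nce, decoupled) losses for one anchor.

    Hypothetical helper for illustration only.
    Similarities are cosine similarities divided by the temperature tau.
    """
    a = normalize(anchor)
    pos = dot(a, normalize(positive)) / tau
    negs = [dot(a, normalize(n)) / tau for n in negatives]
    neg_sum = sum(math.exp(s) for s in negs)

    # Standard InfoNCE: the positive term also appears in the denominator.
    info_nce = -pos + math.log(math.exp(pos) + neg_sum)
    # Decoupled variant (assumption): denominator contains negatives only.
    decoupled = -pos + math.log(neg_sum)
    return info_nce, decoupled

info, dcl = contrastive_losses([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
print(info, dcl)
```

With few negatives (small batch), the positive term makes up a large share of the InfoNCE denominator, so the two values differ noticeably; with many negatives the gap shrinks.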

shingekichan1996 OP t1_j5wlavz wrote on January 26, 2023 at 1:42 AM

I saw an implementation of that paper here: https://github.com/raminnakhli/Decoupled-Contrastive-Learning

I also saw that the same paper was rejected at NeurIPS'21 because its impact is similar to that of other methods like Barlow Twins, SimSiam, BYOL, etc.

However, at first glance, the re-implemented results suggest it does indeed work well with small batch sizes.