MinotaurOnLucy t1_j6fpaoj wrote
Reply to comment by suflaj in Why did the original ResNet paper not use dropout? by V1bicycle
Don’t they have two different purposes? As I understand it: batchnorm is used to keep activations well-scaled through a deep network so that the nonlinearities don’t saturate and kill neurons as the activation distributions drift and flatten out, while dropout is only meant to regularize training so the network doesn’t overfit.
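To make the contrast concrete, here’s a minimal PyTorch sketch (mine, not from the ResNet paper or the parent comment; the block and parameter names are just illustrative) showing where each layer typically sits and what it does:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Illustrative conv block using both BatchNorm and Dropout."""
    def __init__(self, channels: int, p_drop: float = 0.0):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        # BatchNorm normalizes each channel's activations (zero mean, unit variance
        # per mini-batch), keeping them in a range where the ReLU doesn't saturate
        # or die as the network gets deeper.
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Dropout randomly zeroes activations during training only, acting purely
        # as a regularizer against overfitting; it is a no-op at eval time.
        self.drop = nn.Dropout2d(p=p_drop)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.drop(self.relu(self.bn(self.conv(x))))

x = torch.randn(8, 64, 32, 32)
block = ConvBlock(64, p_drop=0.1)
print(block(x).shape)  # torch.Size([8, 64, 32, 32])
```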