Submitted by kaphed t3_124pbq5 in MachineLearning

Looking at some old tables:

https://arxiv.org/pdf/1512.03385.pdf, Table 4

https://arxiv.org/pdf/1905.11946.pdf, Table 2

Why do the ResNet-152 results vary? E.g. the Top-1 error on the ImageNet validation set is 19.38% in the original, but 22.2% in the EfficientNet paper.

Normally I would assume this type of result would simply be copied from the previous publication.


Comments


suflaj t1_je1uvo8 wrote

They probably redid the experiments themselves. ResNets also saw some changes shortly after release, I believe, and the EfficientNet authors could have used different pretrained weights. AFAIK He et al. never released theirs.

Furthermore, the Wolfram and PyTorch pretrained weights are also at around 22% top-1 error, so that is probably the representative error rate. Since PyTorch provides weights that reach roughly 18% top-1 error with some small adjustments to the training procedure, it is possible the original authors got lucky with the hyperparameters, or employed some techniques they didn't describe in the paper.
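
For reference, a minimal sketch of the two ResNet-152 weight sets this refers to (assumes torchvision >= 0.13 and its multi-weight API; the accuracies in the comments are the ones documented by torchvision, not measured here):

```python
# Sketch: comparing torchvision's two ResNet-152 ImageNet weight sets.
# Assumes torchvision >= 0.13 (multi-weight API); downloads the weights on first use.
import torch
from torchvision.models import resnet152, ResNet152_Weights

# Documented top-1 accuracy ~78.3% (~21.7% error), original-style training recipe.
weights_v1 = ResNet152_Weights.IMAGENET1K_V1
# Documented top-1 accuracy ~82.3% (~17.7% error), improved training recipe.
weights_v2 = ResNet152_Weights.IMAGENET1K_V2

model = resnet152(weights=weights_v1).eval()

# Each weight set ships the eval preprocessing it was validated with;
# reusing it matters if you want to reproduce the reported numbers.
preprocess = weights_v1.transforms()

with torch.no_grad():
    fake_image = torch.randint(0, 256, (3, 500, 400), dtype=torch.uint8)  # stand-in image
    logits = model(preprocess(fake_image).unsqueeze(0))

print(logits.shape)  # torch.Size([1, 1000])
```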


U03B1Q t1_je4xekj wrote

https://dl.acm.org/doi/10.1145/3324884.3416545

There was an ASE paper (linked above) that found that even with identical hyperparameters and random seeds, trained networks varied by about 2% in accuracy due to non-determinism in the parallel computing stack. If the EfficientNet authors chose to retrain ResNet-152 instead of copying the old numbers, a discrepancy of this size is in line with that work.
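
A minimal sketch of how that non-determinism is typically constrained in PyTorch (these are general-purpose switches for a CUDA setup, not something the papers describe; even with all of them set, some ops have no deterministic implementation):

```python
# Sketch: seeding alone does not remove all run-to-run variance on GPUs;
# these extra switches ask PyTorch/cuDNN for deterministic kernels where available.
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 0) -> None:
    # Seed every RNG the training loop might touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds the CUDA RNGs in recent PyTorch

    # Disable cuDNN autotuning (it can pick different kernels per run)
    # and request deterministic cuDNN algorithms.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

    # Error out on ops that have no deterministic implementation.
    torch.use_deterministic_algorithms(True)

    # Needed by cuBLAS for deterministic matmuls; set before the first CUDA call.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

make_deterministic(42)
```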


MOSFETBJT t1_je3xbo7 wrote

Probably batch size differences?
