Submitted by kaphed t3_124pbq5 in MachineLearning
suflaj t1_je1uvo8 wrote
They probably redid the experiments themselves. Also, I believe the ResNet architecture saw some changes shortly after release, and they could have used different pretrained weights. AFAIK He et al. never released their weights.
Furthermore, the Wolfram and PyTorch pretrained weights also land at around 22% top-1 error, so that is probably the correct number. And since PyTorch already ships weights that reach roughly 18% top-1 error with some small adjustments to the training procedure, it is possible the authors got lucky with the hyperparameters, or employed some techniques they didn't describe in the paper.
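For reference, here is a minimal sketch of how the two torchvision weight sets can be loaded side by side. It assumes torchvision >= 0.13 (where the V1/V2 weight enums were introduced); the exact accuracy figures depend on the architecture and the torchvision release, so take the comments as approximate.

```python
# Rough sketch: original vs. improved-recipe ResNet-50 weights in torchvision.
# Assumes torchvision >= 0.13; accuracy numbers are approximate.
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Original-recipe weights (the ones sitting in the low-20s% top-1 error range).
model_v1 = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)

# Improved-recipe weights, trained with a tweaked procedure and reaching a
# noticeably lower top-1 error on ImageNet-1K.
model_v2 = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)

# Each weight enum carries the preprocessing pipeline it was evaluated with.
preprocess = ResNet50_Weights.IMAGENET1K_V2.transforms()

model_v2.eval()
with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image batch
    logits = model_v2(dummy)
print(logits.shape)  # torch.Size([1, 1000])
```

The point being: the same architecture can end up several points apart on top-1 error purely from the training recipe, so a paper reporting a better number than the stock weights isn't necessarily wrong, just underspecified.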
kaphed OP t1_je4slfg wrote
thanks!