TiredOldCrow t1_its89ot wrote

Since you're using different pre-trained VGG16 models as a starting point, you may just be demonstrating that the PyTorch torchvision model is more amenable to your combination of hyperparameters than the TensorFlow one.

Ideally for this kind of comparison you'd use the exact same pretrained model architecture+weights as a starting point. Maybe look for a set of weights that has been ported to both PyTorch and TensorFlow?
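If no ported checkpoint turns up, the weights can in principle be moved by hand, since the two frameworks store the same tensors in different layouts. A minimal sketch using NumPy only, assuming the usual conventions (Keras `Conv2D` kernels are `(H, W, in, out)`, PyTorch `Conv2d` weights are `(out, in, H, W)`; Keras `Dense` kernels are `(in, out)`, PyTorch `Linear` weights are `(out, in)`). The helper names are made up for illustration and the array is random stand-in data, not real VGG16 weights:

```python
import numpy as np

def keras_conv_to_torch(kernel):
    """Reorder a conv kernel from (H, W, in, out) to (out, in, H, W)."""
    return np.transpose(kernel, (3, 2, 0, 1))

def torch_conv_to_keras(weight):
    """Reorder a conv weight from (out, in, H, W) to (H, W, in, out)."""
    return np.transpose(weight, (2, 3, 1, 0))

def keras_dense_to_torch(kernel):
    """Reorder a dense kernel from (in, out) to (out, in)."""
    return kernel.T

# Stand-in for a first VGG-style conv layer: 3x3 kernel, 3 input channels, 64 filters.
rng = np.random.default_rng(0)
keras_kernel = rng.standard_normal((3, 3, 3, 64)).astype(np.float32)

torch_weight = keras_conv_to_torch(keras_kernel)
round_trip = torch_conv_to_keras(torch_weight)

print(torch_weight.shape)                        # layout after conversion
print(np.array_equal(round_trip, keras_kernel))  # round trip should be lossless
```

One caveat: the first fully connected layer after the flatten is not a plain transpose, because channels-last and channels-first feature maps flatten in a different order, so that layer needs its input axis reordered as well.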

41

seba07 t1_ittpya9 wrote

Otherwise, don't use a pre-trained network for this test. PyTorch's random initialization shouldn't be any better than TensorFlow's.
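To take initialization out of the picture entirely, one option is to draw the starting weights once with a fixed seed and load the same arrays into both models, since the frameworks' default initializers are not identical. A rough sketch with NumPy; the He-normal scheme, the layer names, and the shapes are assumptions picked for illustration, not what either framework does by default:

```python
import numpy as np

def he_normal(shape, fan_in, rng):
    """Draw weights from N(0, 2 / fan_in), a common choice for ReLU networks."""
    std = np.sqrt(2.0 / fan_in)
    return (rng.standard_normal(shape) * std).astype(np.float32)

def make_shared_init(seed=0):
    """Build one set of conv weights in (out, in, H, W) layout from a fixed seed."""
    rng = np.random.default_rng(seed)
    # Hypothetical layer stack resembling the first two VGG16 conv layers.
    conv_shapes = {"conv1": (64, 3, 3, 3), "conv2": (64, 64, 3, 3)}
    weights = {}
    for name, shape in conv_shapes.items():
        fan_in = shape[1] * shape[2] * shape[3]
        weights[name] = he_normal(shape, fan_in, rng)
    return weights

# Same seed gives the same arrays, so both frameworks can start from identical weights.
init_a = make_shared_init(seed=0)
init_b = make_shared_init(seed=0)
print(all(np.array_equal(init_a[k], init_b[k]) for k in init_a))
```

The arrays would still need to be converted to each framework's layout before loading, and batch order and augmentation would have to be matched too for the runs to be truly comparable.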

7

aleguida OP t1_itvgiu8 wrote

Thanks for the feedback. I tried retraining everything from scratch without downloading any pretrained weights. Here are the updated Colab links.

While PyTorch is learning something, TensorFlow is not learning anything. This is quite confusing, since I used tf.keras to minimize any possible error on my part. Next, I will try building the same network from scratch in both frameworks.

1