Submitted by eparlan t3_ybd43b in MachineLearning
BITE_AU_CHOCOLAT t1_ithud6t wrote
Reply to comment by Purple_noise_84 in [D] Building the Future of TensorFlow by eparlan
Eh... I'm currently training a model with 700M parameters (most of which are in the embeddings used as input, not so much the hidden layers themselves), and PyTorch pretty much required at least 50GB per GPU, while TensorFlow was happy to train on 3090s, which were way, wayyyy cheaper to rent than A6000s, even though PyTorch managed better GPU utilization. So I think I'm just gonna stick with TF/Keras and TFLite for now.
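To put "most of the parameters are in the embeddings" in perspective, here's roughly the shape of model I mean (sizes invented for illustration, not my actual code): a single 2.7M-row, 256-dim embedding table is already ~690M parameters on its own, and the dense layers on top barely register.

```python
# Illustrative only: an embedding-heavy model where the table dominates the
# parameter count (vocab size and dims are made up, not the real model).
import tensorflow as tf

vocab_size, embed_dim = 2_700_000, 256  # ~691M params in the embedding alone

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(128, activation="relu"),  # ~33K params: negligible
    tf.keras.layers.Dense(1),
])
model.build(input_shape=(None, 32))
model.summary()  # the Embedding layer accounts for essentially all parameters
```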
learn-deeply t1_itikofd wrote
PyTorch doesn't inherently use more or less memory than TensorFlow; there's a bug in your code. If it's easier to switch frameworks than to debug it, more power to you.
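For reference, the per-parameter state of a 700M-parameter fp32 model trained with Adam is nowhere near 50GB. Back-of-envelope (ignores activations, data pipeline, and allocator overhead):

```python
# Rough VRAM math for 700M fp32 parameters trained with Adam.
# Only weights, gradients and optimizer moments are counted here.
params = 700_000_000
bytes_fp32 = 4

weights      = params * bytes_fp32       # ~2.8 GB
gradients    = params * bytes_fp32       # ~2.8 GB
adam_moments = 2 * params * bytes_fp32   # first and second moments: ~5.6 GB

print(f"~{(weights + gradients + adam_moments) / 1e9:.1f} GB of parameter state")
# => ~11.2 GB, so a hard 50 GB-per-GPU requirement suggests something else
# (activations, batch size, or an autograd graph kept alive) is eating memory.
```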
BITE_AU_CHOCOLAT t1_itiofjz wrote
Well, I haven't "switched", since I've been using TensorFlow since the start of the project. I was just curious to see whether PyTorch could let me squeeze out more juice, and after spending a weekend trying to learn ~~assembly~~ PyTorch syntax, it turns out that yes, but actually no. So yeah, I'm perfectly content with using model.fit and calling it a day for the time being.
Oh, and I also forgot: PyTorch won't train with a distributed strategy in a Jupyter environment. KEK.
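To spell that out: with TF the multi-GPU setup is just a context manager that works from a notebook cell (toy sketch below, not my real model), whereas PyTorch's DDP normally wants a separate launcher like torchrun or an mp.spawn call, which doesn't play nicely with an interactive kernel (there are workarounds, but it's not model.fit-level convenient).

```python
# Toy sketch: tf.distribute.MirroredStrategy picks up all visible GPUs and
# runs fine inside a Jupyter cell. Model and data are placeholders.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print("replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(10_000, 64),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

x = np.random.randint(0, 10_000, size=(1024, 20))
y = np.random.rand(1024, 1).astype("float32")
model.fit(x, y, batch_size=128, epochs=1)
```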