nwatab

nwatab t1_iy75dep wrote

I was training 10GB dataset on AWS ec2 (AMI: Deep Learning AMI GPU TensorFlow 2.10.0 (Amazon Linux 2) 20221116). After half an epoch, ec2 is very slow due to lack of memory. Does anyone know why? I don't understand why "after about half an epoch (around less than 10 minutes)", it gets slow, instead of the beginning of training.

1