Submitted by sabeansauce t3_yfjfkh in deeplearning
Long_Two_6176 t1_iu5omuu wrote
This is called model parallelism. Think of it as having model.conv1 on gpu1 and model.conv2 on gpu2. This is actually not too hard to do: you just manually assign your model components to devices with statements like .to("cuda:0") and .to("cuda:1"). Start with this.
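A minimal sketch of that idea, assuming two GPUs; the model itself (a two-layer convnet) is a made-up placeholder, and the code falls back to CPU when two GPUs aren't present so it runs anywhere:

```python
import torch
import torch.nn as nn

# Fall back to CPU when two GPUs are not available (assumption for the sketch).
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 2 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")

class TwoStageNet(nn.Module):
    """First half of the network lives on dev0, second half on dev1."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(dev0)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1).to(dev1)

    def forward(self, x):
        x = torch.relu(self.conv1(x.to(dev0)))
        # Move the activations between devices by hand.
        x = torch.relu(self.conv2(x.to(dev1)))
        return x

model = TwoStageNet()
out = model(torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 16, 32, 32])
```

The only extra work compared to single-GPU code is the explicit `.to(devN)` hops on the activations between stages.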
A more advanced setup is model parallelism + data parallelism, where both GPUs also split the dataset between them to accelerate training. Typically this is not possible with simple model parallelism alone, but a library like fairseq can handle it for you.
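For the data-parallel half of that, PyTorch's built-in DistributedDataParallel is the usual tool. A minimal sketch, run as a single process on the CPU "gloo" backend so it executes anywhere; in real training you'd launch one process per GPU with torchrun, and the linear model and data here are placeholders:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process stand-in for a multi-rank launch (an assumption for the sketch).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(nn.Linear(10, 2))  # gradients are all-reduced across ranks
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Each rank trains on its own shard of the data
# (normally provided by a DistributedSampler).
x, y = torch.randn(4, 10), torch.randn(4, 2)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()

dist.destroy_process_group()
```

Combining this with the manual device placement above (pipeline-style model + data parallelism) is exactly the part that gets fiddly, which is why toolkits like fairseq wrap it up for you.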
the_hackelle t1_iu5qdub wrote
Also, because it's super user-friendly and easy to implement, have a look at PyTorch Lightning. It makes distributed training and such very easy.
sabeansauce OP t1_iu67f3z wrote
woah that is a cool project.
sabeansauce OP t1_iu676zn wrote
okay, I can see how I was thinking about it kind of wrong. Thanks for the reply