Submitted by Outrageous_Room_3167 t3_zpw2ew in deeplearning
Hey, we're new to infrastructure builds. We're a small start-up (https://www.axibo.com/) and curious about the biggest 3090 deep learning rigs people have built.
How do we scale past one machine? My guess is that it takes a very fast direct connection between the machines. Is this feasible with 3090s?
The cost of these has gone down dramatically, and on a per-unit basis they're almost as good as an A100.
VinnyVeritas t1_j0w0gvh wrote
I'm not following: you're running a start-up doing infrastructure builds and you have to ask for advice on Reddit to scale past one machine? That gives a terrible image of your startup. To the average person like me, it sounds like you don't know what you're doing.