Submitted by Dense_History_1786 t3_z18xz0 in MachineLearning
Hi, I would like to use Stable Diffusion as part of a side project. I currently have it deployed on a VM in Google Cloud, but it's not scalable. How can I deploy it so that it scales (similar to AWS Lambda, but with GPUs)?
thundergolfer t1_ix9zysc wrote
> How can I deploy it so that it's scalable?
There's no such general thing as "scalability" (AKA magic scaling sauce). You'll have to be a lot more specific about how your deployment is not handling changes in load parameters.
If I had to guess, I'd say the likely scaling issue is going from a single VM with a single GPU to *N* GPUs able to run inference in parallel. If that is your main scaling issue, modal.com can do serverless GPU training/inference against *N* GPUs almost trivially: twitter.com/charles_irl/status/1594732453809340416. (disclaimer: I work for Modal)
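For concreteness, here's a minimal sketch of that fan-out pattern using Modal's Python SDK. Treat it as illustrative rather than authoritative: the SDK surface changes between releases, and the GPU type, checkpoint name, and function names are my own placeholder choices, not anything from the thread.

```python
import modal

stub = modal.Stub("stable-diffusion-demo")

# Container image with the inference dependencies baked in.
image = modal.Image.debian_slim().pip_install(
    "diffusers", "transformers", "accelerate", "torch"
)


@stub.function(gpu="A10G", image=image)  # GPU type is an example choice
def generate(prompt: str) -> bytes:
    import io

    import torch
    from diffusers import StableDiffusionPipeline

    # NOTE: loading weights on every call is slow; in practice you'd cache
    # the pipeline (e.g. in a shared volume or a container init hook).
    # The checkpoint may also require a Hugging Face auth token.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # example checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")
    result = pipe(prompt).images[0]
    buf = io.BytesIO()
    result.save(buf, format="PNG")
    return buf.getvalue()


@stub.local_entrypoint()
def main():
    prompts = [f"a watercolor castle, variation {i}" for i in range(8)]
    # .map() fans the calls out across containers, each with its own GPU,
    # so throughput scales with concurrent GPUs instead of one fixed VM.
    for i, png in enumerate(generate.map(prompts)):
        with open(f"out_{i}.png", "wb") as f:
            f.write(png)
```

Running this with `modal run app.py` spins containers up per mapped input and back down when idle, which is the "Lambda but with GPUs" behavior the OP is after.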