theLastNenUser t1_j9uwhcd wrote
Reply to comment by Desticheq in [P] What are the latest "out of the box solutions" for deploying the very large LLMs as API endpoints? by johnhopiler
You will have to message them if you want to use the larger GPU boxes, and the autoscaling isn’t great for larger models. The customizability of the “handler.py” file is nice, though.
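For context, the “handler.py” pattern mentioned here resembles the custom-handler convention used by Hugging Face Inference Endpoints (an assumption — the comment doesn’t name the platform): you drop a file exposing an `EndpointHandler` class into the repo, and the platform calls it for each request. A minimal sketch with a stub in place of a real model:

```python
# Sketch of a custom "handler.py" in the style of Hugging Face Inference
# Endpoints' custom-handler convention (assumed platform -- not named in
# the thread). The platform imports this file, constructs EndpointHandler
# once, and invokes it per request.
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # A real handler would load model weights from `path` (e.g. with
        # transformers); a placeholder callable keeps the sketch runnable.
        self.model = lambda text: text.upper()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # `data["inputs"]` carries the request payload; the return value
        # must be JSON-serializable.
        inputs = data.get("inputs", "")
        return [{"generated_text": self.model(inputs)}]
```

Because all pre/post-processing lives in this one file, you can customize tokenization, batching, or output formatting without touching the surrounding serving infrastructure — which is the flexibility the comment is praising.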
Desticheq t1_j9xiv9l wrote
Well, in terms of "out-of-the-box," I'm not sure what else could be better. AWS, Azure, and Google basically provide bare compute, and you'd have to configure all the "Ops" stuff yourself: networking, security, load balancing, etc. That's manageable if you do it regularly, but for a "test-it-and-forget-it" project it might be too much overhead.