theLastNenUser t1_j9uwhcd wrote
Reply to comment by Desticheq in [P] What are the latest "out of the box solutions" for deploying the very large LLMs as API endpoints? by johnhopiler
You will have to message them if you want to use the larger GPU boxes, and the autoscaling isn’t great for larger models. The customizability of the “handler.py” file is nice, though.
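For context, the “handler.py” pattern mentioned here resembles the custom-handler convention used by Hugging Face Inference Endpoints (an assumption — the comment doesn’t name the platform): you drop a file exposing an `EndpointHandler` class into the repo, and the platform calls it for each request. A minimal sketch with a stub in place of a real model:

```python
# Sketch of a custom "handler.py" in the style of Hugging Face Inference
# Endpoints' custom-handler convention (assumed platform -- not named in
# the thread). The platform imports this file, constructs EndpointHandler
# once, and invokes it per request.
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # A real handler would load model weights from `path` (e.g. with
        # transformers); a placeholder callable keeps the sketch runnable.
        self.model = lambda text: text.upper()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # `data["inputs"]` carries the request payload; the return value
        # must be JSON-serializable.
        inputs = data.get("inputs", "")
        return [{"generated_text": self.model(inputs)}]
```

Because all pre/post-processing lives in this one file, you can customize tokenization, batching, or output formatting without touching the surrounding serving infrastructure — which is the flexibility the comment is praising.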
Desticheq t1_j9xiv9l wrote
Well, in terms of "out-of-the-box," I'm not sure what else could be better. AWS, Azure, and Google basically provide bare compute, and you'd have to configure all the "Ops" stuff yourself: networking, security, load balancing, etc. That's manageable if you do it regularly, but for a "test-it-and-forget-it" project it might be too much overhead.