Submitted by _underlines_ t3_zstequ in MachineLearning
gettheflyoffmycock t1_j1bzvsv wrote
I’ve had to deploy a lot of deep learning models, and there will not be a simple, slap-on deployment for something like this. Furthermore, it is not going to be cheaper. First of all, I’m not sure whether it requires a graphics card, but on AWS there is a one-hour billing minimum unless you use a more expensive contract. So when you make an API request, it’s going to charge you the full three-dollar minimum, or up to $20 depending on which instance you are using.
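Back-of-the-envelope, the billing math looks something like this; the hourly rate and the one-hour minimum below are illustrative assumptions, not AWS quotes:

```python
# Rough cost-per-request math under hourly GPU billing with a
# one-hour minimum. Both numbers below are assumptions, not AWS quotes.
HOURLY_RATE = 3.00        # assumed $/hour for a GPU instance
MIN_BILLED_HOURS = 1      # assumed one-hour billing minimum

def cost_per_request(requests_per_burst: int) -> float:
    """Cost per request if the instance is spun up for a burst of
    traffic and billed for the full one-hour minimum."""
    billed = HOURLY_RATE * MIN_BILLED_HOURS
    return billed / max(requests_per_burst, 1)

print(cost_per_request(1))    # 3.0   -> one request eats the whole minimum
print(cost_per_request(500))  # 0.006 -> only cheap with steady, batched traffic
```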
Furthermore, there’s the cold start time. If you shut it down when not in use, it takes at least 5 to 10 minutes for a model of this size to get up and running. The only way this is cost-effective is if it can run on CPU only, in which case it could fit on an extremely cheap or free AWS instance. But my guess is that models like this won’t run fast enough on CPU alone to make it worth it.
Can anyone chime in on whether state-of-the-art text generation models like this can run on CPU only?
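For anyone who wants to measure rather than guess, here’s a minimal sketch using Hugging Face transformers; gpt2-large is only a stand-in, since chat-scale models are orders of magnitude larger and proportionally slower on CPU:

```python
# Minimal CPU-only generation latency check with Hugging Face transformers.
# gpt2-large (~774M params) is a stand-in; chat-scale models are far larger.
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
model = AutoModelForCausalLM.from_pretrained("gpt2-large")  # loads on CPU by default

inputs = tokenizer("Deploying large language models is", return_tensors="pt")
start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=50, do_sample=False)
elapsed = time.perf_counter() - start

print(tokenizer.decode(output[0], skip_special_tokens=True))
print(f"{elapsed:.1f}s for 50 tokens on CPU")
```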
maxToTheJ t1_j1c75bp wrote
You are 100% right. However, people will do what they did with DALL-E: make a budget Mickey Mouse version and pretend it’s the exact same thing, without measuring any quantitative metrics between the original implementation and theirs.
gettheflyoffmycock t1_j1chv31 wrote
Yeah, it’s funny how many people have been advertising their new ChatGPT application on all the machine learning subreddits, which is amusing because ChatGPT doesn’t even have an API yet.
Kinda funny: AI is ending up like drop shipping, the art of advertising shitty AliExpress products as if they’re actually a better product, upcharging people 500 or 1,000%, and then just ordering the AliExpress product and having it mailed to their house. People are doing that with AI now. Just say it’s this or that, stand up a super lightweight model, or a thin wrapper around OpenAI’s Davinci API, behind a free AWS instance, and call it ChatGPT. Business models built on “if Davinci charges you four cents per API credit, just charge the user eight cents; what will they know?”
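The pattern being described is essentially a thin billing proxy. Here’s a sketch against the Completions API of that era (text-davinci-003, pre-1.0 openai-python client); the per-token price and markup are assumptions for illustration:

```python
# Sketch of the "wrapper" business model: proxy the OpenAI Completions
# API and bill the user double. Price and markup below are assumptions.
import openai

openai.api_key = "sk-..."            # your OpenAI key
COST_PER_1K_TOKENS = 0.02            # assumed provider price, $/1k tokens
MARKUP = 2.0                         # "charge the user eight cents"

def proxied_completion(prompt: str) -> tuple[str, float]:
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=256,
    )
    tokens = resp["usage"]["total_tokens"]
    user_price = tokens / 1000 * COST_PER_1K_TOKENS * MARKUP
    return resp["choices"][0]["text"], user_price
```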
mrcschwering t1_j1easrz wrote
I have only deployed a few models (smaller, BERT-like ones) and was able to fit some of them into a Lambda function (loading the weights from S3).
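A minimal sketch of that pattern; the bucket name, object keys, and task are placeholders. The model is pulled into /tmp on the first (cold) invocation and cached at module level so warm invocations skip the download:

```python
# Lambda handler that lazily downloads a small BERT-like model from S3
# into /tmp and caches it across warm invocations.
# Bucket name, object keys, and the task are placeholder assumptions.
import os
import boto3

MODEL_DIR = "/tmp/model"
_pipe = None  # module-level cache survives warm starts

def _load_pipeline():
    global _pipe
    if _pipe is None:
        os.makedirs(MODEL_DIR, exist_ok=True)
        s3 = boto3.client("s3")
        for name in ("config.json", "pytorch_model.bin", "vocab.txt"):
            s3.download_file("my-model-bucket", f"bert/{name}",
                             os.path.join(MODEL_DIR, name))
        from transformers import pipeline  # import late to keep cold start lean
        _pipe = pipeline("text-classification", model=MODEL_DIR)
    return _pipe

def handler(event, context):
    return _load_pipeline()(event["text"])
```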
Otherwise, if we don’t care about start-up time, a Lambda function that starts a spot instance does the job.
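And a sketch of that second pattern: a Lambda that launches a one-time spot instance on demand. The AMI ID and instance type are placeholder assumptions; the AMI is assumed to be pre-baked with the model weights and an inference server:

```python
# Lambda that requests a one-time spot instance to serve a big model.
# ImageId and InstanceType are placeholders, not real values.
import boto3

def handler(event, context):
    ec2 = boto3.client("ec2")
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder: AMI baked with the model
        InstanceType="g4dn.xlarge",        # assumed GPU spot type
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={
            "MarketType": "spot",
            "SpotOptions": {"SpotInstanceType": "one-time"},
        },
    )
    return {"instance_id": resp["Instances"][0]["InstanceId"]}
```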