Nhabls t1_j6xemzb wrote

GPT-3 didn't cost a billion to train

It does cost a LOT of money to run, which is why you're unlikely to "see better" in the short- and medium-term future, unless you're into paying hundreds to thousands of dollars per month for this functionality.

26

cthorrez t1_j6xg1ke wrote

Microsoft paid $1B to use GPT-3.

19

Nhabls t1_j6xhm7v wrote

I don't think the billion was for GPT alone; it was to build out an entire AI ecosystem within Azure, and a big chunk of it was handed out as Azure credits anyway.

29

bokonator t1_j6yecgt wrote

Microsoft recently paid $10B to get full access to the model, give OpenAI full access to Azure GPUs, and take a 49% ownership stake.

7

Nhabls t1_j6ymdmr wrote

The $10 billion deal reportedly gives Microsoft 75% of OpenAI's profits until a certain threshold; that's more than just any given model.

16

anananananana t1_j715cp1 wrote

Wow, OpenAI indeed. They couldn't have gone more against the original intention of democratizing AI if they tried.

2

DM-me-ur-tits-plz- t1_j73n2dw wrote

When they originally went closed source, they claimed it was because of the dangers that open-sourcing presented.

About a year later they dropped their non-profit status and sold out to Microsoft.

Love the company, but that's some crazy doublespeak there.

2

AristosTotalis t1_j6ye5hn wrote

yep. $1B in cash, but they have to use Azure as their exclusive cloud compute provider, which Microsoft probably sells to OAI at ~cost

I think it's safe to assume that 2/3 of that will go towards training & inference, and if you also assume Microsoft neither makes nor loses money selling the compute (and in fact they get to strengthen Azure as a cloud infra player), they really only paid ~$300M to invest in OAI at what seems like a great price in hindsight

6
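A back-of-envelope version of the arithmetic above. All figures are the commenter's assumptions (reported $1B investment, ~2/3 spent back on Azure compute), not confirmed deal terms:

```python
# Sketch of the commenter's assumed split; figures are not confirmed deal terms.
investment = 1_000_000_000           # reported Microsoft investment, $
spent_on_azure = investment * 2 / 3  # assumed share going to training & inference
effective_outlay = investment - spent_on_azure
print(f"effective outlay: ~${effective_outlay / 1e6:.0f}M")
```

That comes out to roughly $333M, in the same ballpark as the ~$300M figure in the comment.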

Nhabls t1_j6ymqwr wrote

Well, in that scenario OpenAI also got massive on-demand compute infrastructure at cost; that's a good deal both ways.

8

LetterRip t1_j6yj4z2 wrote

GPT-3 can be quantized to 4-bit with little loss, to run on two Nvidia 3090s/4090s (unpruned; pruned, perhaps one 3090/4090). At $2 a day for 8 hours of electricity, and 21 working days per month, that is $42 per month (plus the amortized cost of the cards and the computer to house them).

3
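The electricity arithmetic from the comment above, taken at face value ($2/day for 8 hours, 21 working days; card amortization excluded):

```python
# Monthly electricity cost as stated in the comment (amortization excluded).
dollars_per_day = 2            # $ for 8 hours of electricity, per the comment
working_days_per_month = 21
monthly_electricity = dollars_per_day * working_days_per_month
print(monthly_electricity)     # 42
```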

Nhabls t1_j6ymx0w wrote

I seriously doubt they have been able to do what you just described.

Not to mention a rented double-GPU setup, even the one you described, would run you a dozen or more dollars per day, not $2.

7
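A rough VRAM check supports the doubt: GPT-3 has ~175B parameters, and at 4 bits per weight the weights alone exceed the combined memory of two 24 GB cards, before counting activations or the KV cache. This is a lower-bound sketch, not a statement about any particular quantization scheme:

```python
# Lower-bound VRAM estimate: weights only, no activations or KV cache.
params = 175e9             # GPT-3 parameter count
bytes_per_param = 0.5      # 4-bit quantization
weights_gb = params * bytes_per_param / 1e9   # 87.5 GB of weights
two_3090s_gb = 2 * 24                         # 48 GB of combined VRAM
print(weights_gb > two_3090s_gb)              # True: weights alone don't fit
```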

cunth t1_j71ocf6 wrote

Not sure about the above claim, but you can now train a GPT-2 model in 38 hours for about $600 on rented hardware. Costs are certainly coming down.

1
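The $600-in-38-hours figure implies a rental rate of roughly $16/hour. A quick sketch, using the commenter's figures as-is:

```python
# Implied hourly hardware rate for the quoted GPT-2 training run.
total_cost = 600    # $, per the comment
hours = 38          # training time, per the comment
rate = total_cost / hours
print(f"${rate:.2f}/hour")   # ~$15.79/hour
```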

TheTerrasque t1_j6ybrk0 wrote

Well, you've got DeepMind's Chinchilla model, and Google's CALM approach that can increase the speed of inference by maybe 3x, in addition to other tricks..

1