Submitted by fintechSGNYC t3_1095os9 in MachineLearning
starstruckmon t1_j3wkcvm wrote
More important question is what does OpenAI bring to the table that can't be found elsewhere?
It doesn't cost $10B to train a language model of that scale. There's no network effect like with a search engine or social media. OpenAI doesn't have access to some exclusive pile of data (Microsoft has more proprietary data than OpenAI does). OpenAI doesn't have access to some exclusive cluster of compute (Microsoft does). There isn't much proprietary knowledge exclusive to OpenAI, and Microsoft wouldn't be training a language model for the first time either. So what is this? Just an expensive acquihire?
Hyper1on t1_j3wp270 wrote
Why were OpenAI the first to make a model as good as ChatGPT, then? It seems clear there is a significant talent and experience advantage here. I should also mention that no company other than OpenAI has the same quantity of data on human interactions with large language models, thanks to the past two and a half years of the OpenAI API.
starstruckmon t1_j3wsh74 wrote
>Why were OpenAI the first to make a model as good as ChatGPT then?
That's a good question. OpenAI is definitely more willing to allow the public access to these models than other companies. While OpenAI isn't as open as some would like, they have been better than others. OpenAI may have pioneered some things, but the problem is that those aren't proprietary: they have published enough for others to replicate them.
>It seems clear there is a significant talent and experience advantage in this.
If they can hold on to that talent. Not everyone there is going to stick around. For example, a lot of the GPT-3 team left to start Anthropic, which already has a competitor in beta.
>I should also mention that no company other than OpenAI has the same quantity of data on human interactions with large language models, thanks to the past 2 and a half years of the OpenAI API.
This is a good point. But is it really better than the query data Microsoft has through Bing, or Google through search? Maybe, but it still feels like little for $10B. Idk.
visarga t1_j419sn0 wrote
MS failed at search, abandoned the browser, and missed mobile; now they want a hit. It's about not fucking up again.
I don't think the GPT-3 model itself is a moat, someone will surpass it and make a free version soon enough. But the long term strategy is to become a preferred hosting provider. In a gold rush, sell shovels.
sockcman t1_j3wvzc9 wrote
Because the other big player (Google) didn't care enough / see the value. Google could snap their fingers and have ChatGPT if they wanted; they invented the transformer architecture that GPT uses.
bouncyprojector t1_j3xk059 wrote
Except that Google publishes their research in detail and OpenAI doesn't. It's not clear how OpenAI has modified the GPT architecture/training other than some vague statement about using human feedback. Small changes can make a big difference and we don't really know what they've done.
All-DayErrDay t1_j3xtdbw wrote
Completely agree and that’s the difference that matters the most. Can’t always buy the most important things like talent. And hiding your research gains means you could have a lot of insights no one else has.
FruityWelsh t1_j400amk wrote
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
Here is the model I keep seeing as the next step past ChatGPT.
42gether t1_j40fxyg wrote
> Why were OpenAI the first to make a model as good as ChatGPT then?
Here's a controversial take: luck
They didn't invent the wheel or faster-than-light travel. It was something that was going to happen sooner or later, and they were just the first to do it publicly. Meanwhile, Google fired a guy who mass-mailed people saying their own AI was sentient.
visarga t1_j41aj3a wrote
> meanwhile Google fired a guy that mass mailed people saying their own ai was sentient.
Never imagined it would turn out so badly for Google that they'd end up needing Lemoine's testimony.
kulchacop t1_j3wlnj3 wrote
Maybe they wanted to capitalise on the name. ChatGPT has become synonymous with conversational language models in non-tech circles, both in corporate and popular culture.
yaosio t1_j3wuvpd wrote
It's easier for Microsoft to invest in or buy another company than create their own stuff from scratch.
starstruckmon t1_j3wxccu wrote
True, and that's probably the reason. But still, they have an ML/AI division. Why not have them just train Megatron to convergence and leapfrog GPT-3? I'll never understand how these companies make decisions, honestly.
m98789 t1_j3x653d wrote
The three main AI innovation ingredients are talent, data, and compute. Microsoft has all three, but of them, world-class talent is the scarcest. Microsoft has amazing talent in MSR, but it is spread across multiple areas with different agendas. OpenAI's talent is probably near or on par with MSR's, but it has focus, experience, and a dream team dedicated to world-class generative AI. They will be collaborating with MSR researchers too, and leveraging the immense compute and data resources at Microsoft.
starstruckmon t1_j3x6dzj wrote
Fair enough.
erelim t1_j3x3pl1 wrote
Everyone is currently behind OpenAI, even Google, who likely consider this an existential risk. If you were Google/MS, would you rather buy the leader and their talent, or let a competitor buy them, betting you can build something from behind and overtake the leader? The latter is possible, but riskier than the former.
starstruckmon t1_j3x4bsr wrote
How is Google behind OpenAI? Chinchilla has similar performance to GPT-3 yet is much cheaper to run, since it has less than half the parameters.
visarga t1_j41avq0 wrote
Many smaller models give good results on classification and extractive tasks, but when they need to get creative they don't sound so great. I don't know if Chinchilla is as creative as the latest from OpenAI, but my gut feeling says it isn't.
starstruckmon t1_j41dgsk wrote
There's no way for us to tell for certain, but since Google has used it for creativity-oriented projects/papers like Dramatron, I don't think so. I feel the researchers would have said something instead of intentionally leading the whole world astray, now that everyone is following Chinchilla's scaling laws.
Chinchilla isn't just a smaller model. It's adequately trained, unlike GPT-3, which is severely undertrained, so similar (if not exceeding, as officially claimed) capabilities aren't unexpected.
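To make the "undertrained" claim concrete, here's a rough sketch using the often-quoted ~20 tokens-per-parameter approximation of the Chinchilla scaling result (the rule of thumb and the headline model sizes/token counts are from the published papers; treat the exact numbers as ballpark):

```python
def compute_optimal_tokens(params, tokens_per_param=20):
    """Approximate compute-optimal training tokens for a given model size,
    using the ~20 tokens/parameter rule of thumb from the Chinchilla paper."""
    return params * tokens_per_param

gpt3_params = 175e9         # GPT-3: 175B parameters...
gpt3_tokens = 300e9         # ...but trained on only ~300B tokens
chinchilla_params = 70e9    # Chinchilla: 70B parameters...
chinchilla_tokens = 1.4e12  # ...trained on ~1.4T tokens

# GPT-3 "should" have seen ~3.5T tokens to be compute-optimal at its size
print(f"GPT-3: optimal ~{compute_optimal_tokens(gpt3_params)/1e12:.1f}T tokens, "
      f"actual {gpt3_tokens/1e12:.1f}T -> undertrained")
print(f"Chinchilla: optimal ~{compute_optimal_tokens(chinchilla_params)/1e12:.1f}T tokens, "
      f"actual {chinchilla_tokens/1e12:.1f}T -> roughly on target")
```

By this rule GPT-3 saw less than a tenth of its compute-optimal token count, which is why a model with well under half the parameters can match it.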
m98789 t1_j3wtx3g wrote
I think you may be underestimating the compute cost. It’s about $6M of compute (A100 servers) to train a GPT-3 level model from scratch. So with a billion dollars, that’s about 166 models. Considering experimentation, scaling upgrades, etc., that money will go quickly. Additionally, the cost to host the model to perform inference at scale is also very expensive. So it may be the case that the $10B investment isn’t all cash, but maybe partially paid in Azure compute credits. Considering they are already running on Azure.
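The back-of-the-envelope arithmetic above, as a sketch (the $6M-per-run figure is this comment's estimate, not a confirmed cost; real spend varies with hardware pricing and cloud deals):

```python
# How many GPT-3-scale training runs a $1B budget buys at ~$6M per run
cost_per_run = 6e6   # estimated compute cost of one from-scratch training run
budget = 1e9         # one billion dollars

full_runs = int(budget // cost_per_run)
print(full_runs)  # 166
```

Which is where the "about 166 models" figure comes from, before accounting for failed experiments, inference hosting, and scaling work.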
All-DayErrDay t1_j3xttzb wrote
$500k, actually (per MosaicML). That will likely drop to $100k soon, with H100s being several times faster. It would probably be even lower if you added every efficiency gain currently available.
m98789 t1_j3xxyvm wrote
You are right that the trend is for costs to go down. It was originally reported that it took $12M in compute for a single training run of GPT-3 (source).
H100s will make a significant difference, as will all the optimization techniques. So I agree prices will drop a lot, but for the foreseeable future they will still be out of reach for mere mortals.
starstruckmon t1_j3wvdqt wrote
>I think you may be underestimating the compute cost. It’s about $6M of compute (A100 servers) to train a GPT-3 level model from scratch. So with a billion dollars, that’s about 166 models.
I was actually overestimating the cost to train. I honestly don't see how these numbers don't further demonstrate my point. Even if it cost a whole billion (that's a lot of experimental models), that's still 10 times less than what they're paying.
>Considering experimentation, scaling upgrades, etc., that money will go quickly. Additionally, the cost to host the model to perform inference at scale is also very expensive. So it may be the case that the $10B investment isn’t all cash, but maybe partially paid in Azure compute credits. Considering they are already running on Azure.
I actually expect every last penny to go into the company. They definitely aren't buying anyone's shares (other than maybe a portion of employees' vested shares; that's not the bulk). It's mostly new shares being created. But $10B for ~50% still gives you a pre-money valuation of ~$10B. That's a lot.
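The implied valuation works out like this (the ~50% stake and $10B figures are assumptions from press reports, not confirmed deal terms):

```python
# Implied valuation if $10B buys a ~50% stake via newly issued shares
investment = 10e9  # reported investment
stake = 0.50       # assumed ownership fraction after the deal

post_money = investment / stake       # value of the whole company after the deal
pre_money = post_money - investment   # implied value before the new money

print(f"post-money: ${post_money/1e9:.0f}B, pre-money: ${pre_money/1e9:.0f}B")
# post-money: $20B, pre-money: $10B
```

So even on these assumptions, the deal prices pre-deal OpenAI at roughly $10B, about ten times a generous estimate of the cost to replicate its models.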
Non-jabroni_redditor t1_j3yr9cx wrote
Time. The answer to why they are spending 10x is time and risk.
They could spend the next however many years attempting to build a model like GPT, and it's entirely possible it just wouldn't be as good after all of that. The other option is to pay a premium, with money they have, for a known product.
slashd t1_j3x8sn1 wrote
>what does OpenAI bring to the table that can't be found elsewhere?
First to a winner-takes-all market?
Microsoft was third in the mobile market and eventually had to give it up. Now they're first in this new market.
starstruckmon t1_j3x9azk wrote
I'm not sure this really is a winner-takes-all market, but maybe. Good point.