manOnPavementWaving
manOnPavementWaving t1_j4fi6f0 wrote
Reply to comment by Nill444 in So wait, ChatGPT can code... But can't code itself? by Dan60093
Mediocre first-year IT students can do that. But there's no way it's writing an efficient flash attention kernel without having seen one before.
manOnPavementWaving t1_j42wiwx wrote
Reply to comment by suflaj in [D] Is there a distilled/smaller version of CLIP, or something similar? by alkibijad
Ehm, CLIP actually has a ResNet-50 version. It's still too big, though.
manOnPavementWaving t1_j3x2wca wrote
Reply to comment by [deleted] in [News] "Once $92 billion in profit plus $13 billion in initial investment are repaid (to Microsoft) and once the other venture investors earn $150 billion, all of the equity reverts back to OpenAI." by Gmroo
What? Amazon made 9 billion from Q4 2021 to Q3 2022.
manOnPavementWaving t1_j3cfzq2 wrote
Reply to comment by LesleyFair in [N] 7 Predictions From The State of AI Report For 2023 by LesleyFair
That is just voodoo accounting; all that money is from Google.
manOnPavementWaving t1_j3c15kd wrote
Isn't 6 just Google funding DeepMind every year? Or does that not count?
manOnPavementWaving t1_j2zbwee wrote
Reply to comment by GoldenRain in 2022 was the year AGI arrived (Just don't call it that) by sideways
I agree, but your arguments are so-so. It adapts on the fly through in-context learning, learning continuously is just an implementation detail, and it actually does have a decent understanding of cause and effect.
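A minimal sketch (purely illustrative, no real model involved) of what "adapting on the fly through in-context learning" means mechanically: the weights never change; the worked examples just get packed into the prompt itself.

```python
# Hypothetical few-shot prompt builder. The "adaptation" lives entirely in
# the prompt text the frozen model conditions on, not in any weight update.
def few_shot_prompt(examples, query):
    """Build a few-shot prompt from (input, output) pairs plus a new query."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

prompt = few_shot_prompt([("2+2", "4"), ("3+5", "8")], "7+6")
print(prompt)
```

Feed that string to any LLM and it will usually continue the pattern; swap the examples and the "learned" behavior swaps with them.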
manOnPavementWaving t1_j25doir wrote
Reply to comment by Foundation12a in For those of you who expect AI progress to slow next year: by Foundation12a
Building on YEARS of ideas. They were cool, but without transformers they wouldn't exist. Without infrastructure code, they wouldn't exist. Without years of hardware improvements, they wouldn't exist. Without the ideas of normalization and skip connections, they wouldn't exist. Etc. (And this isn't even counting all the blind alleys that were chased down only to find they didn't work; less visible, but it definitely contributes to research.)
Gato didn't even have that much to show for it; the long hoped-for skill transfer was not really there. DALL-E 2 builds on CLIP and diffusion; ChatGPT builds on GPT-3 and years of RL research.
You're saying something along the lines of "x is better than what came before, so the step to x is bigger than the sum of all the steps before that", and that is the worst take I've ever heard. It's definitely not how research works.
And goddamn it, I'm getting déjà vu cuz this bad take has been said before on this subreddit.
This rebuttal better? I'd be happy to go and list essential moments in AI in the past decade if it isn't.
manOnPavementWaving t1_j1q0sy5 wrote
Reply to Sam Altman Confirms GPT 4 release in 2023 by Neurogence
Not saying you're wrong, cuz I also don't see how GPT-4 won't be released in 2023, but most wrong predictions are a result of "reading between the lines" and reading into it what you wanna hear.
manOnPavementWaving t1_j1ahhh5 wrote
Reply to if search engines all become chatbots, how will u find new websites? is that secretly the entire point!? by petermobeter
Have you ever seen the current Google suggestions? Helpful, but they don't completely replace the entire results section? Yeah, it's gonna be like that. Partly because it will be necessary when the chatbot fails, and mostly because people won't trust the system otherwise.
manOnPavementWaving t1_j04bxs9 wrote
Reply to comment by ChronoPsyche in Is it just me or does it feel like GPT-4 will basically be game over for the existing world order? by Practical-Mix-4332
I agree that you can't extrapolate, but it's definitely not the case that GPT-4 has to have the same limitations as GPT-2 and GPT-3. Context window issues can be resolved in a myriad of ways (my current fav being this one), and retrieval-based methods could solve most of the factuality issues (and are very effective and cheap, as proven by RETRO).
So I want to re-emphasize that we have no clue how good it will be. It could very well smash previous barriers, but it could also be rather disappointing and very much like ChatGPT. We just don't know.
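For illustration, a toy sketch of what retrieval augmentation looks like. The ranking here is naive word overlap, nothing like RETRO's actual chunked nearest-neighbour lookup over token embeddings, and every name in it is made up:

```python
# Toy retrieval augmentation: instead of hoping the facts live in the
# model's weights, fetch relevant text at query time and prepend it.
def retrieve(query, corpus, k=1):
    """Rank corpus passages by naive word overlap with the query."""
    qwords = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda p: len(qwords & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def augmented_prompt(query, corpus):
    """Prepend the retrieved passages as context for a downstream model."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = ["PaLM was trained on TPU v4 pods.", "Bananas are yellow."]
print(augmented_prompt("What hardware trained PaLM?", corpus))
```

The point is the division of labor: the retriever supplies up-to-date facts cheaply, so the language model itself can stay smaller.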
manOnPavementWaving t1_izea9bo wrote
Reply to comment by blueSGL in ChatGPT will put us all out of jobs soon enough! by fayad-k
There are easy and cheap ways to give models extra memory
manOnPavementWaving t1_iyo3sqa wrote
Reply to comment by Imaginary_Ad307 in Have you updated your timelines following ChatGPT? by EntireContext
This doesn't hold for the networks currently in use, only if we want to more closely simulate human brains. There is no real indication yet that we can train these better or that they work better.
manOnPavementWaving t1_iw0c7h3 wrote
Reply to comment by TemetN in The CEO of OpenAI had dropped hints that GPT-4, due in a few months, is such an upgrade from GPT-3 that it may seem to have passed The Turing Test by Dr_Singularity
Most news around GPT-4 is. The "Cerebras partnership" was always just a mention of GPT-N by Cerebras to hype up their processor; OpenAI had no say in that. (Also not sure if .e6 is 100k-1M or 1M-10M.) The only leak I'm sure came from Sam was "the model won't be much bigger than GPT-3 and will be text-only", which I'd say is the only one to trust (although it could be outdated).
manOnPavementWaving t1_itz2ds5 wrote
Reply to comment by Qumeric in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
I agree, but you'll find yourself to be a stranger in this thread
manOnPavementWaving t1_ityolvz wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
They actually do invent tools, but that's not the important thing. What made humans intelligent is having a big brain, and having lots of time. If we were to put a newborn and a baby chimpanzee in a jungle and monitor them, they wouldn't seem all that different regarding intelligence.
Fine if you take that into your calculations, but it can't be attributed to just the bigger brain. Problem being, the 100 trillion parameter model won't have hundreds of thousands of years, and billions of copies of itself.
Cool reference, though! Interesting work
manOnPavementWaving t1_itwe8z0 wrote
Reply to Current state of quantum computers by ryusan8989
https://youtu.be/SORSZ9Je-8g you'll wanna watch this guy's updates each year
But in short, nothing extremely useful is gonna come of these computers in the coming 5-10 years.
manOnPavementWaving t1_itudq0y wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
What study shows the equivalence of neural network parameters and connections in the brain? What calculations did you do to get to "a billion times more intelligent"?
manOnPavementWaving t1_itt8bt1 wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
All I see is comparisons to humans that are by and large unfounded.
manOnPavementWaving t1_itt6vrn wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
That wasn't 1 year before the prediction of a hundred billion parameters, though. I'm not doubting that they'll come, I'm doubting the timeline.
Interested in why you think a 10 trillion parameter model would be human-level AGI.
manOnPavementWaving t1_itt6ptg wrote
Reply to comment by TopicRepulsive7936 in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
It's an implicit no, in the sense that scaling is already slowing.
manOnPavementWaving t1_itt06eo wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
With H100s, training time optimistically improves only by a factor of 9. Not nearly enough to bridge the ~200x gap between the current largest model and a 100 trillion parameter model, and that's in parameter scaling alone, ignoring data scaling. PaLM training took 1200 hours on 6144 TPU v4 chips, and an additional 336 hours on 3072 TPU v4 chips. A 100 trillion parameter model would literally be too big to train before the year 2023 comes to an end.
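A back-of-envelope version of that gap. This assumes (my framing, not stated above) Chinchilla-style scaling, where the optimal token count grows linearly with parameter count, so training compute grows roughly with parameters squared:

```python
# Rough sketch: how much compute separates today's largest dense model
# from a 100T-parameter one, and how far a 9x hardware speedup gets you.
largest_dense = 0.54e12   # ~540B params (PaLM), current largest dense model
target        = 100e12    # hypothetical 100 trillion parameter model
h100_speedup  = 9         # optimistic H100 training-time gain cited above

param_gap   = target / largest_dense      # parameter-count gap
compute_gap = param_gap ** 2              # compute ~ params * tokens, tokens ~ params
remaining   = compute_gap / h100_speedup  # gap left after the hardware speedup

print(f"~{param_gap:.0f}x params, ~{compute_gap:.0f}x compute, "
      f"~{remaining:.0f}x remaining even with H100s")
```

Even granting the full hardware gain, you're still thousands of times short on compute, which is the point about it being too big to train by end of 2023.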
manOnPavementWaving t1_itsz25o wrote
Reply to comment by porcenat_k in Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
Wowowow, you're seriously questioning DeepMind's scaling laws and going back to the OpenAI ones, which have been demonstrated to be false?
Chain of thought prompting, self consistency, reinforcement learning from human feedback, and data scaling, that's been driving LLM performance lately, noticeably more than scale has. (whilst being significantly cheaper).
Why do you expect such a jump when the industry has been stuck at half a trillion for the past year? All previous jumps were smaller and cost significantly less.
manOnPavementWaving t1_itsn0zt wrote
Reply to Where does the model accuracy increase due to increasing the model's parameters stop? Is AGI possible by just scaling models with the current transformer architecture? by elonmusk12345_
It's actually already stopping: the engineering challenges are getting too big (trends predict 5-10 trillion parameter dense models by now; bet your ass they don't exist), the available data is running short, and the other ways to increase performance are way too easy and way too cheap not to focus on.
manOnPavementWaving t1_itkax8m wrote
Reply to how old are you by TheHamsterSandwich
damn, no teens?
manOnPavementWaving t1_j4g3avm wrote
Reply to comment by Nill444 in So wait, ChatGPT can code... But can't code itself? by Dan60093
I am one