Comments
[deleted] t1_j56r963 wrote
[deleted]
challengethegods t1_j58jrmu wrote
People don't know that they don't know what they don't know intensifies.
yolkedbuddha t1_j5e0chy wrote
This is an important point.
[deleted] t1_j55l62l wrote
[deleted]
genshiryoku t1_j55pz1h wrote
Scaling up transformer models like GPT isn't going to result in AGI. Almost every AI expert, including the researchers working at OpenAI, agrees with this.
We need a new architecture, one with both short-term and long-term memory, multimodality, and less need for training data, for us to reach AGI.
The current path of scaling up transformer models will stagnate at GPT-4 or GPT-5 because we simply don't have enough data on the collective internet for us to keep scaling it further than that.
MrEloi t1_j55svcd wrote
>we simply don't have enough data on the collective internet for us to keep scaling it further than that.
Why do we need more data? We already have a lot.
We now need to work more on the run-time aspects, e.g. short- and long-term memory.
ElvinRath t1_j56gy3g wrote
Either we need new architectures or more data.
Right now, even if we somehow converted YouTube into text, which could be done, there is just not enough data to efficiently train a 1T-parameter model.
And in text form alone, there is probably not enough even for 300B...
So, yeah, there is not enough data to keep scaling up.
It might be different with multimodal models; I don't know about that.
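A rough sketch of the data requirement described in the comment above, using the Chinchilla rule of thumb of roughly 20 training tokens per parameter. The rule of thumb is an approximation, and the loop below is purely illustrative arithmetic, not a claim about any particular model's actual training set.
```python
# Back-of-the-envelope estimate of compute-optimal training data
# under the Chinchilla-style heuristic (~20 tokens per parameter).
TOKENS_PER_PARAM = 20  # rough rule of thumb, not an exact law

def optimal_tokens(n_params: float) -> float:
    """Approximate training tokens for a compute-optimal model of this size."""
    return TOKENS_PER_PARAM * n_params

for n_params in (70e9, 300e9, 1e12):
    print(f"{n_params/1e9:>6.0f}B params -> ~{optimal_tokens(n_params)/1e12:.0f}T tokens")

# Output:
#     70B params -> ~1T tokens
#    300B params -> ~6T tokens
#   1000B params -> ~20T tokens
```
Under this heuristic a 1T-parameter model would want on the order of 20 trillion training tokens, which is the scale of data problem the comment is pointing at.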
genshiryoku t1_j55te6y wrote
Because GPT-3 was trained on almost all publicly available data, and GPT-4 will be trained by transcribing all video footage on the internet and feeding the transcripts to it.
You can't scale the model up without scaling the training data with it. The bottleneck is the training data and we're running out of it.
It's not like the internet is suddenly going to 10x in size over the next couple of years. Especially as the global population is shrinking and most people are already connected online so not a lot of new data is made.
Surur t1_j566uy8 wrote
The next step is real-time experiential data, from live video cameras, robot bodies, self-driving cars.
genshiryoku t1_j568vwe wrote
That's not a whole lot of data and doesn't compare to the gargantuan amount of data already on the internet, generated over decades.
The current transformer model scaling will hit a wall soon due to lack of training data.
Clawz114 t1_j56nnoj wrote
>Because GPT-3 was trained on almost all publicly available data
GPT-3 was trained with around 45TB of data, which is only around 10% of the Common Crawl database that makes up 60% of GPT-3's training dataset.
>Especially as the global population is shrinking and most people are already connected online so not a lot of new data is made.
The global population is growing and is expected to continue growing until just over the 10 billion mark?
Gohoyo t1_j57azvh wrote
> It's not like the internet is suddenly going to 10x in size over the next couple of years. Especially as the global population is shrinking and most people are already connected online so not a lot of new data is made.
I don't get this. Can't AI generate more data for itself in like, a year, than all human communications since the dawn of the internet? Why would the internet need to 10x in size if the population gets a hold of AI that increases the amount of content generated by x1000? Seems like you just need an AI that generates a fuck ton of content and then another one that determines what in that content is "quality". I am totally ignorant here, I just find the 'running out of data' thing quite strange.
genshiryoku t1_j57bbc9 wrote
You can't use AI-generated data to train an AI, because those outputs are essentially drawn from its own training distribution. Training on synthetic data like that leads to overfitting and reduces the performance and effectiveness of the AI.
Gohoyo t1_j57c8yb wrote
Does this mean it only learns from novel information it takes in? As in, it can never learn anything more about cat conversations after the 10th conversation it reads about a cat? I mean, what's the difference between it reading something it made versus reading something a person wrote that says something similar? I just can't figure out why you can't get around this by using AI somehow.
Like: AI A makes a billion terabytes of content.
AI B takes in content and makes it 'unique/new/special' somehow.
Give it back to AI A or even a new AI C.
genshiryoku t1_j57dtsz wrote
Without going too deep into it: this is a symptom of transformer models, and my argument was about why transformer models like GPT can't keep scaling up.
It has to do with the mathematics behind training AI. Essentially, for every piece of data the AI refines itself, but for copies of data it overcorrects, which results in inefficiency or worse performance. Synthetic data acts much the same as duplicate data, in that the model overcorrects and worsens its own performance.
If you are truly interested you can see for yourself here.
And yes, AI researchers are looking for models to detect which data on the internet is synthetic, because it's inevitable that new data will be machine-generated and can't be trained on. If we fail at that task we might even enter an "AI dark age" where models get worse and worse with time, because the internet will be filled with AI-generated garbage data that can't be trained on. That is the worst-case scenario.
Gohoyo t1_j57fu2a wrote
Thanks for trying to help me btw.
I watched the video. I can understand why reading its own data wouldn't work, but I can't understand why having it create a bunch of data, then altering that data, then giving it back to the AI wouldn't. The key here is that we have machines that can create data at superhuman speeds. There has to be some way to do something with that data to make it useful to the AI again, right?
genshiryoku t1_j57h1fb wrote
The "created data" is merely the AI mixing the training data in such a way that it "creates" something new. If the dataset is big enough this looks amazing and like the AI is actually creative and creating new things but from a mathematics perspective it's still just statistically somewhere in between the data it already has trained on.
Therefor it would be the same as feeding it its own data. To us it seems like completely new, and actually useable data though which is why ChatGPT is so exciting. But for AI training purposes it's useless.
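A toy illustration of the point above (not the commenter's own example): if you fit a simple statistical model to real data and then repeatedly sample "synthetic" data from the fit and refit on it, later generations learn nothing new about the real distribution, and the fit tends to drift away from it.
```python
# Toy illustration: refitting a model on its own samples adds no new
# information about the real distribution and tends to degrade the fit
# over many generations (a "model collapse"-style effect).
import numpy as np

rng = np.random.default_rng(0)

real_data = rng.normal(loc=0.0, scale=1.0, size=200)  # stand-in for real-world data
mu, sigma = real_data.mean(), real_data.std()         # generation-0 "model"

for generation in range(1, 501):
    synthetic = rng.normal(mu, sigma, size=200)    # data the model "creates"
    mu, sigma = synthetic.mean(), synthetic.std()  # refit on its own output
    if generation % 100 == 0:
        print(f"gen {generation:3d}: mu={mu:+.3f} sigma={sigma:.3f}")

# The real sigma is 1.0. The refitted sigma typically drifts away from it
# over the generations (the exact trajectory depends on the random seed),
# and at no point does the loop recover any information beyond what the
# very first fit already contained.
```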
Gohoyo t1_j57hihv wrote
If ChatGPT creates a paragraph and I then take that paragraph and alter it significantly, how is that new, never-before-seen paragraph not new data for the AI?
genshiryoku t1_j57j6s1 wrote
It would be lower-quality data, but still usable if significantly altered. The question is: why would you do this instead of just generating real data?
GPT is trained on human language; it needs real interactions to learn from, like the one we're having right now.
I'm also not saying that this isn't possible. We are AGI-level intelligences, and we have consumed far less data over our lifetimes than GPT-3 did, so we know it's possible to reach AGI with relatively little data.
My original argument was merely that it's impossible with current transformer models like GPT, and that we need another breakthrough in AI architecture to solve problems like this rather than merely scaling up current transformer models, because the training data is going to run out over the next couple of years as all of the internet gets used up.
Gohoyo t1_j57jyq4 wrote
> Why would you do this instead of just generating real data?
The idea would be that harnessing the AI's ability to create massive amounts of regurgitated old data quickly and then transmuting it into 'new data' somehow is faster than acquiring real data.
I mean I believe you, I'm not in this field nor a genius, so if the top AI people are seeing it as a problem then I have to assume it really is, I just don't understand it fully.
docamazing t1_j59aaud wrote
I think you are incorrect here.
Baturinsky t1_j5697wk wrote
I think AI will train on data from the people using it.
genshiryoku t1_j56btvq wrote
The problem is the total amount of data and the quality of the data. Humans using an AI like GPT-3 don't generate nearly enough data to properly train a new model, not even with decades of interaction.
The demand for training data grows with the parameter count of the transformer model. This essentially means that, mathematically, transformer models are a losing strategy and aren't going to lead to AGI unless you have an unlimited amount of training data, which we don't.
We need a different architecture.
[deleted] t1_j56o68x wrote
[deleted]
No-Dragonfly3202 t1_j56o52a wrote
I want this to go faster lol
dsiegel2275 t1_j578xvu wrote
There’s a lot of research happening on optimized training algorithms. Breakthroughs there can have some profound effects.
cloudrunner69 t1_j55l7tn wrote
S curves feeding off S curves.
[deleted] t1_j591c2i wrote
[deleted]
HeinrichTheWolf_17 t1_j55cb1y wrote
Regardless, it’s certainly happening this decade.
Ok_Homework9290 t1_j55tw2u wrote
It could possibly happen this decade, but I think that saying it's a certainty to do so is very optimistic. I personally doubt it will.
But I do think it's an almost certainty that it will happen this century. I think most AI researchers would agree with this statement, but not with it being a foregone conclusion this decade.
BadassGhost t1_j55vl62 wrote
I really struggle to see a hurdle on the horizon that will stop AGI from happening this decade, let alone in the next 3 years. It seems the only major problems are hallucination of truth and memory loss. I think both are solved by using retrieval datasets in a smart way.
ASI, on the other hand, definitely might be many years away. Personally I think it will happen this decade also, but that's less certain to me than AGI. Definitely possible that becoming significantly smarter than humans is really really difficult or impossible, although I imagine it isn't.
It will probably also be an extinction-level event. If not the first ASI, then the 5th, or the 10th, etc. The only way humanity survives is if the first ASI gains a "decisive strategic advantage", as Nick Bostrom would call it, and uses that advantage to basically take over the entire world and prevent any new dangerous ASIs from being created.
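The comment above leans on "retrieval datasets" to curb hallucination and memory loss. A minimal sketch of that idea, assuming a plain TF-IDF retriever and a hypothetical `ask_llm` placeholder standing in for whatever language model is used: retrieve the most relevant stored passage and prepend it to the prompt, so the model answers against retrieved text rather than from its weights alone.
```python
# Minimal retrieval-augmented prompting sketch. The knowledge snippets
# and ask_llm() are placeholders, not a real production system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "Nick Bostrom uses the term 'decisive strategic advantage'.",
    "Transformer models are trained on large text corpora.",
    "Wolfram|Alpha can evaluate mathematical expressions.",
]

vectorizer = TfidfVectorizer().fit(knowledge_base)
kb_vectors = vectorizer.transform(knowledge_base)

def retrieve(query: str) -> str:
    """Return the stored passage most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), kb_vectors)[0]
    return knowledge_base[scores.argmax()]

def ask_llm(prompt: str) -> str:
    # Placeholder: call whatever language model you actually use here.
    return f"<model answer grounded in: {prompt!r}>"

question = "Who uses the term 'decisive strategic advantage'?"
context = retrieve(question)
print(ask_llm(f"Context: {context}\nQuestion: {question}\nAnswer:"))
```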
ArgentStonecutter t1_j563lt0 wrote
Self-improving AGI will be followed by ASI so quickly we'll be standing around like a pack of sheep wondering where the sheepdog came from.
BadassGhost t1_j5654od wrote
This is my guess as well, but I think it's much less certain than AGI happening quickly from this point. We know human intelligence is possible, and we can see that we're pretty close to that level already with LLMs (relative to other intelligences that we know of, like animals).
But we know of exactly 0 superintelligences, so it's impossible to be sure that it's as easy to achieve as human-level intelligence (let alone if it's even possible). That being said, it might not matter whether or not qualitative superintelligence is possible, since we could just make millions of AGIs that all run much faster than a human brain. Quantity/speed instead of quality
ArgentStonecutter t1_j56fhsa wrote
I don't think we're anywhere near human-level intelligence, or even general mammalian intelligence. The current technology shows no signs of scaling up to human intelligence, and fundamental research into the subject is required before we have a grip on how to get there.
BadassGhost t1_j56i9dt wrote
https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks
LLMs are close to, equal to, or beyond human abilities in a lot of these tasks, though on some of them they're not there yet. I'd argue this is pretty convincing evidence that they are more intelligent than typical mammals in abstract thinking. Clearly animals are much more intelligent in other ways, even more so than humans in many domains (e.g. the experiment where chimps select 10 numbers on a screen, in order, from memory). But in terms of high-level reasoning, LLMs are pretty close to human performance.
ArgentStonecutter t1_j56sxck wrote
Computers have been better than humans at an increasing number of tasks since before WWII. Many of these tasks, like Chess and Go, were once touted as requiring 'real' intelligence. No possible list of such tasks is even meaningful.
BadassGhost t1_j570h0y wrote
Then what would be meaningful? What would convince you that something is close to AGI, but not yet AGI?
For me, this is exactly what I would expect to see if something was almost AGI but not yet there.
The difference from previous specialized AI is that these models are able to learn seemingly any concept, both in training and after training (in context). Things that are out of distribution can be taught with a single digit number of examples.
(I am not the one downvoting you)
sumane12 t1_j567fqu wrote
I agree. Short-term memory and long-term learning will avoid hallucinations, and it does look like GPT-3 + Wolfram|Alpha has largely solved this problem; it's not a perfect solution, but it will do for now.
I'm very much an immediate-takeoff proponent when it comes to ASI. Not only can it think at light speed (humans tend to think at about the speed of sound), it has immediate access to the internet, it can duplicate itself over and over as long as there is sufficient hardware, and its knowledge is infinitely expandable as long as you have more hard drive space.
With these key concepts, and again assuming an agent that can act and learn like a human, I just don't see how it would not be immediately superhuman in its abilities. Its self-improvement might take a few years, but as I say, I think its ability to outclass humans would be immediate.
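A rough sketch of the GPT + Wolfram|Alpha pattern mentioned above: route questions that look computational to an external engine and let the language model handle the rest. Both `call_language_model` and `call_wolfram_alpha` are hypothetical placeholders here, not real API bindings.
```python
# Sketch of tool routing: send computational questions to an external
# engine instead of letting the language model guess. The two call_*
# functions below are hypothetical placeholders.
import re

def call_language_model(prompt: str) -> str:
    return f"<LLM answer to: {prompt}>"            # placeholder

def call_wolfram_alpha(query: str) -> str:
    return f"<Wolfram|Alpha result for: {query}>"  # placeholder

MATH_PATTERN = re.compile(r"\d+\s*[-+*/^]\s*\d+|integral|derivative|solve",
                          re.IGNORECASE)

def answer(question: str) -> str:
    """Route obviously computational questions to the external tool."""
    if MATH_PATTERN.search(question):
        return call_wolfram_alpha(question)
    return call_language_model(question)

print(answer("What is 12345 * 6789?"))      # routed to the tool
print(answer("Write a haiku about rain."))  # routed to the language model
```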
Ok_Homework9290 t1_j568ayw wrote
>I really struggle to see a hurdle on the horizon that will stop AGI from happening this decade, let alone in the next 3 years.
To be honest, I base my predictions on the average predictions of AI/ML researchers. To my knowledge, only a minority of them believe we'll get there this decade, and even fewer in a mere 3 years.
>It seems the only major problems is hallucination of truth and memory loss.
As advanced as AI is today, it isn't even remotely close to being as generally smart as the average human. I think that to close that gap, we would need a helluva lot more than making an AI that never spews nonsense and can remember more things.
BadassGhost t1_j56atbp wrote
> To be honest, I base my predictions on the average predictions of AI/ML researchers. To my knowledge, only a minority of them believe we'll get there this decade, and even less in a mere 3 years.
I think there's an unintuitive part of being an expert that can actually cloud your judgement. Actually building these models and being immersed day in and day out in the linear algebra, calculus, and data science makes you numb to the results and to the extrapolation of them.
To be clear, I think amateurs who don't know how these systems work are much, much worse at predictions like this. I think the sweet middle ground is knowing exactly how they work, down to the math and actual code, but without being the actual creators whose day jobs are to create and perfect these systems. I think that's where the mind is clear to understand the actual implications of what's being created.
>As advanced as AI is today, it isn't even remotely close to being as generally smart as the average human. I think to close that gap, we would need a helluva lot more than than making an AI that is never spewing nonsense and can remember more things.
When I scroll through the list of BIG-bench examples, I feel that these systems are actually very close to human reasoning, with just a few missing puzzle pieces (mostly hallucination and long-term memory).
https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks
You can click through the folders and look at the task.json to see what it can do. There are comparisons to human labelers.
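A small sketch of how one might inspect a BIG-bench task file like the ones linked above. The field names (`examples`, `input`, `target`) match the common BIG-bench JSON task layout, but treat them as assumptions; some tasks use `target_scores` or are programmatic, and the path below is hypothetical.
```python
# Sketch: peek at a BIG-bench JSON task to see the prompts and expected
# answers that models are compared against. Field names are assumed from
# the common task layout; handle missing keys defensively.
import json
from pathlib import Path

task_path = Path("bigbench/benchmark_tasks/some_task/task.json")  # hypothetical path

task = json.loads(task_path.read_text())
print(task.get("name"), "-", task.get("description", "")[:80])

for example in task.get("examples", [])[:3]:
    print("INPUT: ", example.get("input"))
    print("TARGET:", example.get("target", example.get("target_scores")))
    print("---")
```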
Aquamarinemammal t1_j57nkxi wrote
Just fyi, the condition that follows ‘let alone’ is usually the more conservative one. But I see plenty of hurdles, and I’m not convinced any of them can be overcome via scale or data-banks alone. Ability to remember and to distinguish truth from fiction are important, but LLMs also lack first-order logic and symbolic reasoning.
I think the last of these is going to be particularly tricky. I’m not aware of any substantial progress on abstraction for neural nets / ML in recent years; in fact, as I understand them, they seem fundamentally incapable of it. Giant functions / prediction machines just aren’t enough, and I struggle to see how people could think otherwise. This type of training detects concrete local patterns in the dataset, but that’s it - these models can’t generalize their observations in any way. Recurrent NNs and LSTMs maybe show some promise. I certainly wouldn’t get my hopes up that it’ll just be handed to us soon as an emergent property
BadassGhost t1_j5a7cip wrote
Fair, I should have swapped them!
What leads you to believe LLMs don't have first-order logic? I just tested it with ChatGPT and it seems to have a firm grasp on the concept. First-order logic seems to be pretty low on the totem pole of abilities of LLMs. Same with symbolic reasoning. Try it for yourself!
I am not exactly sure what you mean by abstraction for neural nets. Are you talking about having defined meanings of inputs, outputs, or internal parts of the model? I don't see why that would be necessary at all for general intelligence. It doesn't seem that humans have substantial, distinct, and defined meanings for most of the brain, except for language (spoken and internal). Which LLMs are also capable of.
The human brain seems to also be a giant function, as far as we can tell (ignoring any discussion about subjective experience, and just focusing on intelligence).
> This type of training detects concrete local patterns in the dataset, but that’s it - these models can’t generalize their observations in any way.
No offense, but this statement seems to really show a lack of knowledge about the last 6+ years of NLP progress. LLMs absolutely can generalize outside of the training set. That's kind of the entire point of why they've proved useful and why the funding for them has skyrocketed. You can ask ChatGPT to come up with original jokes using topics that you can be pretty certain have never been put together for a joke, you can ask it to read your code that has never been seen before and give recommendations and answers about it, you can ask it to invent new religions, etc etc.
These models are pretty stunning in their capability to generalize. That's the whole point!
TopicRepulsive7936 t1_j583aot wrote
GPTs are already superhuman.
Surur t1_j55s75d wrote
I feel that symbolic thinking still needs to be solved, but maybe this is an emergent property.
croto8 t1_j56jl1g wrote
I think symbolic thinking may be inextricably linked to a sense of self. To give AI what we think of as understanding requires context and the perceiver’s acknowledgement of the symbol in a larger setting, rather than just pattern recognition at scale.
EVJoe t1_j572xpw wrote
Consider synesthesia, the phenomenon wherein a sensory stimulus in one channel (let's say hearing) activates a sensory perception in another sensory channel (let's say vision).
Imagine you have synesthesia, and you're a pre-linguistic human surviving in the wild. You hear a tiger roar, and via synesthesia you also "see" bright red, then a member of your tribe gets eaten by a tiger while others flee.
For such a person, "seeing" red now has personal symbolic meaning associated with tiger danger.
Symbolism does not need to derive from culture or a larger system. All you need is the capacity to recognize patterns between various stimuli. What that looks like for a "mind" that isn't limited to human sensation is another question entirely
croto8 t1_j573xp6 wrote
Your example uses personal experience to create the symbolic representation and subsequent association. Kind of my point.
Edit: to elaborate, pattern recognition could create a similar outcome by using training data which has this symbolic pattern inherent in it, but without the personal experience, sense of risk, and context, it's just gradient descent on an objective function that was configured to emulate the process.
AsuhoChinami t1_j56ir0n wrote
>AGI occurs across a spectrum: from ‘error prone’, not-too smart or ‘savant’-like (sub-human reliability or intelligence or of limited scope)
This makes the definition so lenient as to render it kind of meaningless, I think. Might as well say that SmarterChild was AGI for being smarter than a newborn baby. I think the "floor" for AGI should be something that's competent at almost all intellectual tasks (maybe not fully on par with humans, but competent) and is generally about as smart as a human who's at minimum on the lower end of average. (I think we'll get there during 2023-2024, plz don't kill me techno-skeptics)
Akimbo333 t1_j58h4r0 wrote
Don't worry, I won't kill ya lol! That is very interesting, though! We don't even have a multimodal AI yet.
Desperate_Excuse1709 t1_j59bcqj wrote
AGI 2050
AsuhoChinami t1_j59bi38 wrote
Okay... thanks.
No_Ninja3309_NoNoYes t1_j568whv wrote
I think this is more wishful thinking and anchoring than something to be taken seriously. Which is fine. It's okay to dream. But AGI requires so many things that it's hard to list them all; I will try:
- Learning how to learn
- Independent data acquisition
- Common sense. This is hard to define. It could mean a set of rules. It could mean a database of facts.
- Embodied AI
- Something that resembles brain plasticity. Brains can create synapses. Dendrites can branch.
- Wake/sleep cycles. AI will be gathering data or it's of no use to us. I mean, if we have to acquire and clean data for it, when will we get to enjoy VR? So AI will acquire data and then process it when the opportunity presents itself.
None of these items seem trivial to me. I don't see how they can all be done by 2024.
MrEloi t1_j55smpe wrote
Some say that ChatGPT is "just a database which spits out the most probable next word".
These naysayers should dive into how transformer systems really work.
It's clear (to me at least) that these systems embody most/much of what a true AI needs.
That linked article covers the next steps towards AI comprehensively.
Practical-Mix-4332 t1_j55k6se wrote
Some researchers (Huijeong Jeong and Vijay Namboodiri of the University of California, San Francisco) did a study recently that found we’ve been doing AI backwards the whole time.
CodytheGreat t1_j55obyk wrote
Is this what you're referring to? https://www.economist.com/science-and-technology/2023/01/18/a-decades-old-model-of-animal-and-human-learning-is-under-fire
I'm trying to find a non-paywalled write-up on this. Sounds interesting.
tatleoat t1_j55q1xh wrote
kmtrp t1_j5azchw wrote
Yes baby.
ihateshadylandlords t1_j55u471 wrote
I hope so, only time will tell…
!RemindMe 2 years
RemindMeBot t1_j55u91k wrote
I will be messaging you in 2 years on 2025-01-20 16:39:15 UTC to remind you of this link
rixtil41 t1_j575bub wrote
!RemindMe 3 years
r0cket-b0i t1_j58upls wrote
It's funny how one can write a whole article full of details and yet not see that it logically collapses at a very basic, foundational understanding of the topic.
ChatGPT has no model of the world it operates in and does not "understand" what it says; you absolutely cannot evolve that into AGI.
You can, however, build very smart AI systems on top of it that would accelerate human progress across multiple domains. Or you can fake it and make it feel as if it understands the context, to make the user experience better, but functionally it would not work; it's an "expert" type of AI.
AGI by 2025 is possible, but ChatGPT is a sign of overall industry progress, not necessarily of progress towards AGI. If we had an AI that acted like a dog, with everything that implies, such as the ability to train it with voice and gesture commands, teach it to solve a dynamic puzzle like managing a bunch of sheep in a field, and then have it perform those tasks and learn to do better over time, that would be a very clear sign of progress towards AGI, and we do have some projects around that; ChatGPT is not one of them.
DeveloperGuy75 t1_j59irqi wrote
It needs to have curiosity and be able to ask questions, but then it might start asking what some might think are the wrong questions
kmtrp t1_j5ae1py wrote
There's a paywall, and the spaywall extension can't find the article. Are we all supposed to be paying members of Medium, or am I missing something obvious?
EDIT: Incognito works
ArgentStonecutter t1_j56392j wrote
LOL
kmtrp t1_j5aecj5 wrote
What?
ArgentStonecutter t1_j5ag73s wrote
I’m laughing at the funny post.
tatleoat t1_j55er3n wrote
I agree. Up to this point, most everything we've seen has been cartoonishly simplified demonstrations in virtual worlds, or low-stakes requests like retrieving a can of soda. I don't think these are simple little games because that's all AI is capable of right now; I think they're simple tasks just for demonstrative purposes, and the AIs themselves are actually capable of much more as-is.
Couple this with the fact that the public is informed of AI progress much later than the AI's creation itself, AND the fact that the public can't know too many specifics because it's a national security risk, AND it could be hooked up to GPT-4, AND it's multimodal, AND OpenAI has 10 billion dollars to throw at compute, AND we have AIs helping us create the next batch of AIs (plus much more going for it), and you have an insane number of reasons why the truly life-changing stuff is so much closer at hand than you might otherwise intuit.