Submitted by ReExperienceUrSenses in r/Futurology
TL;DR: (and it's long)
What I am trying to argue here is that “intelligence” is complex enough to be inseparable from the physical processes that give rise to it, and, if that is not convincing, that the “computing” power necessary to mimic it is unobtainable with any machinery we have created to date, with nothing on the horizon. Any future machine that could match it would hardly be called a computer at that point, because its inner workings would have to be radically different. I also offer some criticisms of Large Language Models and neural networks in general: they don't work the way people seem to think.
I make this post not because I’m trying to get into a “debate” where I beat everyone's opinion down, or to be a doom n' gloom downer, but because I'm hoping for a discussion to work through my thoughts and maybe yours. I have been mulling over some problems with the field of Artificial Intelligence, and in the process I have found myself convinced that artificial general intelligence is never going to happen.
So I present some questions and ideas to people who still believe the hype, and to those who may not buy the current hype but still believe it will happen eventually. I want to refine my thinking and see if there are holes in my reasoning from anything I have missed. I’m perfectly willing to change my mind; I just need a convincing argument and some good evidence.
So with that out of the way we’ll start with this:
Nobody would ever say that a simulated star gives us a nuclear fusion reactor, yet we assume a simulated or emulated brain will give us a mind? Why? I know many of you are itching to trot out “we don’t flap wings to make planes fly! Do submarines SWIM?” but there is a massive flaw in this reasoning. We’ve worked out the principles that govern flight and underwater traversal, so we can create alternative methods towards those ends. We have NOT worked out the fundamental principles necessary to create intelligence/cognition/perception by any other means; all we're working with is what it feels like to think, which is very subjective. Neural networks are also not a simulation of neurons in any sense, neither replicating any of their “base” functionality in an abstract form nor trying to accurately model their attributes.
The limits of the current paradigm, and any future one, come from what I think is a fundamental misunderstanding of "the symbol grounding problem", or rather, of what has to be dealt with in order to overcome it. Without solving this, these systems will not have any generalized reasoning ability or common sense. Language models give us the illusion that we can solve this with words, and I think I can articulate why this is not the case. Word association is not enough.
How are our minds “grounded”? How do you define the meaning of the words we use? How do you define what anything actually IS? Words and definitions of words are meaningless symbols without us to interpret them. Definitions can be spun out endlessly, because the words within those definitions also need to be defined. You are stuck in an unending recursive loop: there is no base case, only more arbitrary symbols. You can scale these “neural” networks to infinite parameters and it will not matter. Imagine trying to make sense of a word cloud written in a foreign language that does not use your alphabet. The base case, the MEANING, comes from visceral experience. So what are the fundamental things that make up an experience of reality, making a common sense understanding of things like cause and effect possible?
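To make that "recursive loop with no base case" concrete, here is a toy sketch (the dictionary entries, words, and function are all made up for illustration): a program that tries to ground a word using only other words never bottoms out in anything but more symbols.

```python
# Toy illustration of circular definitions: every entry is defined only in
# terms of other entries, so "grounding" a word by lookup never terminates
# in anything except another symbol (or a loop).
TOY_DICTIONARY = {
    "red":   ["the", "color", "of", "blood"],
    "color": ["a", "property", "of", "light"],
    "light": ["radiation", "that", "produces", "color"],
}

def ground(word, seen=None):
    """Try to reduce a word to something that is not just another word."""
    seen = seen if seen is not None else set()
    if word in seen:
        return f"back to '{word}' again: no base case, only more symbols"
    seen.add(word)
    definition = TOY_DICTIONARY.get(word)
    if definition is None:
        return f"'{word}': undefined, still just a symbol"
    # All we can ever do is recurse into yet more words.
    return [ground(w, seen) for w in definition]

print(ground("red"))
```

Nothing in that program, at any scale, ever touches the thing the word "red" refers to; that is the whole complaint.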
Much like a star, our brains are a real, physical object undergoing complicated processes. In a star, the fusion of atoms results in a massive release of heat and energy, and that release is what we want to capture in a reactor. In the cells of our brains, immensely complex biochemistry is carried out by the interactions of a vast number of molecular machines. Matter is being moved about and broken down for energy to carry out the construction of new materials and other processes.
We have grounding because, in order to experience reality, we are both transformed by it and transformers of it. All of the activity carried out by a cell is the result of the laws of physics and chemistry playing out, with natural selection iteratively refining the form and function of the molecules that prove useful in their environment for metabolism and self-replication.
Your brain isn’t taking in data to be used by algorithms; neurons are NOT passive logic-circuit elements! Action potentials are not like clock cycles in computers, shunting voltage along rigid paths of logic-gated circuitry; their purpose is to activate a variety of other intracellular processes.
The cells of your brain and body are literally being transformed by their own contents and their interactions with their environment, shaping and reshaping every moment of their activity. Photons of light hit the cells in your eye, triggering the sequential activation of the tiny finite state machines known as signal transduction proteins. The internal state of the cell transforms and neurotransmitter release changes, once again triggering the sequential activation of signaling proteins in cells downstream. This is real chemical and mechanical transformation, a complex exchange of matter and energy between you and your environment. You understand cause and effect because every aspect of your being, down to the molecule, depends on it and is molded by it. An experience is defined by the sum total of all of this activity happening not just in the cells of your brain but everywhere in your entire body. Perception and cognition are probably inseparable for this reason.
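If it helps to picture the "tiny finite state machine" framing, here is a cartoon of a single receptor protein as one (the states, event names, and transitions are drastically simplified stand-ins, not real biochemistry):

```python
# Cartoon of one signal-transduction protein as a finite state machine.
# States and "events" are simplified stand-ins; in the real cell the
# transitions are physical encounters between molecules, not symbolic
# events handed to a program -- which is exactly what this cartoon omits.
from enum import Enum, auto

class ReceptorState(Enum):
    INACTIVE = auto()
    ACTIVATED = auto()       # e.g. after absorbing a photon
    DESENSITIZED = auto()    # temporarily unresponsive while it resets

TRANSITIONS = {
    (ReceptorState.INACTIVE, "photon"):         ReceptorState.ACTIVATED,
    (ReceptorState.ACTIVATED, "phosphorylate"): ReceptorState.DESENSITIZED,
    (ReceptorState.DESENSITIZED, "reset"):      ReceptorState.INACTIVE,
}

def step(state, event):
    """Advance the cartoon receptor by one event; unknown events change nothing."""
    return TRANSITIONS.get((state, event), state)

state = ReceptorState.INACTIVE
for event in ["photon", "phosphorylate", "reset"]:
    state = step(state, event)
    print(event, "->", state.name)
```

Multiply that by billions of proteins per cell, all transitioning by bumping into one another rather than by reading discrete events, and you get a sense of what the prose above is gesturing at.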
There is no need for models of anything in the brain. Nothing has to be abstracted out and processed by algorithms to produce a desired result. The physical activity and shifting state ARE the result; no further interpretation is necessary.
Now let us examine what is actually happening in a deep learning system. The activity of neural networks is arbitrary-symbol manipulation. WE hand-craft the constraints to retrieve desired results. Don’t let the fancy words and mathy-math of the black box impress you (or convince you to speculate that something deeper is happening); focus on examining the inputs and the outputs.
The fundamental flaw of a Large Language Model remains the same as the flaw of the old expert systems. That flaw is, again, the grounding problem: how it is that words get their meanings. The training dataset is the exact same thing as the prior art of hand-coded logic rules and examples. Human beings rank the chatbot's outputs to supply the value system the reinforcement mechanism uses to pick the most viable answer to a prompt. The black box just averages all of this together so it can match a statistically relevant output to the input. There is no reasoning going on here; these systems don't even handle simple negation well. It only looks like reasoning because the structure of the words reads well to us, the product of mining a vast corpus of text for the frequencies with which words appear together.
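For a deliberately crude picture of "frequencies with which words appear together", here is a toy next-word predictor built from nothing but co-occurrence counts (real LLMs learn far richer statistics over subword tokens with transformers, but every string below is still just a symbol to the program):

```python
# Toy next-word predictor: count which word follows which in a tiny corpus,
# then always emit the most frequent follower. Real LLMs are vastly more
# sophisticated, but they too are fit to co-occurrence statistics of tokens,
# not to anything the tokens refer to.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the statistically most common follower; the program has no idea
    what a 'cat' or a 'mat' is, only which symbol tends to come next."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat' -- purely because of the counts
print(predict_next("cat"))   # 'sat' -- ties broken by whichever was seen first
```

Scale the corpus and the model up by many orders of magnitude and you get fluency; the question this post is pressing is whether you ever get grounding.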
Ask any linguist or psychologist: humans do not learn language like this, and we do not make and use language like this. I must emphasize that we are NOT just doing next-word prediction in our heads. Kids won't pick up language from passive exposure alone, even with TV.
You cannot overcome this problem with extra data sources like images and labeled associations either. Which pixel values are the ones that represent the thing you are trying to associate, and why? Human beings are going into these datasets and labeling the images. Human beings are going in and setting the constraints of the games (possible state space, how to transition between states, formalization of the problem). Human interpretation is hiding somewhere in all of these deep learning systems; we have not actually devised any methods that work without us.
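A minimal sketch of how much work the human-chosen labels do (toy data and made-up label strings, nothing like a real vision pipeline): the classifier below is equally happy with whatever strings we attach to the same pixel statistics, which is the "change the labels, change the whole outcome" point made a couple of paragraphs down.

```python
# Toy "image" classifier: each image is just a list of pixel brightness
# values, and the classifier learns the average brightness per label.
# The labels are arbitrary strings supplied by a human; swap them and the
# exact same pixel statistics produce the opposite answers.

def train(images, labels):
    """Map each label to the mean pixel value of its training images."""
    sums, counts = {}, {}
    for pixels, label in zip(images, labels):
        sums[label] = sums.get(label, 0) + sum(pixels) / len(pixels)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(model, pixels):
    """Pick the label whose learned average brightness is closest."""
    brightness = sum(pixels) / len(pixels)
    return min(model, key=lambda label: abs(model[label] - brightness))

images = [[0.9, 0.8, 0.95], [0.1, 0.2, 0.05]]    # one bright, one dark "image"

model_a = train(images, ["daylight", "night"])
model_b = train(images, ["night", "daylight"])   # same pixels, labels swapped

test = [0.85, 0.9, 0.8]
print(classify(model_a, test))  # 'daylight'
print(classify(model_b, test))  # 'night' -- nothing about the pixels changed
```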
While the individual human beings labeling the data attempt to define what red is for the machine, with words and pixel values, merely thinking about “red” is literally altering the chemistry all across their brains as they re-experience incidents where they encountered that wavelength of electromagnetic radiation and what transpired after.
This is why there cannot be grounding and common sense in these systems; the NN can’t ever “just know” the way life can, because it cannot directly experience reality without it first being interpreted by us. It’s a big bunch of matrix math that only has a statistical model of text tokens and pixel values, built by averaging symbols of our experience of reality. Even the output only has meaning because the output is meaningful to us. They do absolutely NOTHING on their own. How can they perform dynamic tasks in unstructured environments without us to painstakingly define and structure everything first?
Change the labels? You change the whole outcome.
You can’t change the laws of physics.
We exist in the moments when molecules bump into each other. You can’t simulate that; you have to DO it, because the variance in how these bumps occur produces all of our differences, fallibility, and flexibility.
The molecular dynamics are not only still too poorly understood to distill into an algorithm, but too complex to even simulate in real time. There isn’t enough computing power on the planet to simulate all of the action in a single cell, let alone the trillions of cells we are made of, on a human time scale with reliable accuracy.
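For a rough sense of scale, here is the kind of back-of-envelope arithmetic behind that claim for an all-atom molecular dynamics approach; every number below is an assumed order-of-magnitude placeholder, not a measurement.

```python
# Back-of-envelope only: every number is a rough, assumed order of magnitude.
atoms_per_cell = 1e14            # assumed: atoms in one mammalian cell
timestep_seconds = 1e-15         # femtosecond-scale steps, typical for all-atom MD
simulated_seconds = 1.0          # simulate just one second of the cell's life

# Assumed throughput: 1e12 atom-timesteps advanced per wall-clock second,
# meant as a generous figure for dedicated molecular-dynamics hardware.
atom_steps_per_second = 1e12

total_atom_steps = atoms_per_cell * (simulated_seconds / timestep_seconds)
wall_clock_seconds = total_atom_steps / atom_steps_per_second
wall_clock_years = wall_clock_seconds / (3600 * 24 * 365)

print(f"atom-timesteps for one cell-second: {total_atom_steps:.0e}")    # ~1e29
print(f"wall-clock years at the assumed rate: {wall_clock_years:.0e}")  # ~3e9
```

Whatever the exact figures, the gap is many orders of magnitude, and that is for one cell-second, not trillions of cells over a lifetime, and it ignores everything molecular dynamics itself abstracts away.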
Bonus: Moravec’s paradox is still kicking our ass. Single-celled organisms (eukaryotes specifically) and the individual cells of our immune system navigate unstructured environments and complete specific, complex tasks in a manner that puts all of our robots to shame. Picture cells as tiny molecular robots assembled from an incredible number of complex, nested finite state machines, and then watch the Kurzgesagt videos about the immune system. The “computing” power on display is unmatched.
ttkciar wrote:
That's not a bad line of reasoning, but I posit that your leap from "deep learning systems will never be AGI" to "AGI is never going to happen" might be unfounded.