[deleted] t1_iyalxfk wrote
Reply to comment by mrconter1 in [r] The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable - LessWrong by visarga
[deleted]
-horses t1_iyb519c wrote
Early climatologists applied very basic and well-known theories of the physical world to reach an unexpected conclusion, which they presented to the public with humility despite the evident urgency. Early AGI theorists were inspired by science fiction and chose models of rationality that are generically uncomputable or intractable (infinitely powerful computers, ideal Bayesian agents, and so on) in order to reach a conclusion they already believed. Then they started recruiting for world-domination schemes to stop it, without ever consulting the public, and in the face of sharp criticism from actual scientists. These are not the same.
I do think very intelligent machines could be dangerous for us, and there are serious scientists who agree, and there have been since before this movement came around, but this ain't it.
[deleted] t1_iyb7s5s wrote
[deleted]
-horses t1_iyb8g5u wrote
There are different kinds of modeling, which can be wrong in very different ways. If I tell you I have used a coarse-grained model to predict the weather, you should expect my predictions to be somewhat off. If I tell you there is an effectively realizable device with hypercomputational abilities, you should tell me straight up that I am wrong.
Also, I cannot emphasize enough that Yudkowsky was a 20-year-old poster with no formal training (no high school, no college, no professional coding experience) when he drafted the above scheme to supersede all governments on Earth by force.
[deleted] t1_iyb8ps9 wrote
[deleted]
[deleted] t1_iycuggx wrote
[deleted]
-horses t1_iycwy1e wrote
>I feel like our primary disagreement would not be on the facts but on whether it's justified to use the phrase Doomsday cultists for people who are flailing in response to a genuine threat. Is that fair?
Cultism is a social phenomenon with common patterns, and I think the shoe fits here, independent of the belief system. Once I lived with a follower of Adi Da, who believed the world must awaken to a new level of consciousness, which he generally described in terms of collective stewardship of the environment. I agreed with that, but I would call him a cult member, not because he took that belief to an extreme or didn't base it on sound science, but because he was a manipulable psychological type (a serial joiner of new movements in the 1960s) recruited by a prophet in order to proselytize a vision of the world in which a select few have the level of devotion required to play a decisive role in the fate of humanity. Similarly, I think the AGI people take advantage of anxiety-disordered young men, like I used to be.
[deleted] t1_iycxhtm wrote
[deleted]
sdmat t1_iyc3x8t wrote
> If I tell you there is an effectively realizable device with hypercomputational abilities, you should tell me straight up that I am wrong.
We should tell you that extraordinary claims demand extraordinary evidence.
> Also, I cannot emphasize enough that Yudkowsky was a 20 year old poster with no formal training, high school, college, or professional coding experience, when he drafted the above scheme to supersede all governments on Earth by force.
So? Plenty of people who have made contributions to philosophy and science have been autodidacts with very weird ideas.
The wonderful thing about open discussion and the marketplace of ideas is that we are under no obligation to adopt crazy notions.
-horses t1_iycp4v2 wrote
>We should tell you that extraordinary claims demand extraordinary evidence.
Again, computational modeling is not the same as physical modeling and the same intuitions will not serve. If I gave you a box and told you it solved halting, and you observed it give correct output for each input you fed it until the day you died, you would have gathered no evidence it was a hypercomputer. Finite beings like us are not capable of making the observations one would need to show the claim that way. On the other hand, we are capable of proving logically that no finite box can do this.
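To make that concrete, here is a toy sketch (purely illustrative; it models a "program" as a Python generator that yields once per step):

def fake_halting_oracle(program, step_bound=10**9):
    # A perfectly ordinary computable impostor: step the program up to a
    # huge but finite bound and report what happened. Every finite set of
    # observations a mortal observer can collect is consistent with some
    # machine of this form, so correct answers alone are never evidence
    # of genuine hypercomputation.
    for _ in range(step_bound):
        try:
            next(program)
        except StopIteration:
            return True    # it halted within the bound
    return False           # reported as "never halts"; really just "not yet"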
sdmat t1_iycv8zl wrote
I agree that a hypercomputer is almost certainly impossible and it would be difficult to prove.
But your standard of proof is absurd - do we only accept a computer as correct on demonstrating correctness for every operation on every possible state?
No, we look inside the box. We verify the principles of operation, translation to physical embodiment, and test with limited examples. The first computer was verified without computer assistance.
You might object that this is a false analogy because the computational model is different for a hypercomputer. But if we verify the operation of a plausible embodiment of hypercomputation on a set of inputs that we happen to know the answer for, that does tell us something. If the specific calculation we can validate is the same in kind as the calculations we can't validate, a mere difference in input values to which the same mechanism of calculation applies, then it is a form of proof. In the same sense we prove the correctness of 64 bit arithmetic units despite not being able to test every input.
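As a toy version of the arithmetic-unit example (the gate-level adder and the sampling scheme here are only illustrative):

import random

def ripple_carry_add64(a, b):
    # Gate-level model of the device under test: 64 chained one-bit full adders.
    result, carry = 0, 0
    for i in range(64):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (x & carry) | (y & carry)
    return result

def spot_check(trials=10_000):
    # Exhaustively testing all 2^128 input pairs is impossible, but the same
    # carry mechanism handles every pair, so edge cases plus random samples
    # (checked against Python's big-integer addition) are accepted as verification.
    mask = (1 << 64) - 1
    edge = [0, 1, 1 << 63, mask >> 1, mask]
    cases = [(a, b) for a in edge for b in edge]
    cases += [(random.getrandbits(64), random.getrandbits(64)) for _ in range(trials)]
    return all(ripple_carry_add64(a, b) == ((a + b) & mask) for a, b in cases)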
What those principles of operation and plausible embodiment might look like, no idea. As I said it's probably impossible. But you would need to actually prove it to be impossible to completely dismiss the notion.
-horses t1_iycz7ys wrote
Hypercomputation is an extremely extraordinary and extremely specific claim.
>But if we verify the operation of a plausible embodiment of hypercomputation on a set of inputs that we happen to know the answer for, that does tell us something. If the specific calculation we can validate is the same in kind as the calculations we can't validate, a mere difference in input values to which the same mechanism of calculation applies, then it is a form of proof.
Here's a machine that produces the number of steps for each of the known 2-symbol Busy Beaver machines, and then keeps giving answers using the same mechanism.
def bb2_steps(n):
    answers = [1, 6, 14, 107]   # the known values
    return answers[n % 4]       # then just keeps cycling through them
Hypercomputation confirmed? (Note: we could easily change the last line so further outputs were monotonically increasing and larger than the step number for the current candidate for BB_2(5), while keeping correctness on the first four. Imagine an infinite series of such machines, each more cleverly obfuscatory than the last; they exist.)
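For instance, one such variant (the growth rule here is arbitrary, chosen only to be increasing and enormous):

def bb2_steps_obfuscated(n):
    answers = [1, 6, 14, 107]
    if n < 4:
        return answers[n]    # still matches every case anyone can verify
    return 10 ** (n * n)     # strictly increasing, comfortably above any BB_2(5) candidate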
>The first computer was verified without computer assistance.
The first computer was verified by human computers with greater computational power than it had.
edit: And rewinding a bit, the original claim was that there's an effectively realizable device, that is, one which can be implemented and whose implementation can be accurately described with finite time, space, and description length, i.e. by a TM (the usual sense of 'effective'). If this were the case, the TM could just simulate it, proving it was not a hypercomputer. This is the sense in which the claim is flat-out wrong, aside from the difficulty of trying to evaluate it with 'evidence'.
sdmat t1_iyf7dck wrote
> Hypercomputation confirmed? (Note: we could easily change the last line so further outputs were monotonically increasing and larger than the step number for the current candidate for BB_2(5), while keeping correctness on the first four. Imagine an infinite series such machines, each more cleverly obfuscatory than the last; they exist.)
No, because there is no plausible computational principle giving the answer to the general Busy Beaver problem embodied in that system. Notably, it's a Turing machine.
An inductive proof needs to establish that the inductive step is valid - that there is a path from the base case to the result, even if we can't enumerate the discrete steps we would take to get there.
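That is, the standard schema, written out just as a reminder:

$\bigl(P(0) \land \forall n\,[P(n) \Rightarrow P(n+1)]\bigr) \Rightarrow \forall n\, P(n)$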
By analogy proof of hypercomputation would need to establish that the mechanism of hypercomputation works for verifiable examples and that this same mechanism extends to examples we can't directly verify.
Of course this makes unicorn taxonomy look down to earth and likely.
> edit: And rewinding a bit, the original claim was that there's an effectively realizable device, that is, one which can be implemented, and whose implementation can be accurately described with finite time, space, and description length, ie by a TM, the usual sense of 'effective'. If this were the case, the TM could just simulate it, proving it was not a hypercomputer. This is the sense in which the claim is flat-out wrong, aside from the difficulty of trying to evaluate it with 'evidence'.
That's a great argument if the universe is Turing-equivalent. That may be the case, but how to prove it?
If the universe isn't Turing-equivalent then it's conceivable that we might be able to set up a hypercalculation supported by some currently unknown physical quirk. Doing so would not necessarily involve infinite dimensions - you are deriving those from the behavior of Turing machines.
An example non-Turing universe is one where real numbers are physical, i.e. it is fundamentally non-discretizable. I have no idea if that would be sufficient to allow hypercomputation, but it breaks the TM isomorphism.
-horses t1_iyf9a66 wrote
>No, because there is no plausible computational principle giving the answer to the Busy Beaver problem embodied in that system. Notably, it's a Turing machine.
Yes, there is no plausible computational principle giving the answer to the Busy Beaver problem in any system, because it is not computable. The point was that you can't trust a machine simply because it produces the known answers correctly and keeps going the same way.
>Doing so would not necessarily involve infinite dimensions - you are deriving those from the behavior of Turing machines.
You need an infinite resource available to get anywhere past finite automata; you just don't have to actually use infinite resources until you get past TMs. Non-automatic models of computation aren't relevant to the measurable behavior of dynamical systems in the real world.
>That's a great argument if the universe is Turing-equivalent. That may be the case, but how to prove it?
No, it isn't. It's an observation that 'effective' has a standard definition which precludes hypercomputation. Any effective computation is simulable by a Turing machine; that's not the physical Church-Turing thesis, it's the vanilla version. (edit: and the reason I put the word in there originally is that any AGI implemented with computers would be in that boat, while many models AGI theorists prefer would not be, but are intended to represent real-world systems that would. Thus, they are often claiming to have effective means to non-effective ends.)
>An example non-Turing universe is one where Real numbers are physical, I.e. it is fundamentally non-discretizable. I have no idea if that would be sufficient to allow hypercomputation, but it breaks the TM isomorphism.
This is an example of falling back on infinite information in finite space. If space is continuous, it contains all the uncomputable reals. If you doubt this requires infinite information, consider that these include the incompressible strings of infinite length. A system moving through such a space would infinitely often adopt states that require infinite information to describe. It still wouldn't allow us to show any hypercomputation, though; our ability to observe and communicate remains finite, and all finite observations are explicable by finite machines, well within computability.
linearmodality t1_iyb0ojk wrote
An idea that is actually sound generally does not need to bolster its credibility by dubbing itself a "thesis" or by using unrelated technobabble (the notion of orthogonality here is nonsense: there's no objectively defined inner product space we're working in).
Of course, also the orthogonality thesis was not invented by lesswrong and lesswrong does a pretty poor job of representing Bostrom's work. So there are multiple issues here.
[deleted] t1_iyb0wd5 wrote
[deleted]
linearmodality t1_iyb29f2 wrote
And not everyone who participates in LessWrong is Eliezer Yudkowsky or Stuart Armstrong (even they themselves lack the general intellectual coherence of Bostrom). But even if everyone on LessWrong were Nick Bostrom himself, the core problem remains: the "orthogonality thesis" is fundamentally flawed. (It hides these flaws by being purposefully vague about how "goals" and "intelligence" are mapped to vectors and what the inner product space is. If you try to nail these things down the statement either becomes false, vacuous, or trivial.)
[deleted] t1_iyb4i3i wrote
[deleted]
linearmodality t1_iyb8fij wrote
Well, intelligence is correlated with willingness to do what people want. This is very straightforward to observe in natural intelligences. The most intelligent beings (adult humans) are the most willing to do what people want. This is also presently true for existing AI agents, if being "willing" even makes sense for such agents: the ones that possess better problem-solving abilities are more "willing" (because they are more able) to do what people want. This is so clearly the case that I suspect you mean something other than "correlated" here.
>It's fucking hard to specify what we want AI systems to do in ways that avoid undesirable side effects. Everyone agrees on this with respect to current AI. The only remaining question is whether we should expect it to become easier or harder to control machine intelligences as they become more sophisticated.
Well, that's the wrong question. Yes, it's hard to specify what we want a system to do in a way that avoids side effects. However, this hardness is a property of the specification, not of the learned model itself. It doesn't get harder or easier as the model becomes more accurate, because it's independent of the model.
>Do you personally, really and honestly, believe that it's so obvious that control will get easier as intelligence gets greater
Certainly it will get easier to produce specifications of what we want an AI system to do in a way that avoids undesirable side effects because we can get a sufficiently intelligent AI to write the specifications—and furnish us with a proof of safety (that the specification will guarantee that we avoid the undesirable side effects). "Control" is a more general word, though, and you'll have to nail down exactly what it means before we can evaluate whether we should expect it will get easier or harder over time.
>that you'd label people who worry otherwise as cultists?
Oh, LessWrongers aren't cultists because they worry otherwise. There are lots of perfectly reasonable non-cultists who worry otherwise, like Stuart Russell.
[deleted] t1_iyb99qw wrote
[deleted]
linearmodality t1_iybbrz3 wrote
> There's an extremely obvious restricted range problem here
Then you're not talking about actual correlation over the distribution of actually extant intelligent agents, but rather about something else. In which case: what are you talking about?
>This is literally the Orthogonality Thesis stated in plain English.
Well, no. The orthogonality thesis asserts, roughly, that an AI agent's intelligence and goals are somehow orthogonal. Here, we're talking about an AI agent's intelligence and the difficulty of producing a specification for a given task that avoids undesirable side effects. "Goals" and "the difficulty of producing a specification" are hardly the same thing.
>I don't think that this solution will work.
This sort of approach is already working. On the one side we have tools like prompt engineering that automatically develop specifications of what an AI system should do, for things like zero-shot learning. On the other side we have robust control results which guarantee that undesirable outcomes are avoided, even when a learned agent is used as part of the controller. There's no reason to think that improvements in this space won't continue.
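As a cartoon of the control side (the names and the safety predicate are made up; actual robust-control results are formal and much stronger than this sketch):

def shielded_step(agent_policy, state, is_safe, fallback_action):
    # Run the learned agent, but only execute its proposal if a separately
    # verified safety predicate approves it; otherwise take a known-safe
    # fallback. The guarantee lives in the shield, not in the learned model,
    # so it does not erode as the model gets more capable.
    proposed = agent_policy(state)
    return proposed if is_safe(state, proposed) else fallback_action(state)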
Even if they don't, the problem of producing task specifications does not get worse with AI intelligence (because, as we've already seen, the difficulty of producing a specification is independent of the model), which is fundamentally inconsistent with the LessWrongist viewpoint.
[deleted] t1_iybd2qu wrote
[deleted]
sdmat t1_iyc8ab6 wrote
> The problem of producing task specifications does not get worse with AI intelligence (because as we've already seen, the difficulty of producing a specification is independent) which is fundamentally inconsistent with the LessWrongist viewpoint.
I think the LW viewpoint is that for the correctness of a task specification to be genuinely independent of the AI, it is necessary to include preferences that cover the effects of all possible ways to execute the task.
The claim is that for our present AIs we don't need to be anywhere near this specific only because they can't do very much - we can accurately predict the general range of possible actions and the kinds of side effects they might cause in executing the task, so we only need to worry about whether we get useful results.
Your view is that this is refuted by the existence of approaches that generate a task specification and check execution against the specification. I don't see how that follows - the LW concern is precisely that this kind of ad-hoc understanding of what we actually mean by the original request is only safe for today's less capable systems.
ThePerson654321 t1_iyanvoe wrote
Oh boy here we go again...