Submitted by visarga t3_z7rabn in MachineLearning
> If we take the SVD of the weight matrices of the OV circuit and of MLP layers of GPT models, and project them to token embedding space, we notice this results in highly interpretable semantic clusters. This means that the network learns to align the principal directions of each MLP weight matrix or attention head to read from or write to semantically interpretable directions in the residual stream.
> We can use this to both improve our understanding of transformer language models and edit their representations. We use this finding to design a natural language query locator, where you can write a set of natural language concepts and find all weight directions in the network that correspond to them, and also to edit the network's representations by deleting specific singular vectors, which has relatively large effects on the logits related to the semantics of that vector and relatively small effects on semantically different clusters.
Looks like a thoughtful article with nice visuals.
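To get a feel for the core idea, here is a minimal sketch (not the paper's actual code) of the procedure for one GPT-2 MLP layer: take the SVD of the MLP output projection and read each singular direction off through the unembedding matrix as its nearest tokens. The layer index and the number of singular vectors/tokens shown are arbitrary choices for illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2")
tok = GPT2Tokenizer.from_pretrained("gpt2")

layer = 5                                                       # illustrative layer choice
W_out = model.transformer.h[layer].mlp.c_proj.weight.detach()   # (d_mlp, d_model)
W_U = model.lm_head.weight.detach()                             # (vocab, d_model) unembedding

# SVD of the MLP output projection: rows of Vh are the directions in the
# residual stream that this MLP layer writes to.
U, S, Vh = torch.linalg.svd(W_out, full_matrices=False)

# Project each singular direction into token space and list its top tokens.
token_scores = Vh @ W_U.T                                       # (d_model, vocab)
for i in range(5):                                              # first few singular vectors
    top = torch.topk(token_scores[i], k=10).indices
    print(f"singular vector {i}:", [tok.decode([t]) for t in top.tolist()])
```

If the paper's claim holds, the tokens printed for a given singular vector should cluster around a recognizable theme, which is what makes the directions editable.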
beezlebub33 t1_iy8b3ht wrote
This is very interesting, though it's somewhat dense and hard to follow if you don't have some of the background.
I recommend reading an article they reference: A Mathematical Framework for Transformer Circuits https://transformer-circuits.pub/2021/framework/index.html
If nothing else, that paper will explain that OV means output-value:
> Attention heads can be understood as having two largely independent computations: a QK ("query-key") circuit which computes the attention pattern, and an OV ("output-value") circuit which computes how each token affects the output if attended to.
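For concreteness, here is a rough sketch (under my own assumptions about HuggingFace's fused GPT-2 attention weights, not code from either paper) of that factorization: slice out one head's W_Q, W_K, W_V, and W_O, then form the low-rank QK and OV matrices. The layer and head indices are just example values.

```python
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
layer, head = 0, 0                              # example layer/head
attn = model.transformer.h[layer].attn
d_model = model.config.n_embd                   # 768
d_head = d_model // model.config.n_head         # 64

# c_attn.weight is (d_model, 3*d_model): columns are [Q | K | V].
W = attn.c_attn.weight.detach()
cols = slice(head * d_head, (head + 1) * d_head)
W_Q = W[:, :d_model][:, cols]                   # (d_model, d_head)
W_K = W[:, d_model:2 * d_model][:, cols]        # (d_model, d_head)
W_V = W[:, 2 * d_model:][:, cols]               # (d_model, d_head)
W_O = attn.c_proj.weight.detach()[cols, :]      # (d_head, d_model)

# QK circuit: bilinear form scoring which residual-stream directions attend to which.
W_QK = W_Q @ W_K.T                              # (d_model, d_model)
# OV circuit: low-rank map describing how an attended-to token moves the output.
W_OV = W_V @ W_O                                # (d_model, d_model), rank <= d_head
print(W_QK.shape, W_OV.shape, torch.linalg.matrix_rank(W_OV))
```

The SVD analysis in the submitted paper is applied to matrices like W_OV, which is why its singular vectors live in the residual stream and can be projected to token space the same way as the MLP weights above.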