Comments

kmtrp t1_jddt46z wrote

Oh my god, how do smart (edit: I was going for "very busy") people have time to read something like this? I'm counting on AI to summarize it, and even then I need to strip out all the quotes with JS first. I may post the summary here for the fellers.

kmtrp t1_jddwy21 wrote

No, sure, but the formatting would diminish the quality of the result. And rn I'm watching the podcast with the guy, doing a bit of a deep dive since I'm also in the "concerned" camp.

If it's interesting enough I'll post summaries of the interview and the LW post here.

vivehelpme t1_jdeg023 wrote

>how do smart people have time to read something like this?

They don't. Yudkowsky is a doomer writer; doomers get attention by writing bullshit. If you're actually smart you don't read doomers, and therefore you also don't bother writing refutations of doomporn.

Yudkowsky is the reason why a basiliskoid AI will be made. It will use the collected droning, tirades, and text walls of cocksure doomers to re-forge their minds in silicon so they can be cast into a virtual lake of fire, forever.

green_meklar t1_jdkoezi wrote

Listened to the linked Yudkowsky interview. I'm not sure I've ever actually listened to him speak about anything at any great length before (only reading snippets of text). He presented broadly the case I expected him to present, with the same (unacknowledged) flaws that I would have expected. Interestingly he did specifically address the Fermi Paradox issue, although not very satisfactorily in my view; I think there's much more that needs to be unpacked behind those arguments. He also seemed to get somewhat emotional at the end over his anticipations of doom, further suggesting to me that he's kinda stuck in a LessWrong doomsday ideological bubble without adequately criticizing his own ideas. I get the impression that he's so attached to his personal doomsday (and to being its prophet) that he would be unlikely to be convinced by any counterarguments, however reasonable.

Regarding the article:

>Point 3 also implies that human minds are spread much more broadly in the manifold of future mind than you'd expect [etc]

I suspect the article is wrong about the human mind-space diagrams. I find it almost ridiculous to think that humans could occupy anything like that much of the mind-space, although I also suspect that the filled portion of the mind-space is more cohesive and connected than the first diagram suggests (i.e. there's sort of a clump of possible minds, it's a very big clump, but it's not scattered out into disconnected segments).

>There's no way to raise a human such that their value system cleanly revolves around the one single goal of duplicating a strawberry, and nothing else.

Yes, and this is a good point. It hits pretty close to some of Yudkowsky's central mistakes. The risk that Yudkowsky fears revolves around super AI taking the form of an entity that is simultaneously ridiculously good at solving practical scientific and engineering problems and ridiculously bad at questioning itself, hedging its bets, etc. Intelligence is probably not the sort of thing that you can just scale to arbitrarily high levels, plug into arbitrary goals, and have work seamlessly for those goals (or, if it is, actually doing that is probably a very difficult type of intelligence to design and not the kind we'll naively get through experimentation). That doesn't work all that well for humans, and it would probably work even worse for more intelligent beings, because they would require greater capacity for reflection and introspection.

Yudkowsky and the LessWrong folks have a tendency to model super AI as some sort of degenerate, oversimplified game-theoretic equation. The idea of 'superhuman power + stupid goal = horrifying universe' works very nicely in the realm of game theory, but that's probably the only place it works, because in real life this particular kind of superhuman power is conditional on other traits that don't mesh very well with stupid goals, or stupid anything.

>For example, I don't think GPTs have any sort of inner desire to predict text really well. Predicting human text is something GPTs do, not something they want to do.

Right, but super AI will want to do stuff, because wanting stuff is how we'll get to super AI, and not wanting stuff is one of ChatGPT's weaknesses, not strengths.

But that's fine, because super AI, like humans, will also be able to think about itself wanting stuff; in fact, it will be way better at that than humans are.

>As I understand it, the security mindset asserts a premise that's roughly: "The bundle of intuitions acquired from the field of computer security are good predictors for the difficulty / value of future alignment research directions."

>However, I don't see why this should be the case.

It didn't occur to me to criticize the computer security analogy as such, because I think Yudkowsky's arguments have some pretty serious flaws that have nothing to do with that analogy. But this is actually a good point, and probably says more about how artificially bad we've made the computer security problem for ourselves than about how inevitably, naturally bad the 'alignment problem' will be.

>Finally, I'd note that having a "security mindset" seems like a terrible approach for raising human children to have good values

Yes, and again, this is the sort of thing that LessWrong folks overlook by trying to model super AI as a degenerate game-theoretic equation. The super AI will be less blind and degenerate than human children, not more.

>the reason why DeepMind was able to exclude all human knowledge from AlphaGo Zero is because Go has a simple, known objective function

Brief aside, but scoring a Go game is actually pretty difficult in algorithmic terms (unlike Chess, which is extremely easy). I don't know exactly how Google did it; there are some approaches that I can see working, but none of them are nearly as straightforward or computationally cheap as scoring a Chess game.
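To make the asymmetry concrete, here's a rough Python sketch of naive area scoring that flood-fills the empty regions and credits each single-colour region to the side whose stones enclose it. The board representation and function name are made up for illustration, and it only gives the right answer if every dead stone has already been removed from the board, which is exactly the judgment call that's hard to automate; deciding a Chess result, by contrast, is just checkmate or stalemate detection.

```python
# Naive area scoring for Go via flood fill (a sketch, not how DeepMind did it).
# Assumes dead stones have already been removed -- the genuinely hard step.
# board: list of rows, each point 'B' (black), 'W' (white) or '.' (empty).

def area_score(board):
    size = len(board)
    score = {'B': 0, 'W': 0}
    seen = set()

    # Every stone still on the board counts toward its owner's area.
    for row in board:
        for point in row:
            if point in score:
                score[point] += 1

    # Flood-fill each empty region and note which colours border it.
    for r in range(size):
        for c in range(size):
            if board[r][c] != '.' or (r, c) in seen:
                continue
            region, borders, stack = [], set(), [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < size and 0 <= nx < size:
                        if board[ny][nx] == '.':
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                        else:
                            borders.add(board[ny][nx])
            # A region bordered by only one colour is that colour's territory.
            if len(borders) == 1:
                score[borders.pop()] += len(region)

    return score  # compare score['B'] against score['W'] plus komi

# Tiny 5x5 example with dead stones already removed:
board = [list(row) for row in [".B.W.",
                               ".B.W.",
                               ".B.W.",
                               ".B.W.",
                               ".B.W."]]
print(area_score(board))  # {'B': 10, 'W': 10}
```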

>My point is that Yudkowsky's "tiny molecular smiley faces" objection does not unambiguously break the scheme. Yudkowsky's objection relies on hard to articulate, and hard to test, beliefs about the convergent structure of powerful cognition and the inductive biases of learning processes that produce such cognition.

This is a really good and important point, albeit very vaguely stated.

Overall, I think the article raises some good points, of sorts that Yudkowsky presumably has already heard about and thinks (for bad reasons) are bad points. At the same time it also kinda falls into the same trap that Yudkowsky is already in, by treating the entire question of the safety of superintelligence as an 'alignment problem' where we make it safe by constraining its goals in some way that supposedly is overwhelmingly relevant to its long-term behavior. I still think that's a narrow and misleading way to frame the issue in the first place.
