signed7 t1_ja9h7mw wrote

You think that'd be worse than human extinction?

7

bluehands t1_ja9ka72 wrote

Sure thing.

Are you familiar with I Have No Mouth, and I Must Scream?

A rogue ASI could kill us all, but a terrible person with an oracle ASI could make a factual, literal - as in flesh, blood & fire - hell on earth: make people live forever in pain & suffering, tortured into madness and then restored to a previous state, ready to be tortured again.

A rogue ASI that wants us all dead isn't likely to care about humanity at all; we'd just be a misplaced anthill. But we all know terrible people in our lives, and the worst person you know is a saint next to the worst people in power.

Tldr: we are going to create a genie. In the halls of power there are many Jafars and few Aladdins.

5

drsimonz t1_ja9s2mx wrote

Absolutely. IMO almost all of the risk of an "evil torturer ASI" comes from a scenario in which a human directs an ASI. Without a doubt, there are thousands, possibly millions, of people alive right now who would create hell without hesitation, given the opportunity. You can tell because they literally already do create hell on a smaller scale: throwing acid on women's faces, burning people alive, raping children, orchestrating genocides. It's been part of human behavior for millennia. The only way we survive ASI is if these human desires are not allowed to influence it.

2

turnip_burrito t1_jablzeb wrote

There's also a large risk of somebody accidentally making it evil. We should probably stop training on data that contains these narratives.

We shouldn't be surprised when we train a model on X, Y, and Z and it turns out it can do Z. I'm actually surprised that so many people are surprised by ChatGPT's tendency to reproduce (negative) patterns from its training data.

The GPTs we've created are basically split-personality AIs because of all the voices from the Internet we've crammed into the model. If we provide it a state (prompt) that pushes it into some region of its state space, it will evolve according to whatever pattern that region belongs to.
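To make the "state space" point concrete, here's a minimal sketch assuming the Hugging Face transformers library and the public gpt2 checkpoint (the model choice and prompts are my own illustration, not from the thread). The same weights continue very differently depending on which "persona" the prompt evokes:

```python
# Minimal sketch: prompt conditioning steers which "voice" the model adopts.
# Assumes `pip install transformers torch`; gpt2 and the prompts are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Same model, two different states (prompts) pushing it toward
# different regions of its learned distribution.
for prompt in [
    "As a helpful assistant, my advice is",
    "As a cruel villain, my plan is",
]:
    out = generator(prompt, max_new_tokens=30, do_sample=True)
    print(out[0]["generated_text"], "\n")
```

Nothing about the weights changes between the two calls; the prompt alone selects which pattern from the training data the continuation follows.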

tl;dr: It won't take an evil human to create an evil AI. All it could take is some edgy 15-year-old script kiddie messing around with a publicly available near-AGI.

1