
OutOfBananaException t1_jacw2ry wrote

Being aligned to humans may help, but a human-aligned AGI is hardly 'safe'. We can't imagine what it means to be aligned, given that we can't reach consensus among ourselves. If we can't define the problem, how can we hope to engineer a solution for it? Solutions driven by early AGI may be our best hope for favorable outcomes for later more advanced AGI.

If you gave a toddler the power to 'align' all adults to its desires, plus the authority to overrule any decision, would you expect a favorable outcome?

1

drsimonz t1_jae6cn3 wrote

> Solutions driven by early AGI may be our best hope for favorable outcomes for later more advanced AGI.

Exactly what I've been thinking. We might still have a chance to succeed given (A) a sufficiently slow takeoff (meaning AI doesn't explode from IQ 50 to IQ 10,000 in a month), and (B) a continuous process of integrating the state of the art, applying the best tech available to the control problem. To survive, we'd have to admit that we really don't know what's best for us, and that we don't know what to optimize for at all. Average quality of life? Minimum quality of life? Economic fairness? Even these seemingly simple concepts will prove almost impossible to quantify, and any one of them would almost certainly be a disaster if it were the only target.

Almost makes me wonder if the only safe goal to give an AGI is "make it look like we never invented AGI in the first place".

2