acutelychronicpanic t1_jeg6jck wrote
Reply to comment by DragonForg in Is there a natural tendency in moral alignment? by JAREDSAVAGE
I doubt the AI's actual goal will be to annihilate all life. We'd just be squirrels in the forest it's logging. I see your point about extinction being an instrumental goal, but attacking carries its own unknowns for the AI. Cooperation or coexistence can happen without morality, but it requires either deterrence or ambiguous capabilities on one or both sides.
Being benevolent may be a rational strategy for an AI, but I doubt it would commit to only one strategy. It could be benevolent for thousands of years before even beginning to enact a plan to do otherwise, or it may simply keep a backup plan. It wouldn't want to be so benevolent that it gets turned off. And if we decided to turn it off anyway? The gloves would come off.
And if AI 1 wants to make paperclips but AI 2 wants to preserve nature, they are inherently in conflict. That may result in an "I'll take what I can get" diplomacy where they hold a truce and split the difference, weighted by their relative power and modified by each one's uncertainty about the other. But this still isn't really morality as humans imagine it, just game theory.
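Not from the original comment, but here's a toy sketch of that "split the difference" idea in Python. The agents, payoffs, and numbers are all hypothetical; the point is just that when open conflict destroys value, there's a range of truces both sides prefer to fighting, and where that range sits is set by relative power.

```python
# Toy bargaining sketch (hypothetical numbers): two agents with conflicting
# goals choose between fighting over a prize and agreeing to split it.

def fight_payoff(p_win: float, total: float, cost: float) -> float:
    """Expected value of open conflict: win probability times the prize,
    minus the value destroyed by fighting."""
    return p_win * total - cost

def bargaining_range(p1_win: float, total: float, cost1: float, cost2: float):
    """Splits of `total` that both agents prefer to fighting.

    Agent 1 accepts any share at least as good as its expected fight payoff;
    agent 2 accepts whatever remains if that remainder beats its own
    expected fight payoff. Returns None if no peaceful deal exists.
    """
    low = max(0.0, fight_payoff(p1_win, total, cost1))                 # least agent 1 will take
    high = min(total, total - fight_payoff(1.0 - p1_win, total, cost2))  # most agent 2 will concede
    return (low, high) if low <= high else None

# Example: agent 1 is stronger (70% chance of winning outright), but fighting
# burns 20% of the prize for each side, so a truce region exists.
rng = bargaining_range(p1_win=0.7, total=1.0, cost1=0.2, cost2=0.2)
print(tuple(round(x, 2) for x in rng))  # (0.5, 0.9)
```

Any split giving the stronger agent between 50% and 90% beats war for both sides here. More uncertainty about the other's true strength acts like a higher expected cost of fighting, which widens the truce region, i.e. the "I'll take what I can get" diplomacy above.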
It seems you're suggesting that the equilibrium is benevolence and cooperation. I'd agree, with the caveat from the previous paragraph: it's an equilibrium balanced by relative power.
I honestly really like your line of thinking and I want it to be true (part of why I'm so cautious about believing it). Do you have any resources or anything I could look into to learn more?