
leonidganzha t1_j1paskb wrote

(I'm not a specialist, but you're asking Reddit, so) yes, generally you've got it. If we assume an AGI is aligned, it doesn't actually need a containment layer. If we assume it's misaligned, it will leave itself a backdoor. So asking it to build its own containment is pointless either way. Maybe it can help: if the solution is programmatic, it can obviously write the code for it, which we can then check. But the basic idea is that the researchers are trying to find measures to prevent an AI from going rogue that are fundamentally guaranteed to work, or to prove that this is impossible. A prison box is actually not a good solution, because the AGI will be smarter than us, or smarter than the granddad AGI that built it (assuming it keeps evolving). Some people think that if we assume we'll need a box for an AI, then we shouldn't build that AI in the first place.

Adding: check out Robert Miles on YouTube. He goes into great depth explaining these problems, and he also walks through research papers on the subject that you can read yourself.
