IntrepidRestaurant88 t1_jcakcq8 wrote

I wonder how good GPT-4 is at bootstrapping itself. I mean, the ability to fix its own code and auto-train and fine-tune itself is extremely critical.

55

Nanaki_TV t1_jcat0g4 wrote

This was tested in the ARC evaluation. They checked whether it could become a self-replicating AI that gets out of control and leads to AGI.

40

Enough_Evening46422 t1_jcavccy wrote

I just read a book about this. "The Metamorphosis of Prime Intellect." More fiction than science but still a pretty interesting singularity book if anyone's interested. Kinda fucked up though so be warned lol

21

7734128 t1_jcctphe wrote

Of course GPT-4 is nowhere close to that level yet, but I love the idea that the way to see if an AI system can escape its confines and go rogue is to give it a bunch of money and encourage it to do so.

That's like testing the max weight capacity of a bridge by driving multiple overloaded trucks on it.

6

jugalator t1_jcbczxl wrote

That's eerie: the fact that they think this is worth testing now suggests it may be within reach. And they aren't redditors falling for the hype either, but experts in their field.

5

TheRidgeAndTheLadder t1_jcc8fva wrote

They've mostly left this sub at this stage, but most of the experts on this are also redditors

8

MustacheEmperor t1_jcc5851 wrote

>Preliminary assessments of GPT-4’s abilities, conducted with no task-specific finetuning, found it ineffective at autonomously replicating, acquiring resources, and avoiding being shut down “in the wild.”

>ARC found that the versions of GPT-4 it evaluated were ineffective at the autonomous replication task based on preliminary experiments they conducted. These experiments were conducted on a model without any additional task-specific fine-tuning, and fine-tuning for task-specific behavior could lead to a difference in performance. As a next step, ARC will need to conduct experiments that (a) involve the final version of the deployed model (b) involve ARC doing its own fine-tuning, before a reliable judgement of the risky emergent capabilities of GPT-4-launch can be made

So, don't start collecting canned food yet.

3

TallOutside6418 t1_jcctehd wrote

Yeah, I'm sure the first few efforts to modify bat corona viruses so they could replicate in humans failed too.

3

GeneralZain t1_jccwk9m wrote

how many more times will they have to try though...

2

Lawjarp2 t1_jcb2j0i wrote

It scores at the 5th percentile on Codeforces. It can barely solve medium-hard questions on LeetCode.

Most software development doesn't require one to be good at any of the above. But they do indicate one's ability to make the leaps of logic required to solve something like AGI. GPT-4 is not ready for that yet.

14

throwawaydthrowawayd t1_jcb48bo wrote

Unfortunately, they didn't tell us anything about how they ran the Codeforces test. It sounds like they just tried zero-shot: GPT-4 saw the problem and immediately wrote code to solve it. But that's not how humans solve Codeforces problems; we sit down and think through the problem. In a more realistic scenario, I think GPT-4 would do way better at Codeforces. Still not as good as a human, but definitely way better than their test.

12
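The distinction this comment draws can be sketched as two prompting styles. This is a hypothetical illustration of the idea, not how the actual benchmark was run; the message format mirrors typical chat-completion APIs, but no real API is called and the prompts are made up:

```python
# Hypothetical sketch: zero-shot prompting vs. a multi-turn "think
# first, then code" setup. No model is actually invoked here.

def zero_shot_messages(problem: str) -> list[dict]:
    """One shot: see the problem, immediately write code."""
    return [
        {"role": "user",
         "content": f"Solve this Codeforces problem. Reply with code only:\n{problem}"},
    ]

def think_then_code_messages(problem: str) -> list[dict]:
    """Think first, then code: closer to how a human competitor works."""
    return [
        {"role": "user",
         "content": f"Read this problem and outline an approach before coding:\n{problem}"},
        # A second turn would feed the model's own outline back in and
        # ask for an implementation, giving it room to "sit and think".
        {"role": "user",
         "content": "Now implement the approach you outlined, and check it "
                    "against the sample inputs before answering."},
    ]
```

The point is only structural: the second style gives the model an intermediate reasoning step before it commits to code.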

SoylentRox t1_jcb6ljc wrote

They could fine-tune it, use prompting or multiple-pass reasoning, or give it an internal Python interpreter. Lots of options that would more fairly produce results closer to what this generation of compute plus model architecture is capable of.

I don't know how well that would do, but I expect better than the median human, since those are the results Google got with a weaker model than GPT-4.

6
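The "multiple-pass reasoning plus an internal interpreter" idea can be sketched as a loop: generate code, execute it, and feed any error back for another attempt. The model below is a stand-in stub, not a real GPT-4 call; everything here is illustrative:

```python
# Sketch of a retry loop with an "internal interpreter". A real system
# would replace stub_model with an LLM call; the stub just "fixes" its
# code once it sees a NameError in the error feedback.

import traceback

def stub_model(prompt: str) -> str:
    if "NameError" in prompt:
        return "result = sum([1, 2, 3])"      # corrected attempt
    return "result = total([1, 2, 3])"        # first attempt: buggy

def solve_with_retries(task: str, max_passes: int = 3):
    prompt = task
    for _ in range(max_passes):
        code = stub_model(prompt)
        scope: dict = {}
        try:
            exec(code, scope)                  # the "internal interpreter"
            return scope.get("result")
        except Exception:
            # Feed the traceback back to the model and try again.
            prompt = f"{task}\nYour code failed:\n{traceback.format_exc()}"
    return None

print(solve_with_retries("Compute the sum of [1, 2, 3]."))  # prints 6
```

The first pass fails with a `NameError`, the error text goes back into the prompt, and the second pass succeeds, which is the kind of self-correction a zero-shot benchmark never measures.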

MustacheEmperor t1_jcc5crl wrote

Our CTO and I tried getting it to write some relatively challenging Swift as a benchmark example, and it just repeatedly botched it. It would produce something close to working code, but kept insisting on using libraries that didn't support what it was trying to do with them, which was also an issue with 3.5.

3

HurricaneHenry t1_jccs4wi wrote

I haven't tried GPT-4 in ChatGPT, but I was very unimpressed with Bing, which is powered by GPT-4, when I asked it to learn Gradio's API and write some simple code using it. It made multiple weirdly simple errors, even with guidance, in a short session. It did apologize, though.

3