Molnan t1_ja5zobe wrote

You say:

>CAIS also assumes people won’t build generalist agents to start with.

No, it doesn't. See, for instance, section 7: "Training agents in human-like environments can provide useful, bounded services":

>Training agents on ill-defined human tasks may seem to be in conflict with developing distinct services provided by agents with bounded goals. Perceptions of conflict, however, seem rooted in anthropomorphic intuitions regarding connections between human-like skills and human-like goal structures, and more fundamentally, between learning and competence. These considerations are important to untangle because human-like training is arguably necessary to the achievement of important goals in AI research and applications, including adaptive physical competencies and perhaps general intelligence itself. Although performing safely-bounded tasks by applying skills learned through loosely-supervised exploration appears tractable, human-like world-oriented learning nonetheless brings unique risks.

You say:

>if you don’t think a LLM can become dangerous you aren’t thinking hard enough.

Any AI can be dangerous depending on factors like its training data, architecture and usage context. That said, LLMs as currently understood have a well-defined way to produce and compare next-token candidates, and no intrinsic tendency to improve on this routine by gathering computing resources or pursuing similar instrumental goals. Simply adding more computing power and training data doesn't change that.
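To make "produce and compare next-token candidates" a bit more concrete, here's a toy sketch of greedy decoding. The vocabulary and scores are made up for illustration, and the function names are mine; a real LLM computes the scores with a neural network and usually samples instead of always taking the top candidate. The only point is that the selection step is a fixed routine, with nothing in it that reaches for more resources.

```python
import math

def softmax(logits):
    # Turn raw scores into probabilities (shifted by the max for numerical stability).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_next_token(logits, vocab):
    # Compare all candidates and return the most probable one (greedy decoding).
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]

# Hypothetical three-word vocabulary with invented scores.
vocab = ["cat", "dog", "car"]
logits = [2.0, 1.0, 0.1]
token, prob = pick_next_token(logits, vocab)
print(token, round(prob, 3))
```

Scaling this up means a bigger vocabulary and a bigger network producing the scores, but the compare-and-pick loop stays the same.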

Gato and similar systems are interesting, but at the end of the day the architecture behind useful real-world AIs like Tesla's Autopilot is more suggestive of CAIS than of Gato, and flexibility, adaptability and repurposing are achieved through good old abstraction and decoupling of subsystems.

The advantages of generalist agents are derived from transfer learning. But this is no panacea: for instance, the Gato paper admits it didn't offer much advantage when it comes to playing Atari games. It also has obvious costs and drawbacks. For one, the training process will tend to be longer, and when something goes wrong you may need to start over from scratch.

And I must say, if I'm trusting an AI to drive my car, I'd actually prefer it if this AI's training data did NOT include videogames like GTA or movies like, say, Death Proof or Christine. In general, for many potential applications it's reassuring to know that the AI simply doesn't know how to do certain things, and that's a competitive advantage in terms of popularity and adoption, regardless of performance.

You say:

>Narrow agents can also become dangerous on their own because of instrumental convergence

Yes, under some circumstances, and conversely, generalist agents can be safe as long as this pesky instrumental convergence and other dangerous traits are avoided.

There's a lot more to CAIS than "narrow good, generalist bad". In fact, many of Drexler's most compelling arguments have nothing to do with specialist vs. generalist AI. For instance, see section 6: "A system of AI services is not equivalent to a utility maximizing agent", or section 25: "Optimized advice need not be optimized to induce its acceptance".
