Tobislu t1_irw5pz4 wrote on October 11, 2022 at 2:25 PM

Reply to comment by lefnire in what jobs will we have post singularity? by theferalturtle

Different AI aren't judging themselves

Do you find it odd that human beings police other human beings?

We're distinct, and capable of judging when behavior is outside our accepted norms. As long as the judging AI isn't just a nightly build based off of the AI it's judging, it should be as objective as we are.

lefnire t1_irwn38m wrote on October 11, 2022 at 4:23 PM

I want to preface by saying: I'm one of the most optimistic AI people you'll meet. I call their current work "creative". I think they'll be conscious; I think they already are. The sky's the limit. But when it comes to getting training data (their food), I wonder if it's a conceptual (not skill-based) impossibility. Think of humans farming, a necessary evil. We transcended the animal kingdom, but we need them for our "foundation" (sustenance). Only recently are we creating synthetic food (e.g. lab-grown meat), so maybe data-labelling isn't impossible in the final analysis. Or maybe AI transcends supervised learning to unsupervised / semi-supervised (the equivalent of us transcending calories). We'll see; I'm just spit-balling about what seems to be a chicken/egg issue, not a capacity issue.

I'll give you an example of the best we currently have with transfer learning (AI -> AI) in NLP. DistilBERT is a slimmed-down language model, a smaller equivalent of its more powerful counterpart in the BERT family (whichever model you choose as the one you're trying to approximate). The original "lossless" model is called the "teacher", and the new "lossy" model is called the "student". It's like zipping a model, basically. The way it works is the teacher is trained on a human-created dataset. It does its thing, and the student watches it in action (inference) and tries to learn the heuristics without learning the details.

But even the teacher needed human-created training data.
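To make the teacher/student idea concrete, here's a rough sketch of the core of a distillation loss. It is not DistilBERT's actual training recipe (real recipes combine this with other loss terms); the function names, the temperature value, and the example logits are all my own assumptions for illustration. The student is penalized by how far its softened output distribution is from the teacher's.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature = softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft predictions to the student's.

    Hypothetical helper for illustration: zero when the student matches
    the teacher exactly, positive otherwise.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (approximation)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits for one input: the teacher is fairly sure of class 0.
teacher = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]
bad_student = [0.5, 4.0, 1.0]

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, bad_student))
```

Note that no human labels appear anywhere in this loss; the student only sees the teacher's outputs. The human-created data is hidden one level up, in how the teacher got its logits.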

Closer to your "policing" analogy are Generative Adversarial Networks (GANs), one family of generative models. (DALL-E 2 and Stable Diffusion are actually diffusion models rather than GANs, but the training-data point below applies to them too.) A GAN uses a generator/discriminator setup, loosely like the actor/critic paradigm in reinforcement learning: one half of the model (the generator, think right hemisphere) creates the art, and the other half (the discriminator, think left hemisphere) judges it as legit or sloppy. More precisely, it tries to judge whether the art is real (human-created) or not (AI-created). So that's closer to your policing analogy. HOWEVER! Even here, the model fully required human art to train. In no way could it have bootstrapped even a little without the original dataset of human art. But now it can take its training wheels off, and away it goes.
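Here's a minimal sketch of the two objectives in that adversarial setup, assuming the standard binary cross-entropy formulation. The "actor" is usually called the generator and the "critic" the discriminator; the function names and the example probabilities are made up for illustration. The inputs are the critic's probabilities that a sample is real.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one prediction, clamped to avoid log(0)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """Critic's objective: call human-made samples real (1), generated ones fake (0).

    d_real: critic outputs on human-created samples.
    d_fake: critic outputs on generator-created samples.
    """
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Creator's objective: get the critic to call its samples real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Made-up critic outputs: confident on human art, skeptical of generated art.
print(discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2]))
print(generator_loss(d_fake=[0.1, 0.2]))
```

The thing to notice is the `d_real` argument: the critic's loss can't even be computed without a batch of human-created samples to compare against.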

Any way you spin AI->AI training, these things have names and they're not what you think. Actor/critic, distillation, transfer learning, zero-shot learning, few-shot learning, etc. The absolute closest to what you're implying is zero-shot learning, and the way that works is by taking a model trained in one domain, having it predict in a new domain (based on its seemingly irrelevant skill), making an analogy from the old domain to the new one, and using the result as training data. A common example: the new domain is classification (cat, dog, or tree). The old domain is next-word prediction (masked language modeling), e.g. "I like when my [MASK] purrs". Predict the mask for the current text, then use that as training data to train a classifier. But again... seeing the problem yet? The masked language model itself was trained on human-written text.
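A toy sketch of that masked-word trick, to show the shape of it. Big caveat: `toy_fill_mask` is a hard-coded stand-in I made up so the example runs anywhere; in practice that slot would be filled by a pretrained masked language model. `zero_shot_label` and the example sentences are hypothetical too.

```python
def toy_fill_mask(text):
    """Stand-in for a pretrained masked LM: returns {candidate word: score}.

    Hard-coded cue words, purely for illustration. A real model learned
    these associations from human-written text.
    """
    cues = {
        "purrs": {"cat": 0.90, "dog": 0.05, "tree": 0.01},
        "barks": {"dog": 0.85, "cat": 0.05, "tree": 0.02},
        "leaves": {"tree": 0.80, "cat": 0.02, "dog": 0.02},
    }
    for cue, scores in cues.items():
        if cue in text:
            return scores
    return {}

def zero_shot_label(text, labels, fill_mask):
    """Pick the label the masked LM finds most likely for the [MASK] slot."""
    scores = fill_mask(text)
    if not any(label in scores for label in labels):
        return None  # the model has no opinion on this text
    return max(labels, key=lambda label: scores.get(label, 0.0))

labels = ["cat", "dog", "tree"]
texts = [
    "I like when my [MASK] purrs",
    "My [MASK] barks at the mailman",
    "The [MASK] dropped its leaves",
]

# Pseudo-labelled dataset that could then train a separate classifier.
pseudo_labelled = [(t, zero_shot_label(t, labels, toy_fill_mask)) for t in texts]
print(pseudo_labelled)
```

No human labelled these three sentences, but every score in the lookup table stands for knowledge that came from human text in the first place.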

What we'd need is a full switch from supervised learning to reinforcement learning models running the show: models that learn in the world from their own experience, and that provide the training data for any supervised learning models left around.
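For contrast, here's about the smallest sketch I can think of where the learner makes its own training data: an epsilon-greedy agent on a multi-armed bandit. Everything here is an assumption for illustration (the function name, the fixed payouts, the parameters), and the rewards are deterministic to keep it simple. The only "label" is the reward the world hands back.

```python
import random

def learn_from_experience(payouts, steps=100, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: estimate each arm's value from its own pulls.

    payouts: fixed reward per arm (a toy, noise-free "world").
    Returns the value estimates and the log of (arm, reward) experience.
    """
    rng = random.Random(seed)
    counts = [0] * len(payouts)
    values = [0.0] * len(payouts)
    experience = []
    for t in range(steps):
        if t < len(payouts):
            arm = t  # try every arm once before trusting the estimates
        elif rng.random() < epsilon:
            arm = rng.randrange(len(payouts))  # explore
        else:
            arm = max(range(len(payouts)), key=lambda a: values[a])  # exploit
        reward = payouts[arm]  # feedback comes from the world, not a labeller
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
        experience.append((arm, reward))
    return values, experience

values, experience = learn_from_experience([0.1, 0.5, 0.9])
best_arm = max(range(len(values)), key=lambda a: values[a])
print(best_arm, len(experience))
```

The `experience` log is the interesting part: it's a dataset of (action, outcome) pairs that no human wrote down, which is the kind of thing that could feed a supervised model afterwards.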

Tobislu t1_irwo2fi wrote on October 11, 2022 at 4:29 PM

Honestly, I think you made me more confused 😅

lefnire t1_irwv344 wrote on October 11, 2022 at 5:14 PM

My apologies. I don't like it when people use word-salad to strong-arm a debate; I just meant to explain the situation from the trenches.