
taichi22 t1_j1ln84a wrote

That’s… a very interesting conjecture. Given that language models are essentially open-ended, enough negative bias in the training dataset could ultimately produce a model that acts in a destructive or subversive manner. See: Tay.

Unlikely, given that we will be tuning the models ourselves, but if we ever get to a point where models are tuning models, or if we train on unstructured datasets, that will definitely be something to guard against.
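As a toy illustration of the kind of guardrail I mean, here's a minimal sketch in plain Python of filtering an unstructured corpus before training. The names (`BLOCKLIST`, `toxicity_score`, `filter_corpus`) and the keyword heuristic are all hypothetical stand-ins; a real pipeline would use a trained toxicity classifier instead:

```python
# Toy sketch: filter an unstructured training corpus before it reaches the model.
# BLOCKLIST and the scoring heuristic are placeholders for a real classifier.

BLOCKLIST = {"slur_a", "slur_b", "threat_phrase"}  # hypothetical terms

def toxicity_score(text: str) -> float:
    """Crude stand-in for a classifier: fraction of blocklisted tokens."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in BLOCKLIST)
    return hits / len(tokens)

def filter_corpus(corpus: list[str], threshold: float = 0.01) -> list[str]:
    """Drop documents whose estimated toxicity exceeds the threshold."""
    return [doc for doc in corpus if toxicity_score(doc) <= threshold]

if __name__ == "__main__":
    corpus = ["a benign document", "a document containing slur_a repeatedly slur_a"]
    print(filter_corpus(corpus))  # keeps only the benign document
```

The point isn't the specific heuristic, it's that some filtering step has to exist between raw scraped data and the training run, otherwise you get Tay all over again.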
