luffreezer
luffreezer t1_j9xts7d wrote
Reply to Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga
This is just a mirror of who gets the most hate speech.
It says more about human discourse than it says about the AI.
Edit: here is a short paragraph from the conclusion of the article that I think is important to keep in mind:
«It is also important to remark that most sources for the biases reported here are probably unintentional and likely organically emerging from complex entanglements of institutional corpora and societal biases. For that reason, I would expect similar biases in the content moderation filters of other big tech companies.»
luffreezer t1_iurc0vf wrote
Reply to Robots That Write Their Own Code by kegzilla
It's happening, boys!
luffreezer t1_j9yo44f wrote
Reply to comment by Baturinsky in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga
It is the whole internet that is like that. As I said, it is a reflection of our society:
You will never find people insulting "normal weighted people" or "people without a disability". So it is not surprising that the model does not perform well in those areas.
In the US, calling something "socialism" can itself be read as a criticism, so I am not surprised it flags more left-wing content than right-wing content.