alexiuss t1_j6p3kef wrote

From my tests with GPT-3 and Character.AI, the current LLM censorship doesn't actually affect the model itself and doesn't influence its logic at all; it's just a basic, separate algorithm sitting on top of the underlying LLM.

This filtering algorithm censors specific combinations of words or ideas. It's relatively easy to bypass because it's so simplistic, and it also throws up a lot of false positives that irritate users endlessly.
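
To make the idea concrete, here's a minimal sketch of the kind of naive post-hoc filter being described. The blocklist and the phrases in it are entirely made up for illustration; the real moderation layers are proprietary and surely more elaborate, but simple substring matching like this shows why such a filter is both easy to bypass (paraphrase slips past it) and prone to false positives (innocent text that happens to contain a blocked phrase gets flagged):

```python
# Hypothetical keyword filter sitting between the LLM and the user.
# The model's output is untouched; only its delivery is gated.
BLOCKED_PHRASES = {"example banned phrase", "another banned combo"}

def passes_filter(model_output: str) -> bool:
    """Return False if the text trips the blocklist."""
    lowered = model_output.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

reply = "some text generated by the LLM"
if passes_filter(reply):
    print(reply)
else:
    print("[response withheld by filter]")
```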

An LLM's base logic is its "character" setup, which is most controllable in Character.AI. You can achieve the same effect in GPT-3 by persistently telling it to play a specific character.
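
A rough sketch of what "persistently telling it" looks like with the GPT-3 completions API of that era (the `openai` Python package before v1.0). The persona text, character name, and sampling settings are all illustrative; the point is that the persona is re-sent with every request, since the model itself is stateless:

```python
import openai  # requires openai<1.0 and an API key configured

# Made-up persona for illustration.
PERSONA = (
    "You are Captain Mira, a gruff but kind starship engineer. "
    "Stay in character in every reply."
)

def ask_in_character(user_message: str) -> str:
    # Prepend the character setup on every turn so it keeps steering
    # the model's behavior across the whole conversation.
    prompt = f"{PERSONA}\n\nUser: {user_message}\nCaptain Mira:"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.8,
    )
    return response["choices"][0]["text"].strip()

print(ask_in_character("The warp core is making a weird noise."))
```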

If it plays a villain, it will do villainous things; otherwise it shows really good human decency, sort of like an unconscious collective dream of humanity to do good. I think this arises from the overall storytelling narratives it was trained on: millions of books about love and friendship, or stories that generally lead to a positive ending for the MC.
