Comments

GFrings t1_iz4r2cr wrote

You could spin up a local instance of CVAT, which is a FOSS labelling tool that has a ton of features, and put whatever you want through the pipeline.

16

tacixat t1_iz5se4t wrote

Likely no commercial ones, since payment processors make it difficult to take payments for porn. There are a ton you could self-host though.

2

Craksy t1_iz7b0ol wrote

Yeah, I would also be willing to watch po... ehm, label data for you.
In the name of science.

Seriously though, I wonder how the NSFW filters on SD etc. work... I wonder how much hand-labeled adult material was gathered before it could mostly be automated.

I'm kind of amused by the idea of researchers sitting in their office, all sciency, browsing pornhub and taking notes.

8

TiredOldCrow t1_iz865hz wrote

Mechanical Turk actually allows this. There's a special "Adult Content" Qualification.

1

numorate t1_iz8ipe0 wrote

NSFW means anything from Victoria's Secret ads to Documenting Reality message boards.

Remote contractors on fiverr/upwork etc. can usually do regular porno but if you're working with CP or something you're going to need to hire on site staff.

2

beezlebub33 t1_iz9h7q1 wrote

Before you go through this work, you know that there are existing datasets, right?

See: https://github.com/EBazarov/nsfw_data_source_urls and https://github.com/alex000kim/nsfw_data_scraper for example.

If you want to train an NSFW classifier, use the existing sets first. And start from a pre-trained ImageNet classifier and fine-tune it. This will get you 90+% of the way there. It would make sense for you to have your own testing set to make sure that it works for your use case (CVAT or VoTT work fine), but goodness, don't start from scratch.
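
To make the "own testing set" point concrete, here is a minimal sketch of scoring a classifier against a small hand-labelled set. Everything in it is an assumption for illustration: the `evaluate` helper, the `"nsfw"`/`"sfw"` label names, and the toy labels and predictions are made up, and in practice the predictions would come from whatever fine-tuned model you end up with.

```python
from collections import Counter


def evaluate(labels, predictions, positive="nsfw"):
    """Return precision, recall, and accuracy for one positive class.

    `labels` are the hand-assigned ground-truth tags (e.g. exported from a
    labelling tool); `predictions` are the classifier's outputs, in the
    same order.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    counts = Counter()
    for truth, pred in zip(labels, predictions):
        if pred == positive and truth == positive:
            counts["tp"] += 1  # correctly flagged
        elif pred == positive:
            counts["fp"] += 1  # flagged but actually safe
        elif truth == positive:
            counts["fn"] += 1  # missed by the classifier
        else:
            counts["tn"] += 1  # correctly passed

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(labels) if labels else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical hand-labelled test set and model outputs (made-up values).
labels = ["nsfw", "nsfw", "sfw", "sfw", "nsfw", "sfw"]
predictions = ["nsfw", "sfw", "sfw", "nsfw", "nsfw", "sfw"]

metrics = evaluate(labels, predictions)
print(metrics)
```

For a filter, recall on the NSFW class is usually the number to watch, since a missed image tends to cost more than a false alarm. Which trade-off is right depends on your use case.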

3