Submitted by von-hust t3_11jyrfj in MachineLearning
von-hust OP t1_jb5l3r0 wrote
Reply to comment by [deleted] in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
Well, I just want to be clear that these are actually near duplicates (i.e., the images should differ only by compression, small artifacts, or even imperceptible differences). I'll try to be more explicit about what I mean by "duplicate" on GitHub.
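To make "near duplicate" a bit more concrete, here is a rough sketch of one common way to test for it: a tiny average hash compared by Hamming distance. This is only an illustration and not necessarily how the LAION-2B-en duplicates were found. The function names, the toy 4x4 grayscale grids, and the `max_bits` threshold are all made up for the example.

```python
# Illustrative only: average-hash comparison on small grayscale pixel grids.
# All names and the max_bits threshold are hypothetical, not from the LAION analysis.

def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Count the positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(img_a, img_b, max_bits=2):
    """Treat two images as near duplicates if their hashes differ in few bits."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_bits


# Toy 4x4 grayscale "image" (values 0-255).
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 215],
    [18, 198, 28, 225],
]

# Small per-pixel noise, standing in for compression artifacts.
noise = [
    [2, -3, 1, -1],
    [-2, 3, 0, 1],
    [1, -1, 2, -2],
    [0, 2, -3, 3],
]
recompressed = [
    [p + n for p, n in zip(row, noise_row)]
    for row, noise_row in zip(original, noise)
]

# A genuinely different image: the brightness-inverted original.
different = [[255 - p for p in row] for row in original]

print(is_near_duplicate(original, recompressed))  # True
print(is_near_duplicate(original, different))     # False
```

The point is just that the recompressed copy hashes the same as the original even though the raw pixel values differ, while a truly different image does not. At real scale you would more likely compare learned embeddings or a stronger perceptual hash, but the idea of "same up to small perturbations" is the same.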