
DeezNUTSampler t1_itq1l2d wrote

Can you link to works in computer vision SSL that incorporate the principle "use the model's high-confidence outputs on easy examples to train it on hard examples"? It's not obvious to me how this would work. In contrastive learning, for example, the objective is to learn view-invariant representations: two differently augmented views of an object are pushed together in representation space by minimizing the distance between them. Which one would constitute the easy example and which the hard one here?
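For context, the setup described above can be sketched like this. Everything here is a toy stand-in (a tanh "encoder", Gaussian-noise "augmentations"), not any particular paper's method; the point is just that the loss compares two views of the same input:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # Toy encoder: one nonlinear layer, then L2-normalize the embedding.
    z = np.tanh(x @ W)
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

W = rng.normal(size=(8, 4))
x = rng.normal(size=(32, 8))                  # a batch of "images"

# Two augmented views of the same underlying inputs.
view_a = x + 0.1 * rng.normal(size=x.shape)
view_b = x + 0.1 * rng.normal(size=x.shape)

za = encode(view_a, W)
zb = encode(view_b, W)

# Pull the two views together: mean squared distance between embeddings.
# (Real contrastive losses like NT-Xent also push apart negatives;
# that part is omitted here for brevity.)
loss = ((za - zb) ** 2).sum(axis=1).mean()
```

Since both views pass through the same encoder symmetrically, neither is obviously "easier" than the other, which is the crux of the question.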

5

say_wot_again t1_itrmhsx wrote

Here's an example of what I had in mind. Pseudolabels for the unlabeled data are generated on the clean images, but the student model is trained on a strongly augmented version of each image. It's not contrastive learning, since the objective is still explicitly object detection, but easy vs. hard maps to the original image vs. the strongly augmented one.
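A minimal sketch of that pseudolabeling loop, using a toy linear classifier instead of a detector for simplicity (the model, augmentations, and confidence threshold are all illustrative assumptions, not the linked work's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical linear classifier standing in for the detector.
W = rng.normal(size=(8, 3))

def predict(x):
    return softmax(x @ W)

def weak_aug(x):
    # "Easy" view: light noise standing in for flips/crops.
    return x + 0.01 * rng.normal(size=x.shape)

def strong_aug(x):
    # "Hard" view: heavy noise standing in for color jitter/cutout.
    return x + 0.5 * rng.normal(size=x.shape)

x_unlabeled = rng.normal(size=(16, 8))

# 1. Pseudolabels come from the clean / weakly augmented ("easy") view.
probs = predict(weak_aug(x_unlabeled))
conf = probs.max(axis=1)
pseudo = probs.argmax(axis=1)

# 2. Keep only high-confidence pseudolabels.
mask = conf > 0.6

# 3. The student is trained on the strongly augmented ("hard") view,
#    with cross-entropy against those pseudolabels.
strong_probs = predict(strong_aug(x_unlabeled[mask]))
loss = -np.log(strong_probs[np.arange(mask.sum()), pseudo[mask]] + 1e-9).mean()
```

The asymmetry is the whole trick: the model is confident on the easy view, and that confidence supervises learning on the hard view.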

3