Submitted by These-Assignment-936 t3_10xjwac in MachineLearning
the_new_scientist t1_j7vu5fk wrote
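Yes, the DINO paper ("Emerging Properties in Self-Supervised Vision Transformers") showed that segmentation ability emerges in self-supervised vision transformers: their self-attention maps carry explicit information about the semantic segmentation of an image, without any segmentation labels used in training.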
https://arxiv.org/abs/2104.14294
Edit: oops, didn't realize you said image generation models, thought you asked for just vision models.
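For anyone wondering what "segmentation emerges" looks like in practice, here is a rough sketch of the usual visualization: take the [CLS] token's self-attention over the image patches and keep only the most-attended patches until they cover a fixed share of the attention mass (the paper's figures use about 60%, if memory serves). Everything below is an assumption for illustration, not the DINO code: the function name `attention_to_mask`, the `keep_mass` parameter, and the toy numbers are made up, and it expects you to already have one head's attention weights as a flat list.

```python
def attention_to_mask(cls_attention, grid_h, grid_w, keep_mass=0.6):
    """Turn one head's [CLS]-to-patch attention into a binary patch mask.

    Hypothetical helper: keeps the most-attended patches until they
    account for `keep_mass` of the total attention.
    """
    if len(cls_attention) != grid_h * grid_w:
        raise ValueError("expected one attention weight per patch")
    total = sum(cls_attention)
    # Visit patches from most to least attended.
    order = sorted(range(len(cls_attention)),
                   key=lambda i: cls_attention[i], reverse=True)
    keep = [False] * len(cls_attention)
    running = 0.0
    for i in order:
        if running >= keep_mass * total:
            break
        keep[i] = True
        running += cls_attention[i]
    # Reshape the flat patch list back into the image's patch grid.
    return [keep[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


# Toy 2x3 patch grid; the weights are invented for illustration.
toy_attention = [0.05, 0.40, 0.30, 0.05, 0.15, 0.05]
mask = attention_to_mask(toy_attention, grid_h=2, grid_w=3)
for row in mask:
    print("".join("#" if kept else "." for kept in row))
```

With real attention from a self-supervised ViT, the kept patches tend to line up with the main object in the image, which is the "emergent segmentation" the comment refers to.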
irulenot t1_j7xxq3z wrote
Yes, this!! Sorry, I didn't see it.