Simusid OP t1_jegspl8 wrote

ViTMAE isn't a generative model. The intent is to use unlabeled data to pretrain the encoder; after that, the decoder is thrown away. Then (in theory) I would take the pretrained encoder, attach a new head, and do traditional supervised classification with a relatively small amount of labeled data.
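
To make the "encoder plus new head" step concrete, here is a minimal toy sketch in plain Python. It is not ViTMAE code: `frozen_encoder` is a hypothetical stand-in for the pretrained encoder (a fixed function that is never updated), the head is a simple logistic regression, and the labeled data is synthetic. All names and numbers are assumptions for illustration only.

```python
import math
import random


def frozen_encoder(x):
    """Hypothetical stand-in for a pretrained encoder.

    It maps an input to a feature vector and is never updated below.
    In practice this would be the encoder kept after pretraining.
    """
    return [x[0] + x[1], x[0] - x[1]]


def train_head(features, labels, epochs=200, lr=0.5):
    """Train a logistic-regression head on top of fixed features (plain SGD)."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Classify a raw input: frozen encoder first, then the trained head."""
    f = frozen_encoder(x)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0


# A small synthetic labeled set: label is 1 when x0 + x1 > 0.
rng = random.Random(0)
labeled = []
for _ in range(40):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    labeled.append((x, 1 if x[0] + x[1] > 0 else 0))

# Encode once with the frozen encoder, then train only the head.
features = [frozen_encoder(x) for x, _ in labeled]
labels = [y for _, y in labeled]
w, b = train_head(features, labels)
```

Only the head's parameters (`w`, `b`) change during training here, which is the linear-probe version of the idea. Unfreezing the encoder and fine-tuning it together with the head is the other common option when there is a bit more labeled data.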
