Submitted by AutoModerator t3_110j0cp in MachineLearning
theidiotrocketeer t1_j9e9iw9 wrote
Is it psychotic to use a GPT-based model for what could be treated as image segmentation?
For my task, I trained a GPT model to predict a mask for an input integer matrix in which certain rows consist entirely of a spurious value. The mask replaces the spurious integers with X's. So it is a text-based model for what could be considered an image task.
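To make the setup concrete, here is a minimal sketch of how an input/target text pair for such a task might be built. The function names, the whitespace-separated serialization, and the spurious value 9 are assumptions for illustration; the post does not specify the actual format used.

```python
from typing import List, Union

Cell = Union[int, str]


def mask_spurious_rows(matrix: List[List[int]], spurious: int) -> List[List[Cell]]:
    """Replace every row made up entirely of `spurious` with a row of 'X'."""
    masked: List[List[Cell]] = []
    for row in matrix:
        if row and all(value == spurious for value in row):
            masked.append(["X"] * len(row))
        else:
            # Rows that only partly contain the spurious value are left alone.
            masked.append(list(row))
    return masked


def to_text(matrix: List[List[Cell]]) -> str:
    """Serialize a matrix as one line per row, cells separated by spaces."""
    return "\n".join(" ".join(str(value) for value in row) for row in matrix)


# Hypothetical example: 9 is the assumed spurious value.
matrix = [[1, 2, 3], [9, 9, 9], [4, 9, 6]]
target = mask_spurious_rows(matrix, spurious=9)

prompt_text = to_text(matrix)   # what the language model would read
target_text = to_text(target)   # what it would be trained to produce
```

Seen this way, the target is a per-cell label map over a grid, which is why the task resembles segmentation even though the model only sees text.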
activatedgeek t1_j9lobdv wrote
It is no longer uncommon to model an image as a sequence of patch tokens and feed that sequence to a transformer-based model. So not psychotic at all.
See "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".
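A rough sketch of the patches-as-tokens idea, in plain Python: a 2D grid is cut into non-overlapping square patches, and each patch is flattened into one token vector. The `patchify` helper and the tiny 4x4 example are illustrative assumptions, not code from the paper.

```python
from typing import List


def patchify(image: List[List[int]], patch: int) -> List[List[int]]:
    """Split a 2D grid into non-overlapping patch x patch blocks.

    Returns the blocks in row-major order, each flattened to a 1D list,
    so the result can be treated as a token sequence.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")

    tokens: List[List[int]] = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append(
                [
                    image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)
                ]
            )
    return tokens


# Hypothetical 4x4 "image" split into 2x2 patches gives a sequence of 4 tokens.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)
```

In the paper, each flattened patch is then linearly projected to an embedding and combined with a position embedding before entering the transformer; that step is omitted here.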