
KlutzyLeadership3652 t1_irwt908 wrote

Don't know how feasible this would be for you, but you could train a surrogate model that learns image-to-text. Use your original text-to-image model to generate images from text (open captioning datasets can give you plenty of example captions), and train the surrogate to generate the text/caption back from those images. This is model-centric, so you don't need to worry about the many-to-many issue mentioned above.

This can also be made more robust than a backpropagation-based inversion approach.
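Roughly, the loop looks like this. A minimal PyTorch sketch, with toy stand-ins for both models (the `ToyTextToImage` generator and `SurrogateCaptioner` are hypothetical placeholders, not real architectures; in practice you'd use your actual frozen text-to-image model and a proper captioning model):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, MAXLEN, IMG_DIM = 100, 8, 64

# Stand-in for the frozen text-to-image model (hypothetical toy version).
class ToyTextToImage(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, IMG_DIM)

    def forward(self, tokens):            # (B, L) token ids -> (B, IMG_DIM) "image"
        return self.embed(tokens).mean(dim=1)

# Surrogate image-to-text model: maps generated images back to caption tokens.
class SurrogateCaptioner(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(IMG_DIM, MAXLEN * VOCAB)

    def forward(self, img):               # (B, IMG_DIM) -> (B, L, VOCAB) logits
        return self.head(img).view(-1, MAXLEN, VOCAB)

t2i = ToyTextToImage().eval()             # the generator stays frozen
for p in t2i.parameters():
    p.requires_grad_(False)

surrogate = SurrogateCaptioner()
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Captions would come from an open captioning dataset; random tokens here.
captions = torch.randint(0, VOCAB, (32, MAXLEN))

for step in range(300):
    images = t2i(captions)                # synthesize training images from captions
    logits = surrogate(images)
    loss = loss_fn(logits.reshape(-1, VOCAB), captions.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# The surrogate should now recover (on this toy data, memorize) the captions.
pred = surrogate(t2i(captions)).argmax(dim=-1)
accuracy = (pred == captions).float().mean().item()
```

The key point is that the text-to-image model is only ever used in inference mode to manufacture training pairs, so you never backprop through it.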
