milleniumsentry t1_is13giv wrote on October 12, 2022 at 3:15 PM

Reply to comment by visarga in [D] Reversing Image-to-text models to get the prompt by MohamedRashad

I honestly think this will be a step in the right direction. Not actually for prompt sharing, but for refinement. These networks will start off great at telling you.. that's a hippo.... that's a potato.. but what happens when someone wants to create a hippotato...

I think without some sort of tagging/self reference, the data runs to risk of self reinforcement... as the main function of the task is to bash a few things together into something else. At what point will it need extra information so that it knows, yes.. this is what they wanted... this is a good representation of the task...

A tag back loop would be phenomenal. Imagine if you ask for a robotic cow with an astronaut friend. Some of those image, will be lacking robot features, some won't look like cows... etc. Ideally, your finished piece would be tagged as well... but perhaps missing the astronaut... or another part of the initial prompt request. By removing tags that were not generated by the prompt, the two can be compared for a soft 'success' rate.