Submitted by MohamedRashad t3_y14lvd in MachineLearning
I am looking for research papers in this area, but I have been unable to find anything.
The idea is that I give the model an image and it outputs, with high confidence, the text that would create it. I think prompt engineering may be the closest thing to what I want, but when I searched the latest papers on it I found nothing useful.
Which keywords should I use? Are there any good papers or tools I should know about?
Any help would be appreciated. Thanks in advance.
ReasonablyBadass t1_irvdh0j wrote
That would be caption generation, I believe, and it has been around for a while.