ml_head t1_jcjyon2 wrote on March 17, 2023 at 11:43 AM

Reply to comment by Empty-Revolution7570 in [P] Multimedia GPT: Can ChatGPT/GPT-4 be used for vision / audio tasks just by prompt engineering? by Empty-Revolution7570

I'm sure that it does, and it would be a better demo of the technology. Maybe keep the Cinderella story too, since some people won't read your original story and won't be able to tell whether the summary is good. You might want to add an image of your original story in a format that isn't easy to OCR, such as a weird font on a noisy background. That way you make the story available to humans while taking measures to hide it from any web crawler used by language models.

ml_head t1_jcf3e64 wrote on March 16, 2023 at 11:41 AM

Reply to [P] Multimedia GPT: Can ChatGPT/GPT-4 be used for vision / audio tasks just by prompt engineering? by Empty-Revolution7570

So, the model recognized the Cinderella story in the audio. But how do we know that the summary was generated from the audio, and not from prior knowledge of the story? I know that those models can do this task. However, for the demo I would use an original story instead.