ChocolateFit9026

ChocolateFit9026 t1_iqrrbt4 wrote

There’s a big misunderstanding that just because text-to-image is huge right now, everything will be done with text prompts. The only reason it works this way is that good image-text pairs exist everywhere on the internet. The same is not true of audio and lots of other things.
