master3243 t1_irhlkct wrote
Pretty cool, I just tested with the prompt
"a DSLR photo of a teddy bear riding a skateboard"
Here's the result:
https://media.giphy.com/media/eTQ5gDgbkD0UymIQD6/giphy.gif
Having read the paper and understood the basics of how it works, I would have guessed that it would tend to create a Neural Radiance Field (NeRF) where the front of the object is duplicated across many different camera angles: when the NeRF is updated from a new angle, the diffusion model could just as easily output an image that closely matches a view already created from an earlier angle.
I think Imagen can prevent this simply because of its sheer power: even when given a noisy image of the back of a teddy bear, it can figure out that it really is the back and not just the front again. Not sure if that made sense, I did a terrible job articulating the point.
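
To make the guess above a bit more concrete, here is a toy sketch of the failure mode I had in mind. It is not how DreamFusion or Imagen actually work: the "scene" is just one number per camera angle, and both "priors" (`view_agnostic_target`, `view_aware_target`) are hypothetical stand-ins I made up for a weak and a strong image model.

```python
import random

NUM_VIEWS = 8  # hypothetical number of camera-angle bins around the object


def view_agnostic_target(angle_idx):
    # Stand-in for a weak prior: every angle "looks best" as the front (1.0).
    return 1.0


def view_aware_target(angle_idx):
    # Stand-in for a stronger prior: the first half of the angles should look
    # like the front (1.0), the second half like the back (-1.0).
    return 1.0 if angle_idx < NUM_VIEWS // 2 else -1.0


def optimize(target_fn, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    scene = [0.0] * NUM_VIEWS  # toy "radiance field": one scalar per angle
    for _ in range(steps):
        angle_idx = rng.randrange(NUM_VIEWS)  # sample a random camera angle
        rendered = scene[angle_idx]           # "render" the scene from it
        # Nudge the rendered view toward what the prior prefers for it.
        scene[angle_idx] += lr * (target_fn(angle_idx) - rendered)
    return [round(v, 2) for v in scene]


if __name__ == "__main__":
    print("weak prior:  ", optimize(view_agnostic_target))
    print("strong prior:", optimize(view_aware_target))
```

With the weak prior every angle ends up looking like the front, which is the duplication I expected. With the prior that can tell front from back, the two halves come out different, which is roughly what I think a strong enough model buys you.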