Submitted by TheCockatoo t3_10m1sdm in MachineLearning
dojoteef t1_j60evd7 wrote
I'd guess that it's an easier optimization problem. GANs are known to have stability issues during training, likely due to the adversarial formulation.
I think a more interesting question is why it also performs better than VAEs, since diffusion models also fall under the category of variational inference. Again I'd assume it's an easier optimization problem due to having a large number of denoising steps. Perhaps a technique like DRAW could match diffusion models if used with more steps? Not sure.
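To make the "easier optimization problem" point concrete, here's a minimal sketch of a DDPM-style training step: sample a timestep, noise the data, and regress the added noise. It's a plain MSE objective with no adversarial game, which is the usual intuition for why it trains more stably than a GAN. Names here (`model`, `alphas_cumprod`) are illustrative, not from any particular codebase:

```python
import torch

# Hypothetical sketch of one denoising-diffusion training step.
# `model` predicts the noise added to x0; `alphas_cumprod` is the
# cumulative noise schedule (one value per timestep).
def diffusion_loss(model, x0, alphas_cumprod):
    t = torch.randint(0, len(alphas_cumprod), (x0.shape[0],))  # random timestep per sample
    a = alphas_cumprod[t].view(-1, 1)                          # broadcast schedule over features
    eps = torch.randn_like(x0)                                 # noise to add
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps                  # noised input
    return ((model(xt, t) - eps) ** 2).mean()                  # simple MSE: predict the noise
```

Each of the many denoising steps is a small, well-conditioned regression problem, which is one way to read the "large number of denoising steps" argument above.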
HateRedditCantQuitit t1_j60qzvg wrote
I always see diffusion/score models contrasted against VAEs, but is there really a good distinction? Especially given latent diffusion and IAFs and all the other blurry lines. I feel like any time you're doing forward training & backwards inference trained with an ELBO objective, it should count as a VAE.
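For reference, the ELBO objective being invoked here, in its standard VAE form (negated, as a loss), assuming a Gaussian encoder; `recon`, `mu`, and `logvar` are outputs of a hypothetical encoder/decoder pair:

```python
import torch
import torch.nn.functional as F

# Negative ELBO for a standard Gaussian-encoder VAE:
# reconstruction term plus KL(q(z|x) || N(0, I)).
def neg_elbo(recon, x, mu, logvar):
    recon_term = F.mse_loss(recon, x, reduction="sum")             # -log p(x|z), up to a constant
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # closed-form Gaussian KL
    return recon_term + kl
```

The diffusion ELBO has the same reconstruction-plus-KL shape, just decomposed across many noising steps, which is what makes the "it should count as a VAE" framing plausible.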
Zealousideal_Low1287 t1_j6191sq wrote
I guess for it to really count as a variational autoencoder you need to be reconstructing the input
HateRedditCantQuitit t1_j621uj8 wrote
Isn't reconstructing the input exactly what the denoising objective does?
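One way to see this point: noise prediction is input reconstruction up to a reparameterization, since the clean input can be recovered in closed form from the noised sample and the predicted noise. A sketch, with illustrative names:

```python
import torch

# Invert the forward noising x_t = sqrt(a) * x0 + sqrt(1 - a) * eps
# to recover x0 from a predicted noise eps_hat.
def reconstruct_x0(xt, eps_hat, a):
    return (xt - (1 - a).sqrt() * eps_hat) / a.sqrt()
```

So a model trained to predict the noise is implicitly reconstructing the input at every step.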