Submitted by TheCockatoo t3_10m1sdm in MachineLearning
dojoteef t1_j60evd7 wrote
I'd guess that it's an easier optimization problem. GANs are known to have stability issues during training, likely due to the adversarial formulation.
I think a more interesting question is why it also performs better than VAEs, since diffusion models also fall under the category of variational inference. Again I'd assume it's an easier optimization problem due to having a large number of denoising steps. Perhaps a technique like DRAW could match diffusion models if used with more steps? Not sure.
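To make the "easier optimization problem" point concrete, here's a minimal sketch of a DDPM-style training step: sample a timestep, noise the data, and regress the added noise. It's a plain MSE objective with no adversarial game, which is the usual intuition for why it trains more stably than a GAN. Names here (`model`, `alphas_cumprod`) are illustrative, not from any particular codebase:

```python
import torch

# Hypothetical sketch of one denoising-diffusion training step.
# `model` predicts the noise added to x0; `alphas_cumprod` is the
# cumulative noise schedule (one value per timestep).
def diffusion_loss(model, x0, alphas_cumprod):
    t = torch.randint(0, len(alphas_cumprod), (x0.shape[0],))  # random timestep per sample
    a = alphas_cumprod[t].view(-1, 1)                          # broadcast schedule over features
    eps = torch.randn_like(x0)                                 # noise to add
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps                  # noised input
    return ((model(xt, t) - eps) ** 2).mean()                  # simple MSE: predict the noise
```

Each of the many denoising steps is a small, well-conditioned regression problem, which is one way to read the "large number of denoising steps" argument above.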
HateRedditCantQuitit t1_j60qzvg wrote
I always see diffusion/score models contrasted against VAEs, but is there really a good distinction? Especially given latent diffusion and IAFs and all the other blurry lines. I feel like any time you're doing forward training & backwards inference trained with an ELBO objective, it should count as a VAE.
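For reference, the ELBO objective being invoked here, in its standard VAE form (negated, as a loss), assuming a Gaussian encoder; `recon`, `mu`, and `logvar` are outputs of a hypothetical encoder/decoder pair:

```python
import torch
import torch.nn.functional as F

# Negative ELBO for a standard Gaussian-encoder VAE:
# reconstruction term plus KL(q(z|x) || N(0, I)).
def neg_elbo(recon, x, mu, logvar):
    recon_term = F.mse_loss(recon, x, reduction="sum")             # -log p(x|z), up to a constant
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # closed-form Gaussian KL
    return recon_term + kl
```

The diffusion ELBO has the same reconstruction-plus-KL shape, just decomposed across many noising steps, which is what makes the "it should count as a VAE" framing plausible.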
Zealousideal_Low1287 t1_j6191sq wrote
I guess for it to really count as a variational autoencoder you need to be reconstructing the input
HateRedditCantQuitit t1_j621uj8 wrote
Isn't reconstructing the input exactly what the denoising objective does?
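One way to see this point: noise prediction is input reconstruction up to a reparameterization, since the clean input can be recovered in closed form from the noised sample and the predicted noise. A sketch, with illustrative names:

```python
import torch

# Invert the forward noising x_t = sqrt(a) * x0 + sqrt(1 - a) * eps
# to recover x0 from a predicted noise eps_hat.
def reconstruct_x0(xt, eps_hat, a):
    return (xt - (1 - a).sqrt() * eps_hat) / a.sqrt()
```

So a model trained to predict the noise is implicitly reconstructing the input at every step.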