Submitted by WallabyDue2778 t3_y92tln in MachineLearning
I’m reading up on diffusion models, and these seem to be the two dominant approaches. They are also equivalent under certain parameterizations. However, more recent work, for example Stable Diffusion, seems to use the DDPM-type formulation more often, by which I mean the model learns the noise rather than the score.
Is this observation true? And if it is, what are some reasons? I’ve never implemented a model like this myself, so I don’t know how difficult or practical they are. Perhaps all the issues listed in the score matching papers (manifold hypothesis, low data density regions, inaccurate score estimation) make it really difficult to work with, or is there something more fundamental?
Thanks in advance!
dasayan05 t1_it46pby wrote
>... these seem to be two dominant approaches ...
Totally. There are two streams of ideas, similar but not exactly equivalent, namely Score-Based Models (SBM) and Denoising Diffusion Probabilistic Models (DDPM). There is an effort to unify these two under the umbrella of Stochastic Differential Equations (SDE), where SBM -> "Variance Exploding SDE" and DDPM -> "Variance Preserving SDE". By far, DDPM is more famous -- the reason is that DDPM has stronger theoretical guarantees and fewer hyperparameters. SBMs are, in some parts, intuitive and observation-based.
>.. they learn the noise rather than the score ..
Yes. SBM uses the "score" while DDPM uses "noise-estimates", but they are related -- "score = - eps / noise-std" (see CVPR22's Diffusion slides, slide 57). IMO, the major difference between SBM and DDPM is their forward noising process -- SBM only adds noise, while DDPM adds noise and also attenuates the signal, and this process is systematically "tied" to the noise schedule \beta_t. This makes the reverse process look slightly different.
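To make the two points above concrete, here is a minimal NumPy sketch of both forward processes and the score/noise relation. The function names and the value of `alpha_bar` are made up for illustration, and the check is only for the conditional score of x_t given a known x_0 (a Gaussian), not the marginal score that a trained model approximates.

```python
import numpy as np

def ve_forward(x0, sigma, eps):
    # SBM / "variance exploding": only add noise, the signal is left untouched.
    return x0 + sigma * eps

def vp_forward(x0, alpha_bar, eps):
    # DDPM / "variance preserving": attenuate the signal and add noise.
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def score_from_eps(eps, noise_std):
    # The relation quoted above: score = - eps / noise-std.
    return -eps / noise_std

rng = np.random.default_rng(0)
x0 = rng.standard_normal(5)
eps = rng.standard_normal(5)

# VP case: x_t | x0 is Gaussian with mean sqrt(alpha_bar) * x0 and std sqrt(1 - alpha_bar).
alpha_bar = 0.5  # illustrative value, not taken from any paper
vp_std = np.sqrt(1.0 - alpha_bar)
x_t_vp = vp_forward(x0, alpha_bar, eps)
vp_exact_score = -(x_t_vp - np.sqrt(alpha_bar) * x0) / vp_std**2

# VE case: x_t | x0 is Gaussian with mean x0 and std sigma.
sigma = 2.0  # illustrative value
x_t_ve = ve_forward(x0, sigma, eps)
ve_exact_score = -(x_t_ve - x0) / sigma**2
```

In both cases the exact Gaussian score agrees with `score_from_eps(eps, std)`, so a network that predicts eps can be turned into a score estimate by a simple rescaling.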
If you want to implement Diffusion Models, start with DDPM as formulated by Ho et al. I have never seen an algorithm written so clearly as the ones in Ho et al.'s Algorithms 1 & 2. It can't get any simpler in terms of implementation.
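For a sense of how short those two algorithms are, here is a rough NumPy sketch of both. It is not a full implementation: `eps_model` is a hypothetical stand-in for the trained network (here, the exact noise predictor for toy data that is always x0 = 0), the gradient step of Algorithm 1 is left out, and sigma_t^2 = beta_t is just one of the variance choices discussed in the paper.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear schedule used in Ho et al.
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t):
    # Hypothetical stand-in for a trained network eps_theta(x_t, t).
    # It is the exact noise predictor for toy data that is always x0 = 0.
    return x_t / np.sqrt(1.0 - alpha_bars[t])

def training_loss(x0, rng):
    # Algorithm 1 without the gradient step: noise x0 to a random t, predict the noise.
    t = rng.integers(0, T)           # 0-based index standing for t in {1, ..., T}
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return np.mean((eps - eps_model(x_t, t)) ** 2)

def sample(shape, rng):
    # Algorithm 2: start from pure noise and denoise step by step.
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        # No noise is added on the final step.
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_model(x, t)) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z   # assumes sigma_t^2 = beta_t
    return x

rng = np.random.default_rng(0)
loss = training_loss(np.zeros(4), rng)
x0_hat = sample((4,), rng)
```

With the toy oracle the loss is zero up to floating point and the sampler collapses back to x0 = 0. In a real setup `eps_model` would be a neural network (typically a U-Net) trained by gradient descent on this loss.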