jiamengial t1_j6mdcrj wrote on January 31, 2023 at 10:23 AM

I don't think so. Diffusion models are based entirely on sampling methods; if anything, what's exciting is taking the "traditional" methods and, instead of replacing the whole thing with neural nets, replacing only one component of them.

arg_max t1_j6mg664 wrote on January 31, 2023 at 11:02 AM

I think diffusion models are kind of a bad example. The SDE paper from Yang Song has shown that it's all about modeling the score function, and this can't be done with simple models. Apart from that, the big text2img models work inside the latent space of a deep VAE, use cross-attention for conditioning (which isn't a thing in traditional ML), and use large language models to process the text input. All their components are very DL-based.

kpalan t1_j6n8otw wrote on January 31, 2023 at 3:16 PM

Diffusion models mainly predict the added noise with deep convolutional neural networks.
Furthermore, Stable Diffusion specifically uses even more deep CNNs to downscale and upscale the image.
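
To make "predicting the added noise" concrete, here is a minimal NumPy sketch of the noise-prediction objective in the common DDPM-style formulation. The linear beta schedule, the function names, and the `zero_predictor` stand-in are assumptions for illustration only; in a real model the predictor would be the deep CNN (U-Net) discussed above, and this is not Stable Diffusion's actual code.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Assumed linear beta schedule; alpha_bar[t] is the cumulative product of (1 - beta).
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    # Forward process: mix the clean sample with Gaussian noise at step t.
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def noise_prediction_loss(predict_noise, x0, t, alpha_bar, rng):
    # Training objective: mean squared error between predicted and true noise.
    x_t, eps = add_noise(x0, t, alpha_bar, rng)
    eps_hat = predict_noise(x_t, t)
    return float(np.mean((eps_hat - eps) ** 2))

def zero_predictor(x_t, t):
    # Hypothetical placeholder; a real model would be a deep CNN here.
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.standard_normal((4, 3, 8, 8))  # fake batch shaped (batch, channels, height, width)
loss = noise_prediction_loss(zero_predictor, x0, t=500, alpha_bar=alpha_bar, rng=rng)
print(f"loss with a placeholder predictor: {loss:.3f}")
```

Everything except `predict_noise` is fixed arithmetic; the learning happens entirely inside the network that fills that slot.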