Submitted by AutoModerator t3_xznpoh in MachineLearning
Sbadabam278 t1_is75ja9 wrote
How can I learn the theory behind diffusion models (and stable diffusion) properly?
I have read the papers, but to me they gloss over a huge amount of information and are hard to make sense of at the moment.
Let’s take the original diffusion paper, “Deep Unsupervised Learning using Nonequilibrium Thermodynamics”.
They start with a data point x0 and then apply a “Markov diffusion kernel” (aka adding a zero-mean Gaussian random variable) T times until we converge to a fixed distribution (also normal). Then they want to learn a “reverse distribution” p that inverts the process, by learning the mean and variance of the reverse process distribution at each step.
So first of all, we already know mean and variance of each step. Why are you trying to estimate them? Are we trying to find “fake” mean and variance which push the stable state towards the “manifold” of realistic looking data points? If so, some other things in the paper don’t make sense to me (things like “the forward and reversal process are identical if the variance is small” - wtf are you talking about)
Another point is: what is the significance of this process in the first place? The forward process is mathematically equivalent to just adding a single Gaussian random variable with higher variance. Why is having many steps important, and why can’t we learn to denoise directly from the final state in a single step?
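To make the "single Gaussian" point concrete, here is a minimal sketch. It assumes a variance-preserving Gaussian kernel of the form x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps; this parameterization, the beta schedule, and the function names are illustrative assumptions, not taken from the thread. Under that assumption, T small steps collapse into one rescaling of x0 plus one Gaussian draw, and the script compares the two empirically.

```python
import math
import random
import statistics

def forward_iterative(x0, betas, rng):
    """Apply the assumed kernel once per step, T = len(betas) times."""
    x = x0
    for beta in betas:
        x = math.sqrt(1.0 - beta) * x + math.sqrt(beta) * rng.gauss(0.0, 1.0)
    return x

def forward_closed_form(x0, alpha_bar, rng):
    """Jump from x0 to x_T in one step: one rescale plus one Gaussian draw."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)

# Hypothetical linear schedule with T = 100 steps (values chosen for illustration).
T = 100
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alpha_bar = math.prod(1.0 - b for b in betas)  # product of (1 - beta_t) over all steps

rng = random.Random(0)
X0 = 2.0          # a single scalar "data point"
N = 5000          # number of independent noise draws per method

x_iter = [forward_iterative(X0, betas, rng) for _ in range(N)]
x_closed = [forward_closed_form(X0, alpha_bar, rng) for _ in range(N)]

print("iterative:   mean=%.3f std=%.3f" % (statistics.mean(x_iter), statistics.stdev(x_iter)))
print("closed form: mean=%.3f std=%.3f" % (statistics.mean(x_closed), statistics.stdev(x_closed)))
```

The two sample means and standard deviations should agree up to sampling noise. Note that this only covers the forward direction: being able to jump to x_T in one draw does not by itself mean the reverse map can be learned as a single Gaussian step.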
There are many more questions I have about the paper, so my main question is: how do people make sense of it? I’m having a hard time even finding out which topics I should research.
I’m not an expert in probability / Markov chains / math in general, but I think I can say I’m not a complete newbie either. What background is one expected to have to read and understand these articles, and do you have any pointers on how to get it?
Thanks!
C0hentheBarbarian t1_is96iqz wrote
Highly recommend this post by Jay Alammar. He has one of the best tutorials on how transformers work too (IMO) and this one is up there. I have worked with CV very sporadically recently but his post along with some of the links he has on there explained things to me pretty well. The only math background I can recommend off the top of my head is the probability calculation for lower/upper bounds - you can look up how VAEs work there or the post I linked has resources to understand the same.
Sbadabam278 t1_iseer5o wrote
Thank you for the resources, it is a nice explanation! However, I was looking for more of a technical understanding - which topics I should read in order to follow and understand the original paper?
C0hentheBarbarian t1_iss3b1t wrote
I suggest you look at some of the links in the article. Some discuss the math behind diffusion models in detail, which should let you understand the paper.