Submitted by aozorahime t3_y2nyn5 in MachineLearning

Hi, I am a master's student working on GANs for speech enhancement. I must say I learned a lot from this topic, and I had to restudy probability to get an understanding of generative models and such. I am just curious whether a generative model such as the GAN is still a good topic for a Ph.D., since I have recently been getting exposed to newer models such as diffusion models. BTW, I am also interested in the information bottleneck in deep learning. Any suggestion would be helpful :) thanks

67

Comments

M4xM9450 t1_is480yg wrote

Diffusion seems to be taking somewhat of a lead over GANs due to being more stable to train. It has the same generative applications as GANs, with the downside of being “slower” (I need to research that claim a bit more for the details).

53

Atom_101 t1_is4b95t wrote

Diffusion is inherently slower than GANs. It takes N forward passes vs only 1 for GANs. You can use tricks to make it faster, like latent diffusion which does N forward passes with a small part of the model and 1 forward pass with the rest. But as a method diffusion is slower.
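
A rough sketch of that difference in sampling cost (illustrative placeholders only; the generator, denoiser, and update rule below are stand-ins, not any real implementation):

```python
import torch

# Illustrative stand-ins: any trained GAN generator and diffusion denoiser would do.
generator = torch.nn.Linear(128, 784)   # hypothetical one-shot GAN generator
denoiser = torch.nn.Linear(784, 784)    # hypothetical diffusion denoising network

# GAN sampling: a single forward pass.
z = torch.randn(1, 128)
gan_sample = generator(z)

# Diffusion sampling: N forward passes through the denoiser.
N = 1000
x = torch.randn(1, 784)
for t in reversed(range(N)):
    predicted_noise = denoiser(x)    # real samplers also condition on the timestep t
    x = x - 0.01 * predicted_noise   # stand-in for a DDPM/DDIM update rule
diffusion_sample = x
```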

34

SleekEagle t1_is69o54 wrote

There are models that use continuous DEs rather than discrete iterations, both within diffusion-adjacent methods like SMLD/NCSN and in distinct methods!
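
For anyone curious what the continuous-DE view looks like in practice, here's a minimal Euler-Maruyama sketch of a reverse-time VE-SDE sampler (the score network and noise schedule are placeholders, not taken from any particular codebase):

```python
import math
import torch

score_net = torch.nn.Linear(784, 784)   # placeholder score model, approximates grad log p_t(x)
SIGMA_MIN, SIGMA_MAX = 0.01, 50.0

def sigma(t):
    # Variance-exploding noise scale, as in SMLD/NCSN-style models.
    return SIGMA_MIN * (SIGMA_MAX / SIGMA_MIN) ** t

def reverse_sde_sample(steps=500):
    x = torch.randn(1, 784) * sigma(1.0)
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        g2 = 2 * math.log(SIGMA_MAX / SIGMA_MIN) * sigma(t) ** 2   # g(t)^2 for the VE-SDE
        score = score_net(x)                                       # real models also take t as input
        x = x + g2 * score * dt                                    # reverse-time drift term
        x = x + math.sqrt(g2 * dt) * torch.randn_like(x)           # diffusion term
    return x
```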

5

Atom_101 t1_is6igij wrote

Just when I thought I understood the math behind DMs, they went ahead and added freaking DEs to it? Guess I should have paid more attention to math in college.

6

SleekEagle t1_is6trdz wrote

I'm really excited to see what the next few years bring. I've always felt like there's a lot of room for growth from higher level math, and it seems like that's beginning to happen.

I'll have a blog coming out soon on another physics-inspired model like DDPM, stay tuned!

4

puppet_pals t1_is5aad2 wrote

people are running with distillation too

2

gwern t1_is5h3xj wrote

You can distill GANs too, though, so the performance gap remains, and applications may still be out of reach: distilling a GAN may get you realtime synthesis, but distilling a diffusion model still typically only makes it about as fast as a GAN (maybe).

11

pm_me_your_ensembles t1_is5ppy3 wrote

It's entirely possible to do a lot better than N forward passes.

1

Atom_101 t1_is5z87v wrote

How so?

1

pm_me_your_ensembles t1_is6ttve wrote

AFAIK self-conditioning helps with the process, and there has been a lot of work on reducing the number of steps through distillation and quantization.
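
A toy sketch of the step-distillation idea, in the spirit of progressive distillation (every model and the update rule below are placeholders):

```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(784, 784)   # placeholder for a trained diffusion denoiser
student = torch.nn.Linear(784, 784)   # student that will take half as many steps
student.load_state_dict(teacher.state_dict())
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def denoise_step(model, x):
    # Stand-in for one DDIM-style update; a real sampler also uses the timestep.
    return x - 0.01 * model(x)

for _ in range(100):                    # distillation iterations
    x = torch.randn(32, 784)            # noisy samples at some timestep
    with torch.no_grad():
        target = denoise_step(teacher, denoise_step(teacher, x))   # two teacher steps
    pred = denoise_step(student, x)                                # one student step
    loss = F.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```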

2

Quaxi_ t1_is4woj1 wrote

I wouldn't say it's primarily because diffusion is more stable, though. It just gives better results, and the properties of diffusion lead naturally into other applications like in/outpainting and multimodality.

GANs are quite stable these days. Tricks like feature matching loss, spectral normalization, gradient clipping, TTUR, etc. make mode collapse quite rare.

You're correct that it is quite a bit slower at the moment, though. The diffusion process needs many iterations per sample and thus takes longer both to train and to infer.
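
For reference, two of those tricks are essentially one-liners in PyTorch (the tiny discriminator/generator below are just illustrative stubs):

```python
import torch
import torch.nn as nn

# Spectral normalization: constrains the Lipschitz constant of the discriminator.
discriminator = nn.Sequential(
    nn.utils.spectral_norm(nn.Linear(784, 256)),
    nn.LeakyReLU(0.2),
    nn.utils.spectral_norm(nn.Linear(256, 1)),
)
generator = nn.Sequential(nn.Linear(128, 784), nn.Tanh())

# TTUR (two time-scale update rule): different learning rates for D and G.
opt_d = torch.optim.Adam(discriminator.parameters(), lr=4e-4, betas=(0.0, 0.9))
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.0, 0.9))

# Gradient clipping would go inside the training step, e.g.:
#   torch.nn.utils.clip_grad_norm_(discriminator.parameters(), max_norm=1.0)
```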

11

Atom_101 t1_is61h1l wrote

I doubt it's anywhere close to diffusion models though. I haven't worked with TTUR and feature matching, but I have tried spectral norm and WGAN-GP. They can be unstable in weird ways. In fact, while the Wasserstein loss is definitely more stable, it massively slows down convergence compared to the standard DCGAN loss.
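
For context, this is roughly what the WGAN-GP gradient penalty term looks like (a minimal sketch with a toy critic, not any specific repo's code):

```python
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # Penalize the critic's gradient norm on points interpolated between
    # real and fake samples, pushing it toward 1 (the 1-Lipschitz constraint).
    eps = torch.rand(real.size(0), 1)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )[0]
    return lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()

real, fake = torch.randn(32, 784), torch.randn(32, 784)
gp = gradient_penalty(critic, real, fake)   # added to the critic loss each step
```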

The BigGAN paper by Google tried to scale up GANs by throwing every known stabilization trick at them. They observed that even with these tricks you can't train beyond a certain point: BigGANs start degrading when trained for too long. Granted, it came out in 2018, but if this didn't hold true today we would have 100B-parameter GANs already. I think the main advantage of DMs is that you can keep training them for an eternity without worrying about performance degradation.

3

Quaxi_ t1_is7gnsf wrote

No, definitely: GANs can still fail, and they are much less stable than diffusion models. But GANs have still enjoyed huge popularity despite that, and research has found ways to mitigate it.

I just think it's not the main reason why diffusion models are gaining traction. If it were, we probably would have seen a lot more of variational autoencoders. My work is not at BigGAN or DALL-E 2 scale though, so I might indeed be missing some scaling aspect of this. :)

2

Atom_101 t1_is7ldte wrote

I think VAEs are weak not because of scaling issues but because of an overly strong bias: the latent manifold has to be a Gaussian distribution with a diagonal covariance matrix. This problem is reduced using things like vector quantization. DALL-E 1 actually used this, before DMs came to be. But even then, I believe they are too underpowered. Another technique for image generation is normalising flows, which also require heavy restrictions on the model architecture. GANs and DMs are much more unrestricted and can model arbitrary data distributions.
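
To make that "diagonal Gaussian" bias concrete, here is the usual VAE reparameterization and closed-form KL term (a minimal sketch with a placeholder encoder):

```python
import torch

encoder = torch.nn.Linear(784, 2 * 32)   # placeholder encoder producing mean and log-variance

x = torch.randn(16, 784)
mu, logvar = encoder(x).chunk(2, dim=1)

# Reparameterization: the approximate posterior is forced to be N(mu, diag(sigma^2)).
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

# KL to a standard Gaussian prior; it only has this closed form because both
# distributions are diagonal Gaussians, which is the restrictive assumption here.
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
```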

Can you point to an example where you see GANs perform visibly worse? Although we can't really compare quality between SOTA GANs and SOTA DMs; the difference in scale is just too huge. There was a tweet thread recently, regarding Google Imagen IIRC, which showed that increasing model size drastically improves image quality for text-to-image DMs: going from 1B to 10B params showed visible improvements. But if you compare photorealistic faces generated by Stable Diffusion and, say, StyleGAN3, I am not sure you would be able to see differences.

2

spring_m t1_is4biri wrote

Diffusion models are the new cool kid on the block, which means a lot of research interest but also a lot of low-hanging fruit. I don't think the ideas behind GANs will become completely obsolete, though. For example, the VAE in Stable Diffusion is trained using an adversarial loss; without it, the decoded images would be much blurrier.
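
Schematically, the idea is to add a discriminator term on top of the reconstruction loss. The sketch below is illustrative only (the real latent-diffusion autoencoder also uses a perceptual loss and a patch discriminator, and the weighting here is a made-up choice):

```python
import torch
import torch.nn.functional as F

# Placeholder autoencoder and discriminator.
autoencoder = torch.nn.Sequential(torch.nn.Linear(784, 64), torch.nn.Linear(64, 784))
discriminator = torch.nn.Linear(784, 1)

x = torch.randn(32, 784)
recon = autoencoder(x)

# A plain reconstruction loss on its own tends to produce blurry outputs.
rec_loss = F.l1_loss(recon, x)

# Adversarial term: push reconstructions toward what the discriminator calls "real".
adv_loss = -discriminator(recon).mean()

loss = rec_loss + 0.1 * adv_loss   # the 0.1 weighting is a hypothetical choice
```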

44

badabummbadabing t1_is4vg29 wrote

GANs may be losing some ground to diffusion models in generative tasks, but the idea of playing an adversarial game with a learnable loss function is more general than generating pretty pictures.

33

SleekEagle t1_is6a5ce wrote

Exactly! While they might be losing out in specific applications, the concept itself is still very valuable imo

5

BrotherAmazing t1_is8rh9t wrote

I agree with the assessment of adversarial games and their wide-ranging utility, but to be fair, diffusion models can be applied to many tasks in which one requires a generative model; i.e., they're useful in a much wider range of applications than just “generating pretty pictures”.

3

DigThatData t1_is4jiv3 wrote

totally still relevant. also, you never know when an older line of research will experience some innovation and see renewed interest. That's basically what happened with diffusion.

22

M4xM9450 t1_is5xmdb wrote

It’s also what kinda happened to neural networks too! Some breakthroughs just need time.

11

ThatInternetGuy t1_is46ghv wrote

Transformer-based models have been gaining traction for generative modeling since 2021, as you can practically scale them up to tens of billions of parameters, whereas GAN-based models have already saturated. Not that GANs are any less powerful: GANs are generally much more efficient in terms of performance and memory.

7

NotMyMain007 t1_is54frt wrote

In my opinion the main problem with GANs is how unstable they are; you will spend a lot of time just making sure they won't output garbage. That alone is already enough for me to enjoy diffusion more.

6

ThrowThisShitAway10 t1_is5jqcy wrote

People have already picked all the low-hanging fruit for GANs. Right now people are doing the same thing with diffusion models. So as long as you are okay with that, yeah, they are still a relevant research topic. You might just have a harder time succeeding.

3

MOSFETBJT t1_is6brc5 wrote

In data science, subjects like this go through ebbs and flows of usage and popularity. In the early 2000s, SVMs were all the rage.

3

BrisklyBrusque t1_is7g6gk wrote

SVMs prevailed against neural networks in a big image classification contest in 2006. Then they fell out of favor, along with other learning algorithms like:

•Adaboost

•C4.5

•Decision stumps

•Multivariate adaptive regression splines

•Flexible discriminant analysis

•Arcing

•Wagging

Not sure which of these will come back, but it’s funny how often ideas are rediscovered (like neural networks themselves, which were branded as multilayer perceptrons initially)

2

BrisklyBrusque t1_is7fdk7 wrote

Anything is relevant as a research topic if you’re passionate about it. Just be prepared to stand up for yourself when you go against the grain. The harshest critics will label your research useless if they are convinced that better methods are available.

1

Dylan_TMB t1_is86tpg wrote

What are you generating? That's the important question to ask. Just because some models have become the best for one task doesn't mean they are the best for another task. It may be obsolete for your problem, but that doesn't make it obsolete generally 👍

1

skelly0311 t1_is8d9xa wrote

ELECTRA, which is a transformer variant of BERT, uses a GAN-style generator/discriminator setup in the pre-training phase in order to get rid of the [MASK]-token discrepancy found in transformers such as BERT and RoBERTa.

https://arxiv.org/abs/2003.10555
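
A toy illustration of ELECTRA's replaced-token-detection objective (dummy dimensions and components, nothing like the real implementation):

```python
import torch
import torch.nn.functional as F

vocab_size, hidden = 100, 32
embed = torch.nn.Embedding(vocab_size, hidden)
generator_head = torch.nn.Linear(hidden, vocab_size)    # small MLM "generator"
discriminator_head = torch.nn.Linear(hidden, 1)         # token-level "was this replaced?" classifier

tokens = torch.randint(0, vocab_size, (4, 16))          # dummy batch of token ids
mask = torch.rand(tokens.shape) < 0.15                  # positions to corrupt

# Generator proposes plausible replacements at the masked positions.
logits = generator_head(embed(tokens))
sampled = torch.distributions.Categorical(logits=logits).sample()
corrupted = torch.where(mask, sampled, tokens)

# Discriminator predicts, for every position, whether it was replaced.
is_replaced = (corrupted != tokens).float()
pred = discriminator_head(embed(corrupted)).squeeze(-1)
disc_loss = F.binary_cross_entropy_with_logits(pred, is_replaced)
```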

1

ats678 t1_issnre9 wrote

From an applied research perspective, there's still a lot of work done with GANs or reusing some of the concepts of adversarial learning (I believe some diffusion models actually use a type of adversarial loss during training). Although diffusion models have been shown to perform extremely well on various tasks, there's still a lot of work to be done to make them usable in practical contexts: first of all, the hardware requirements to train them are extremely expensive (Stable Diffusion, for instance, used 256 GPUs to train the model), and they are also extremely large to deploy for inference. These are all factors that in an applied context might make you use a GAN instead of a diffusion model (at least for now; you never know what people will find out in the next couple of months!)

1