[D] My embarrassing trouble with inverting a GAN generator. Do GAN questions still get answered? ;-)
Submitted by _Ruffy_ t3_yyxsxv in MachineLearning
Hi all!
I'm a fairly experienced machine learning practitioner, but I'm struggling with something that sounds rather easy.
I have a fully trained GAN and want to invert the generator. Details on the GAN are below; in short, it's a fairly simple GAN, no StyleGAN or anything fancy.
So I sample a random latent, pass it through the generator, and get a fake image. I then compute a metric comparing the reference image (the image I want a z for) with the fake image. I backprop this metric value to get a gradient on the latent, which I then update with an optimizer. Sounds easy enough, and "my code works" ™.
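For concreteness, this is roughly what my loop looks like (a stripped-down sketch; `G` is the trained generator, `ref` is the reference image tensor in the generator's output range, and the sizes are placeholders):

```python
import torch
import torch.nn.functional as F

def invert(G, ref, latent_dim=256, steps=1000, lr=0.05):
    # G: trained generator (frozen); ref: reference image, shape (1, 3, 256, 256)
    G.eval()
    for p in G.parameters():
        p.requires_grad_(False)        # only the latent gets gradients

    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        fake = G(z)                    # generate from the current latent
        loss = F.mse_loss(fake, ref)   # L2 metric (swap in L1 / LPIPS here)
        loss.backward()                # gradient w.r.t. z only
        opt.step()
    return z.detach()
```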
The problem is that no matter which combination of metric and optimizer from the list below I try, the fake image does not converge to anything near the reference. Yes, it changes a little from the initial one, but the optimization grinds to a halt fairly quickly.
For metrics I tried L1 and L2 distance as well as LPIPS with a VGG backbone. For optimizers I tried SGD, SGD with momentum, and Adam, also playing around with their hyperparameters a bit.
One more thing I tried: generating 1000 random latents and selecting the one that minimizes the metric as the starting point, to rule out a bad initial latent breaking the method.
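That best-of-1000 initialization looks roughly like this (again a sketch with placeholder names, using L2 as the selection metric); the returned latent is then used as the starting point for the optimization loop above:

```python
import torch

@torch.no_grad()
def best_of_n_init(G, ref, n=1000, latent_dim=256, batch=50):
    # Sample n random latents and keep the one whose image is closest to ref.
    best_z, best_val = None, float("inf")
    for _ in range(n // batch):
        z = torch.randn(batch, latent_dim)
        fakes = G(z)
        vals = ((fakes - ref) ** 2).flatten(1).mean(dim=1)   # per-sample L2
        idx = vals.argmin()
        if vals[idx].item() < best_val:
            best_val = vals[idx].item()
            best_z = z[idx:idx + 1].clone()
    return best_z
```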
I then looked into the literature and found this survey on GAN inversion, where Table 1 points me to this work by Creswell et al., who use a different metric / error; see their Algorithm 1. But when I try to implement that, the value quickly becomes NaN (even though I add a small epsilon inside the log terms).
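For what it's worth, this is roughly how I implemented their error (a sketch; I'm assuming Algorithm 1's error is a pixel-wise cross-entropy between the reference and the generated image, so maybe the NaN just means I'm misreading it):

```python
import torch

def creswell_style_error(fake, ref, eps=1e-8):
    # Assumes fake and ref are in [0, 1]. If the generator actually outputs
    # tanh-range values in [-1, 1], then (1 - fake) can be negative and the
    # log goes NaN no matter how small eps is.
    return -(ref * (fake + eps).log()
             + (1 - ref) * (1 - fake + eps).log()).mean()
```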
I am at a bit of a loss here. What is the standard way of doing this? I feel like I'm overlooking something obvious. Any hints/links/papers greatly appreciated!
GAN details: I trained using the code from https://github.com/lucidrains/lightweight-gan, image size is 256, attn-res-layers is [32,64], disc_output_size is 5 and I trained with AMP.
autoencoder t1_iwwyabp wrote
> the value quickly gets NaN
Sounds like a numerical issue. That could be caused by too high a learning rate.
What does your reconstruction error look like across iterations? If it jumps all over the place (or even increases a lot), the learning rate is too high and the update steps repeatedly overshoot their targets.
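Something like this makes it easy to eyeball (just a sketch; `losses_by_lr` is assumed to be a dict of learning rate → per-iteration loss values you record inside your inversion loop):

```python
import matplotlib.pyplot as plt

def plot_loss_curves(losses_by_lr):
    # losses_by_lr: {learning_rate: [loss at step 0, loss at step 1, ...]}
    for lr, losses in losses_by_lr.items():
        plt.plot(losses, label=f"lr={lr}")
    plt.xlabel("iteration")
    plt.ylabel("reconstruction error")
    plt.yscale("log")   # spikes and divergence stand out on a log axis
    plt.legend()
    plt.show()
```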