Submitted by ThoughtOk5558 t3_xvcman in deeplearning
saw79 t1_ir13tsx wrote
In addition to the other commenter's [good] point about your nebulous "visual quality" idea, a couple of other comments on what you're seeing:
-
Frankly, your generative model doesn't seem very good. If your generated samples don't look anything like CIFAR images, I would stop here. Your model's p(x) is clearly very different from CIFAR's p(x).
-
Why are "standard"/discriminative models' confidence scores high? This is a hugely important drawback of discriminative models and one reason why generative models are interesting in the first place. Discriminative models model p(y|x) (class given data) but don't know anything about p(x). Generative models model p(x, y) = p(y|x) p(x); i.e., they generally have access to the prior p(x) and can assess whether an image x can even be understood by the model in the first place. These types of models would (hopefully, if done correctly) give low confidence on "crappy" images.
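As a toy illustration of that difference (2-D Gaussians, not CIFAR; everything below is a made-up example, not your setup): the discriminative quantity p(y|x) always normalizes to 1 and can look very confident on junk, while a model that also has p(x) can tell you the input is implausible in the first place.

```python
# Toy example: two 2-D Gaussian classes with known class-conditionals.
# p(y|x) is what a discriminative model outputs; p(x) is the extra piece
# a generative model has access to.
import numpy as np
from scipy.stats import multivariate_normal

mu = {0: np.array([-2.0, 0.0]), 1: np.array([2.0, 0.0])}
cov = np.eye(2)
prior = {0: 0.5, 1: 0.5}

def p_x_given_y(x, y):
    return multivariate_normal(mean=mu[y], cov=cov).pdf(x)

def p_x(x):                      # marginal density of the data
    return sum(p_x_given_y(x, y) * prior[y] for y in (0, 1))

def p_y_given_x(x):              # what a discriminative model would report
    joint = np.array([p_x_given_y(x, y) * prior[y] for y in (0, 1)])
    return joint / joint.sum()

x_in  = np.array([2.1, 0.2])     # looks like class 1
x_out = np.array([8.0, 8.0])     # looks like nothing in the training data

for name, x in [("in-dist", x_in), ("out-of-dist", x_out)]:
    print(name, "p(y|x) =", np.round(p_y_given_x(x), 3), " p(x) =", p_x(x))
# p(y|x) is near-certain in both cases; only p(x) reveals that the second
# input is essentially impossible under the data distribution.
```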
ThoughtOk5558 OP t1_ir17ijo wrote
I intentionally generated "bad" samples by doing only a few steps of MCMC sampling. I am also able to generate CIFAR-10-looking samples.
I think your explanation is convincing.
Thank you.
BrotherAmazing t1_ir3dmwz wrote
Nearly every data-driven approach to regression and purely discriminative classification has this problem, and it’s a problem of trying to extrapolate far outside the domain that you trained/fit the model in. It’s not about anything else.
Your generated images clearly look nothing like CIFAR-10 training images, so it's not much different from fitting a two-class Gaussian classifier to 2-D data using samples that all lie within the sphere of radius 1, then sending a 2-D feature measurement that is a distance of 100 from the origin into that classifier. Any discriminative classifier that doesn't have a way to detect outliers/anomalies will likely be extremely confident in classifying this 2-D feature as one of the two classes. We would not say that the classifier has a problem with not considering "feature quality"; we would say it's not very sophisticated.
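This is trivial to reproduce; here's a rough sketch with scikit-learn (toy data and illustrative names only):

```python
# Fit a discriminative classifier on 2-D data that all lies within the unit
# disk, then query a point at distance 100 from the origin.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_in_disk(center, n):
    # Rejection-sample points around `center` that stay inside the unit disk.
    pts = []
    while len(pts) < n:
        p = center + 0.3 * rng.standard_normal(2)
        if np.linalg.norm(p) < 1.0:
            pts.append(p)
    return np.array(pts)

n = 500
X0 = sample_in_disk(np.array([-0.5, 0.0]), n)
X1 = sample_in_disk(np.array([ 0.5, 0.0]), n)
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

clf = LogisticRegression().fit(X, y)

x_far = np.array([[100.0, 0.0]])    # far outside the training domain
print(clf.predict_proba(x_far))     # typically ~[0, 1]: near-certain
# The model extrapolates its decision boundary and reports very high
# confidence on a point it has never seen anything remotely like.
```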
In real-world, critical problems, CNNs aren't just fed images like this. Smart engineers have ways to detect when an image is likely outside the training distribution and throw a flag so that the CNN's output isn't trusted.
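Even something as crude as the following captures the idea (purely an illustrative sketch, not how any particular production system does it): fit a Gaussian to training features and refuse to trust predictions on inputs whose Mahalanobis distance is beyond anything seen during training.

```python
# Hypothetical out-of-distribution "flag" in front of a classifier.
import numpy as np

def fit_gaussian(features):                  # features: (N, D) training set
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis(x, mu, cov_inv):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate a threshold from the training data itself (e.g., 99th percentile).
train_feats = np.random.default_rng(0).standard_normal((1000, 16))
mu, cov_inv = fit_gaussian(train_feats)
dists = [mahalanobis(f, mu, cov_inv) for f in train_feats]
threshold = np.percentile(dists, 99)

def trusted_prediction(x, predict_fn):
    if mahalanobis(x, mu, cov_inv) > threshold:
        return None                          # flag: don't trust the model here
    return predict_fn(x)
```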
saw79 t1_ir1a2tb wrote
Oh ok cool. Is your code anywhere? What kind of energy model? I have experience with other types of deep generative models but actually am just starting to learn about EBMs myself recently.
ThoughtOk5558 OP t1_ir1au40 wrote
https://github.com/wgrathwohl/JEM
I am using this EBM with a slight modification (during sampling).
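For anyone following along: JEM-style sampling is short-run SGLD on the model's energy, so cutting the number of steps is what produces the "bad" samples mentioned above. Here's a rough, hypothetical sketch of that (illustrative names and hyperparameters, not the actual repo code):

```python
# Illustrative SGLD sampler for a JEM-style EBM (not the repo's actual code).
import torch

def sgld_sample(energy_fn, shape, n_steps=20, step_size=1.0, noise_std=0.01):
    x = torch.rand(shape) * 2 - 1              # start from noise in [-1, 1]
    for _ in range(n_steps):
        x.requires_grad_(True)
        e = energy_fn(x).sum()
        grad = torch.autograd.grad(e, x)[0]
        # Langevin step: descend the energy, plus Gaussian noise.
        x = x.detach() - step_size * grad + noise_std * torch.randn_like(x)
        x = x.clamp(-1, 1)
    return x.detach()

# Few steps -> samples far from the model's p(x); many steps -> closer to it.
# bad  = sgld_sample(model_energy, (64, 3, 32, 32), n_steps=5)
# good = sgld_sample(model_energy, (64, 3, 32, 32), n_steps=200)
```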
XecutionStyle t1_ir20qoc wrote
I don't think it's nebulous. We infuse knowledge, biases, priors, etc. (e.g., physics in Lagrangian networks) all the time. I was just addressing his last point: there's no analytical solution for quality that we can use as labels.
Networks can learn the semantic difference between pretty and ugly, but only with tons of data.
saw79 t1_ir23cl5 wrote
All I meant by nebulous was that he didn't have a concrete idea for what to actually use as visual quality, and you've nicely described how it's actually a very deep inference that we as humans make with our relatively advanced brains.
I did not mean that it's conceptually something that can't exist. I think we're very much in agreement.