bloc97 t1_ivpzu4j wrote on November 9, 2022 at 7:38 PM

Theoretically, the upper bound on the number of distinct images grows exponentially with the number of bits required to encode each latent. A 64x64x4 latent with each element stored as a 32-bit number gives (2^32)^(64x64x4) = 2^524288 possible latents, and so at most that many images. However, many of those combinations would not be considered "images" (they are "out of distribution"), so the real number might be much, much smaller than this, depending on the dataset and the network size.
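
To get a feel for the size of that bound, here is a rough sketch of the arithmetic in Python. The variable names are just mine for illustration, and it assumes every one of the 64x64x4 latent elements is an independent 32-bit value, which is the naive worst case and not what a trained model actually covers.

```python
import math

# Assumed latent layout from the comment above: 64 x 64 x 4 elements,
# each stored as a 32-bit number (names here are illustrative only).
LATENT_SHAPE = (64, 64, 4)
BITS_PER_ELEMENT = 32

num_elements = math.prod(LATENT_SHAPE)        # 64 * 64 * 4
total_bits = num_elements * BITS_PER_ELEMENT  # bits needed for one latent

# The bound is 2**total_bits distinct bit patterns. Printing that integer
# is impractical, so count its decimal digits via logarithms instead.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

print(f"latent elements: {num_elements}")
print(f"bits per latent: {total_bits}")
print(f"upper bound: 2^{total_bits}, roughly a {decimal_digits}-digit number")
```

This only counts bit patterns, so it says nothing about how many of them decode to something that looks like an image.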

[deleted] OP t1_ivqfqvz wrote on November 9, 2022 at 9:19 PM

[deleted]

bloc97 t1_ivqgf0q wrote on November 9, 2022 at 9:24 PM

I was considering an unconditional latent diffusion model, but for conditional models the computation becomes much more complex (we might have to use Bayes' rule here). If we use Score-Based Generative Modeling (https://arxiv.org/abs/2011.13456), we could try to find and count all the unique local minima and saddle points, but it is not clear how we can do this...

Professional-Ebb4970 t1_ivr476b wrote on November 10, 2022 at 12:08 AM

You don't need to use a single seed for the noise patch; you can use random numbers and it will work just fine.
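
A minimal sketch of what I mean, using only Python's standard library. The `sample_noise` helper and the 4x64x64 shape are assumptions for illustration, not any particular library's API; a real pipeline would draw a tensor from its own framework's generator.

```python
import random

# Hypothetical latent shape (channels, height, width), for illustration only.
LATENT_SHAPE = (4, 64, 64)

def sample_noise(seed=None):
    """Draw a flat list of standard-normal values for one noise patch.

    With seed=None the generator is seeded from OS entropy (or the clock),
    so every call gives a different patch. A fixed seed gives a repeatable one.
    """
    rng = random.Random(seed)
    count = LATENT_SHAPE[0] * LATENT_SHAPE[1] * LATENT_SHAPE[2]
    return [rng.gauss(0.0, 1.0) for _ in range(count)]

# Same seed twice: identical starting noise, so the same sample each run.
fixed_a = sample_noise(seed=42)
fixed_b = sample_noise(seed=42)

# No seed: fresh noise on every call, so a different starting point each time.
fresh_a = sample_noise()
fresh_b = sample_noise()

print("fixed seeds match:", fixed_a == fixed_b)
print("unseeded draws match:", fresh_a == fresh_b)
```

Fixing the seed is only useful when you want to reproduce a specific output; sampling works either way.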