spring_m t1_irq31mn wrote on October 10, 2022 at 4:17 AM

Another important skill is being able to understand what is truly crucial to a solution versus what is nice to have. For example you mention VAEs which are used to encode the images in latent space for stable diffusion etc. Based on some experiments I've done you don't actually need a VAE - a vanilla convolutional auto-encoder is good enough since you don't actually sample from the latents - the diffusion takes care of that.