Submitted by zergling103 t3_y49i4s in MachineLearning
Diffusion models are trained on image sequences in which the image is progressively corrupted with noise: given image N, add noise to produce image N+1.
The diffusion model learns to reverse the corruption by one step: given image N, predict image N-1.
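For concreteness, here is a minimal NumPy sketch of the standard Gaussian forward process (DDPM-style; the schedule values are illustrative defaults, not from any specific implementation). One property worth noting for the question below: with Gaussian noise, the corrupted image at any step t can be sampled directly from the clean image in closed form, so training never has to iterate through the intermediate steps.

```python
import numpy as np

def make_schedule(T=1000, beta_min=1e-4, beta_max=0.02):
    # Linear variance schedule; returns alpha_bar, the cumulative
    # product of (1 - beta_t), which controls how much signal survives.
    betas = np.linspace(beta_min, beta_max, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    # Jump straight from the clean image x0 to the noisy image x_t:
    #   x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
```

At t near T, alpha_bar is close to zero, so x_t is almost pure noise regardless of x0.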
Could other forms of corruption be used instead of uniform noise? Examples:
- Compression artifacts
- Perlin noise
- Uniform noise applied inconsistently across the image
- Bad camera exposure
- Banding due to low bit depth
- Gaussian blur
- Pixelation, aliasing, or other sampling artifacts
- Motion blur
- Color transformations
- Sequences of corruptions where the choice of degradation is different for each step
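Several of the corruptions above can be written as an operator that takes a clean image and a corruption step t, just like Gaussian noising. Here is a hypothetical sketch for pixelation (the `pixelate` name and parameters are my own, not from any paper): block size grows with t, so t=0 is the identity and t=T collapses the image to its mean.

```python
import numpy as np

def pixelate(x, t, T=1000, max_block=16):
    # Drop-in degradation: block-average the image, with the block
    # size growing as the corruption step t increases.
    block = 1 + int((max_block - 1) * t / T)
    h, w = x.shape
    out = x.copy()
    for i in range(0, h, block):
        for j in range(0, w, block):
            out[i:i+block, j:j+block] = x[i:i+block, j:j+block].mean()
    return out
```

Unlike Gaussian noise, an operator like this is deterministic and destroys information irreversibly, which is part of why the reverse model may behave differently.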
Or more complex examples, perhaps for training a model to change the semantics of an image, or repair incoherent outputs:
- Patches from other images
- Images with incorrect labels blended in
- Scrambling image patches with random transformations
- Sequences of outputs from a GAN produced as its training progressed (but with the same seed)
- DeepDream iterations
Or, in general, any distortion with these properties:
- Cheap to produce or assemble sequences for
- Pushes the image further out of distribution relative to the uncorrupted image dataset for the given prompt
The motivation for asking: if there isn't anything "special" about noise and any drop-in corruption could be used, a diffusion model could:
- Be used for blind image restoration: make an image "better" by human measures without changing it significantly.
- Tweak the content of an image without noise destroying details unnecessarily: make an image match a prompt with minimal changes.
If there is something "special" about noise (e.g. the model or training procedure makes certain assumptions that depend on noise), what is special about noise, and how can diffusion models be modified to handle more general corruptions?
Thanks!
feliximo t1_iscxjm1 wrote
There's already a paper on this; they call it Cold Diffusion.