Spaces:
Runtime error
Runtime error
## Stable Diffusion Pipeline | |
This is probably the best end-to-end semi-technical article: | |
<https://stable-diffusion-art.com/how-stable-diffusion-work/> | |
And a detailed look at diffusion process: | |
<https://towardsdatascience.com/understanding-diffusion-probabilistic-models-dpms-1940329d6048> | |
But this is a short look at the pipeline: | |
1. Encoder / Conditioning | |
Text (via tokenizer) or image (via vision model) to semantic map | |
(e.g CLiP text encoder) | |
2. Sampler | |
Generate noise which is starting point to map to content | |
(e.g. k_lms) | |
3. Diffuser | |
Create vector content based on resolved noise + semantic map | |
(e.g. actual stable diffusion checkpoint) | |
4. Autoencoder | |
Maps between latent and pixel space (actually creates images from vectors) | |
(e.g. typically some image-database trained GAN) | |
5. Denoising | |
Get meaningful images from pixel signatures | |
Basically, blends what autoencoder inserted using information from diffuser | |
(e.g. U-NET) | |
6. Loop and repeat | |
From step#3 with cross-attention to blend results | |
7. Run additional models as needed | |
- Upscale (e.g. ESRGAN) | |
- Resore Face (e.g. GFPGAN or CodeFormer) | |