<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Controlled generation

Controlling the outputs generated by diffusion models has long been pursued by the community and is now an active research topic. In many popular diffusion models, subtle changes in the inputs, both images and text prompts, can drastically change the outputs. In an ideal world, we want to be able to control how semantics are preserved and changed.

Most examples of preserving semantics reduce to accurately mapping a change in the input to a change in the output. For example, adding an adjective to a subject in the prompt preserves the entire image and modifies only the changed subject. Likewise, an image variation of a particular subject preserves the subject's pose.

In addition, there are qualities of generated images that we would like to influence beyond semantic preservation. In general, we would like the outputs to be of good quality, to adhere to a particular style, or to be realistic.

This document covers some of the techniques `diffusers` supports to control the generation of diffusion models. Much of this is cutting-edge research and can be quite nuanced. If something needs clarifying or you have a suggestion, don't hesitate to open a discussion on the [forum](https://discuss.huggingface.co/) or a [GitHub issue](https://github.com/huggingface/diffusers/issues).

We provide a high-level explanation of how generation can be controlled, along with a short technical overview. For more in-depth explanations of the techniques, the original papers linked from the pipelines are the best resources.

Choose a technique according to your use case. In many cases, these techniques can be combined. For example, you can combine Textual Inversion with SEGA to provide more semantic guidance to the outputs generated using Textual Inversion.

Unless otherwise mentioned, these techniques work with existing models and don't require their own weights.

1. [Instruct Pix2Pix](#instruct-pix2pix)
2. [Pix2Pix Zero](#pix2pixzero)
3. [Attend and Excite](#attend-and-excite)
4. [Semantic Guidance](#semantic-guidance)
5. [Self-attention Guidance](#self-attention-guidance)
6. [Depth2Image](#depth2image)
7. [MultiDiffusion Panorama](#multidiffusion-panorama)
8. [DreamBooth](#dreambooth)
9. [Textual Inversion](#textual-inversion)
10. [ControlNet](#controlnet)
11. [Prompt Weighting](#prompt-weighting)
12. [Custom Diffusion](#custom-diffusion)
13. [Model Editing](#model-editing)
14. [DiffEdit](#diffedit)
15. [T2I-Adapter](#t2i-adapter)

For convenience, the table below shows which methods are inference-only and which require fine-tuning/training.

| **Method** | **Inference only** | **Requires training /<br> fine-tuning** | **Comments** |
| :-------------------------------------------------: | :----------------: | :-------------------------------------: | :---------------------------------------------------------------------------------------------: |
| [Instruct Pix2Pix](#instruct-pix2pix) | ✅ | ❌ | Can additionally be<br>fine-tuned for better <br>performance on specific <br>edit instructions. |
| [Pix2Pix Zero](#pix2pixzero) | ✅ | ❌ | |
| [Attend and Excite](#attend-and-excite) | ✅ | ❌ | |
| [Semantic Guidance](#semantic-guidance) | ✅ | ❌ | |
| [Self-attention Guidance](#self-attention-guidance) | ✅ | ❌ | |
| [Depth2Image](#depth2image) | ✅ | ❌ | |
| [MultiDiffusion Panorama](#multidiffusion-panorama) | ✅ | ❌ | |
| [DreamBooth](#dreambooth) | ❌ | ✅ | |
| [Textual Inversion](#textual-inversion) | ❌ | ✅ | |
| [ControlNet](#controlnet) | ✅ | ❌ | A ControlNet can be <br>trained/fine-tuned on<br>a custom conditioning. |
| [Prompt Weighting](#prompt-weighting) | ✅ | ❌ | |
| [Custom Diffusion](#custom-diffusion) | ❌ | ✅ | |
| [Model Editing](#model-editing) | ✅ | ❌ | |
| [DiffEdit](#diffedit) | ✅ | ❌ | |
| [T2I-Adapter](#t2i-adapter) | ✅ | ❌ | |

## Instruct Pix2Pix
[Paper](https://arxiv.org/abs/2211.09800)

[Instruct Pix2Pix](../api/pipelines/stable_diffusion/pix2pix) is fine-tuned from Stable Diffusion to support editing input images. It takes an image and a prompt describing an edit as inputs, and it outputs the edited image.

Instruct Pix2Pix has been explicitly trained to work well with [InstructGPT](https://openai.com/blog/instruction-following/)-like prompts.

For more information on how to use it, see [here](../api/pipelines/stable_diffusion/pix2pix).

## Pix2Pix Zero
[Paper](https://arxiv.org/abs/2302.03027)

[Pix2Pix Zero](../api/pipelines/stable_diffusion/pix2pix_zero) lets you modify an image so that one concept or subject is translated into another while the general image semantics are preserved.

The denoising process is guided from one conceptual embedding towards another conceptual embedding. The intermediate latents are optimized during the denoising process to push the attention maps towards reference attention maps. The reference attention maps come from the denoising process of the input image and are used to encourage semantic preservation.

Pix2Pix Zero can be used to edit both synthetic and real images.

- To edit a synthetic image, first generate an image from a caption.
  Next, generate image captions for the concept to be edited and for the new target concept. A model like [Flan-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5) can be used for this purpose. Then, "mean" prompt embeddings for both the source and target concepts are created via the text encoder. Finally, the pix2pix-zero algorithm is used to edit the synthetic image.
- To edit a real image, first generate an image caption using a model like [BLIP](https://huggingface.co/docs/transformers/model_doc/blip). Then apply DDIM inversion to the prompt and image to generate "inverse" latents. As before, "mean" prompt embeddings for both the source and target concepts are created, and finally the pix2pix-zero algorithm, combined with the "inverse" latents, is used to edit the image.

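
The "mean" prompt embeddings mentioned in both workflows can be illustrated with a minimal sketch. It assumes each caption has already been encoded into a fixed-length vector (here, toy three-dimensional lists) and that the edit direction is the difference between the mean target embedding and the mean source embedding. The function names `mean_embedding` and `edit_direction` are hypothetical and are not part of the `diffusers` API.

```python
from statistics import fmean


def mean_embedding(embeddings):
    """Average a list of equal-length embedding vectors element-wise."""
    return [fmean(column) for column in zip(*embeddings)]


def edit_direction(source_embeddings, target_embeddings):
    """Vector pointing from the mean source embedding to the mean target embedding."""
    source_mean = mean_embedding(source_embeddings)
    target_mean = mean_embedding(target_embeddings)
    return [t - s for s, t in zip(source_mean, target_mean)]


# Toy stand-ins for text-encoder outputs (assumption: real embeddings are much larger).
cat_caption_embeddings = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
dog_caption_embeddings = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = edit_direction(cat_caption_embeddings, dog_caption_embeddings)
print(direction)  # [-2.0, 2.0, 0.0]
```

Averaging over many captions per concept is meant to cancel out caption-specific details, so that the remaining difference mostly reflects the change of concept.
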
<Tip>
Pix2Pix Zero is the first model that allows "zero-shot" image editing.
This means that the model can edit an image in less than a minute on a consumer GPU, as shown [here](../api/pipelines/stable_diffusion/pix2pix_zero#usage-example).
</Tip>

As mentioned above, Pix2Pix Zero optimizes the latents (and not the UNet, the VAE, or the text encoder) to steer the generation toward a specific concept. This means that the overall pipeline might require more memory than a standard [StableDiffusionPipeline](../api/pipelines/stable_diffusion/text2img).

For more information on how to use it, see [here](../api/pipelines/stable_diffusion/pix2pix_zero).

## Attend and Excite
[Paper](https://arxiv.org/abs/2301.13826)

[Attend and Excite](../api/pipelines/stable_diffusion/attend_and_excite) allows subjects in the prompt to be faithfully represented in the final image.

A set of token indices is given as input, corresponding to the subjects in the prompt that need to be present in the image. During denoising, each token index is guaranteed to reach a minimum attention threshold for at least one patch of the image. The intermediate latents are iteratively optimized during the denoising process to strengthen the attention of the most neglected subject token, until the attention threshold is passed for all subject tokens.
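
The check on the "most neglected" token can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: it assumes the cross-attention values for each subject token are already available as a flat list with one value per image patch, and the names `most_neglected_token` and `threshold` are hypothetical.

```python
def most_neglected_token(attention_maps, threshold=0.5):
    """Find the subject token whose strongest patch attention is weakest.

    attention_maps maps a token index to a list of per-patch attention values.
    Returns (token_index, loss, all_passed), where loss shrinks as the
    neglected token's peak attention grows.
    """
    peak = {token: max(values) for token, values in attention_maps.items()}
    neglected = min(peak, key=peak.get)
    loss = max(0.0, 1.0 - peak[neglected])
    all_passed = all(value >= threshold for value in peak.values())
    return neglected, loss, all_passed


# Token 2 attends strongly to one patch; token 5 is barely attended anywhere.
maps = {2: [0.125, 0.75, 0.25], 5: [0.0625, 0.25, 0.125]}
print(most_neglected_token(maps))  # (5, 0.75, False)
```

In the actual method, a loss of this kind is differentiated with respect to the latents, which are then updated; the sketch only shows which token would drive that update and when the loop could stop.
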

Like Pix2Pix Zero, Attend and Excite also involves a mini optimization loop in its pipeline (leaving the pre-trained weights untouched) and can require more memory than the usual `StableDiffusionPipeline`.

For more information on how to use it, see [here](../api/pipelines/stable_diffusion/attend_and_excite).

## Semantic Guidance (SEGA)
[Paper](https://arxiv.org/abs/2301.12247)

Semantic Guidance (SEGA) allows applying or removing one or more concepts from an image. The strength of each concept can also be controlled. For example, the smile concept can be used to incrementally increase or decrease the smile in a portrait.

Similar to how classifier-free guidance provides guidance via empty prompt inputs, SEGA provides guidance on conceptual prompts. Several of these conceptual prompts can be applied simultaneously. Each conceptual prompt can either add or remove its concept, depending on whether the guidance is applied positively or negatively.

Unlike Pix2Pix Zero or Attend and Excite, SEGA interacts directly with the diffusion process instead of performing any explicit gradient-based optimization.
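
The analogy with classifier-free guidance can be made concrete with a small sketch. It assumes the noise predictions for the empty prompt, the text prompt, and each conceptual prompt are already available as plain lists of floats, and it combines them linearly. The function name `guided_noise` and its parameters are hypothetical, and the published method includes further refinements that are omitted here.

```python
def guided_noise(uncond, cond, concepts, guidance_scale=7.5):
    """Combine noise predictions in the style of classifier-free guidance.

    concepts is a list of (concept_prediction, edit_scale) pairs; a positive
    edit_scale pushes towards the concept, a negative one pushes away from it.
    """
    combined = []
    for i, (u, c) in enumerate(zip(uncond, cond)):
        value = u + guidance_scale * (c - u)  # standard classifier-free guidance
        for concept_prediction, edit_scale in concepts:
            value += edit_scale * (concept_prediction[i] - u)  # concept term
        combined.append(value)
    return combined


uncond = [0.0, 1.0]   # prediction for the empty prompt
cond = [1.0, 1.0]     # prediction for the text prompt
smile = [0.0, 2.0]    # prediction for a concept to add
glasses = [0.5, 1.0]  # prediction for a concept to remove

result = guided_noise(uncond, cond, [(smile, 1.0), (glasses, -2.0)], guidance_scale=2.0)
print(result)  # [1.0, 2.0]
```

Because the concept terms are simply added to the prediction at each step, no gradients with respect to the latents are needed, which matches the contrast with Pix2Pix Zero and Attend and Excite drawn above.
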

For more information on how to use it, see [here](../api/pipelines/semantic_stable_diffusion).

## Self-attention Guidance (SAG)
[Paper](https://arxiv.org/abs/2210.00939)

[Self-attention Guidance](../api/pipelines/stable_diffusion/self_attention_guidance) improves the general quality of images.

SAG provides guidance from predictions that are not conditioned on high-frequency details towards fully conditioned images. The high-frequency details are extracted from the UNet self-attention maps.

For more information on how to use it, see [here](../api/pipelines/stable_diffusion/self_attention_guidance).

## Depth2Image
[Project](https://huggingface.co/stabilityai/stable-diffusion-2-depth)

[Depth2Image](../api/pipelines/stable_diffusion_2#depthtoimage) is fine-tuned from Stable Diffusion to better preserve semantics for text-guided image variation.

It conditions on a monocular depth estimate of the original image.
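
One way to picture depth conditioning is as an extra input channel alongside the image latents. The sketch below is an illustration under stated assumptions, not the pipeline's code: it assumes the depth estimate has already been resized to the latent resolution, rescales it linearly to `[-1, 1]`, and appends it to a list of latent channels. The helper names `normalize_depth` and `add_depth_channel`, and the use of four latent channels, are hypothetical choices for the example.

```python
def normalize_depth(depth_map):
    """Rescale a 2D depth map (list of rows) linearly to the range [-1, 1]."""
    flat = [value for row in depth_map for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        # A constant depth map carries no structure; map it to zeros.
        return [[0.0 for _ in row] for row in depth_map]
    return [[2.0 * (value - low) / (high - low) - 1.0 for value in row] for row in depth_map]


def add_depth_channel(latent_channels, depth_map):
    """Return the latent channels with the normalized depth map appended as one more channel."""
    return latent_channels + [normalize_depth(depth_map)]


latents = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)]  # 4 toy channels of size 2x2
depth = [[0.0, 5.0], [10.0, 5.0]]

model_input = add_depth_channel(latents, depth)
print(len(model_input))  # 5
print(model_input[-1])   # [[-1.0, 0.0], [1.0, 0.0]]
```
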

For more information on how to use it, see [here](../api/pipelines/stable_diffusion_2#depthtoimage).

<Tip>
An important distinction between methods like InstructPix2Pix and Pix2Pix Zero is that the former involves fine-tuning the pre-trained weights while the latter does not. This means that you can apply Pix2Pix Zero to any of the available Stable Diffusion models.
</Tip>
## MultiDiffusion Panorama
[Paper](https://arxiv.org/abs/2302.08113)

MultiDiffusion defines a new generation process over a pre-trained diffusion model. This process binds together multiple diffusion generation methods that can be readily applied to generate high-quality and diverse images. The results adhere to user-provided controls, such as a desired aspect ratio (e.g., panorama) and spatial guiding signals ranging from tight segmentation masks to bounding boxes.

[MultiDiffusion Panorama](../api/pipelines/stable_diffusion/panorama) lets you generate high-quality images at arbitrary aspect ratios (e.g., panoramas).
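
The idea of binding several generation passes together can be sketched in one dimension. The example assumes a wide latent is covered by overlapping fixed-size windows, each window is denoised separately, and overlapping positions are averaged. The functions `window_starts` and `fuse_windows`, and the stand-in `fake_denoiser`, are hypothetical names for illustration only.

```python
def window_starts(total_width, window, stride):
    """Left edges of overlapping windows that together cover [0, total_width)."""
    last = max(total_width - window, 0)
    starts = list(range(0, last + 1, stride))
    if starts[-1] != last:
        starts.append(last)  # add a final window so the right edge is covered
    return starts


def fuse_windows(total_width, window, stride, denoise_window):
    """Average per-window predictions wherever windows overlap.

    Assumes stride <= window so that every position is covered at least once.
    """
    window = min(window, total_width)
    summed = [0.0] * total_width
    counts = [0] * total_width
    for start in window_starts(total_width, window, stride):
        prediction = denoise_window(start, window)  # one value per position
        for offset, value in enumerate(prediction):
            summed[start + offset] += value
            counts[start + offset] += 1
    return [s / c for s, c in zip(summed, counts)]


def fake_denoiser(start, size):
    """Stand-in for a diffusion model: predicts the window's start index everywhere."""
    return [float(start)] * size


fused = fuse_windows(total_width=6, window=4, stride=2, denoise_window=fake_denoiser)
print(fused)  # [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
```

Positions covered by two windows receive the average of both predictions, which is what keeps neighbouring windows consistent across their seams.
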

For more information on how to use it to generate panorama images, see [here](../api/pipelines/stable_diffusion/panorama).

## Fine-tuning your own models

In addition to pre-trained models, Diffusers has training scripts for fine-tuning models on user-provided data.

## DreamBooth

[DreamBooth](../training/dreambooth) fine-tunes a model to teach it about a new subject. For example, a few pictures of a person can be used to generate images of that person in different styles.

For more information on how to use it, see [here](../training/dreambooth).

## Textual Inversion

[Textual Inversion](../training/text_inversion) fine-tunes a model to teach it about a new concept. For example, a few pictures of artwork in a particular style can be used to generate images in that style.

For more information on how to use it, see [here](../training/text_inversion).

## ControlNet
[Paper](https://arxiv.org/abs/2302.05543)

[ControlNet](../api/pipelines/stable_diffusion/controlnet) is an auxiliary network that adds an extra condition.

There are 8 canonical pre-trained ControlNets trained on different conditionings such as edge detection, scribbles, depth maps, and semantic segmentations.

For more information on how to use it, see [here](../api/pipelines/stable_diffusion/controlnet).

## Prompt Weighting

Prompt weighting is a simple technique that puts more attention weight on certain parts of the text input.
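
As a rough illustration of the idea, the sketch below scales each token's embedding by a per-token weight before the embeddings would be passed to the model. This is only one simple way to realize weighting and is an assumption made for the example; actual prompt-weighting tools may combine embeddings differently. The function name `apply_token_weights` is hypothetical.

```python
def apply_token_weights(token_embeddings, weights):
    """Scale each token embedding by its weight (1.0 leaves a token unchanged)."""
    if len(token_embeddings) != len(weights):
        raise ValueError("expected exactly one weight per token embedding")
    return [
        [weight * value for value in embedding]
        for embedding, weight in zip(token_embeddings, weights)
    ]


# Two toy token embeddings; the second token is emphasized by a factor of 1.5.
embeddings = [[1.0, 2.0], [3.0, 4.0]]
weighted = apply_token_weights(embeddings, [1.0, 1.5])
print(weighted)  # [[1.0, 2.0], [4.5, 6.0]]
```
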

For a detailed explanation and examples, see [here](../using-diffusers/weighted_prompts).

## Custom Diffusion

[Custom Diffusion](../training/custom_diffusion) fine-tunes only the cross-attention maps of a pre-trained text-to-image diffusion model.
It also allows Textual Inversion to be performed in addition. It supports multi-concept training by design.
Like DreamBooth and Textual Inversion, Custom Diffusion is used to teach a pre-trained text-to-image diffusion model about new concepts so that it generates outputs involving the concepts of interest.

For a detailed explanation, see the [official documentation](../training/custom_diffusion).

## Model Editing
[Paper](https://arxiv.org/abs/2303.08084)

The [text-to-image model editing pipeline](../api/pipelines/model_editing) helps you mitigate some of the incorrect implicit assumptions a pre-trained text-to-image diffusion model might make about the subjects present in the input prompt.
For example, if you prompt Stable Diffusion to generate images for "A pack of roses", the roses in the generated images are more likely to be red. This pipeline helps you change that assumption.

For a detailed explanation, see the [official documentation](../api/pipelines/model_editing).

## DiffEdit
[Paper](https://arxiv.org/abs/2210.11427)

[DiffEdit](../api/pipelines/diffedit) allows semantic editing of input images guided by input prompts while preserving the original input images as much as possible.

For a detailed explanation, see the [official documentation](../api/pipelines/diffedit).

## T2I-Adapter
[Paper](https://arxiv.org/abs/2302.08453)

[T2I-Adapter](../api/pipelines/stable_diffusion/adapter) is an auxiliary network that adds an extra condition.
There are 8 canonical pre-trained adapters trained on different conditionings such as edge detection, sketch, depth maps, and semantic segmentations.

For information on how to use it, see the [official documentation](../api/pipelines/stable_diffusion/adapter).