<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Load adapters with πŸ€— PEFT [[load-adapters-with-peft]]
[[open-in-colab]]
[Parameter-Efficient Fine Tuning (PEFT)](https://huggingface.co/blog/peft) methods freeze the pretrained model parameters during fine-tuning and add a small number of trainable parameters (the adapters) on top of them. The adapters are trained to learn task-specific information. This approach has been shown to be memory-efficient, using relatively little compute while producing results comparable to a fully fine-tuned model.

Adapters trained with PEFT are also usually much smaller than the full model, making them convenient to share, store, and load.
<div class="flex flex-col justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/peft/PEFT-hub-screenshot.png"/>
<figcaption class="text-center">The adapter weights for an OPTForCausalLM model stored on the Hub are only about 6MB, compared to the full model weights, which can be as large as 700MB.</figcaption>
</div>
Check out the [documentation](https://huggingface.co/docs/peft/index) to learn more about the πŸ€— PEFT library.
## Setup [[setup]]
Get started by installing πŸ€— PEFT:
```bash
pip install peft
```
If you want to try the latest features, you may want to install the library from source:
```bash
pip install git+https://github.com/huggingface/peft.git
```
## μ§€μ›λ˜λŠ” PEFT λͺ¨λΈ [[supported-peft-models]]
πŸ€— Transformers natively supports some PEFT methods, meaning you can load adapter weights stored locally or on the Hub and easily run or train them with just a few lines of code. The following methods are supported:
- [Low Rank Adapters](https://huggingface.co/docs/peft/conceptual_guides/lora)
- [IA3](https://huggingface.co/docs/peft/conceptual_guides/ia3)
- [AdaLoRA](https://arxiv.org/abs/2303.10512)
To learn about other πŸ€— PEFT methods, such as prompt learning or prompt tuning, or about the πŸ€— PEFT library in general, please refer to the [documentation](https://huggingface.co/docs/peft/index).
## Load a PEFT adapter [[load-a-peft-adapter]]
To load and use a PEFT adapter model from πŸ€— Transformers, make sure the Hub repository or local directory contains an `adapter_config.json` file and the adapter weights. Then you can load the PEFT adapter model using the `AutoModelFor` class. For example, to load a PEFT adapter model for causal language modeling:

1. Specify the PEFT model ID.
2. Pass it to the [`AutoModelForCausalLM`] class.
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "ybelkada/opt-350m-lora"
model = AutoModelForCausalLM.from_pretrained(peft_model_id)
```
<Tip>
You can load a PEFT adapter with either an `AutoModelFor` class or the base model class, such as `OPTForCausalLM` or `LlamaForCausalLM`.
</Tip>
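Once loaded, the model can be used like any other Transformers model. A minimal generation sketch (this assumes the adapter repository includes tokenizer files; if it doesn't, load the tokenizer from the base model, `facebook/opt-350m`):

```py
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

# tokenize a prompt and generate with the adapted model
inputs = tokenizer("Hello", return_tensors="pt")
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```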
You can also load a PEFT adapter by calling the `load_adapter` method:
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "facebook/opt-350m"
peft_model_id = "ybelkada/opt-350m-lora"
model = AutoModelForCausalLM.from_pretrained(model_id)
model.load_adapter(peft_model_id)
```
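Loading the base model and the adapter in two steps keeps them separate, which is convenient when you want to attach several adapters to the same base model, as shown in the sections below.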
## Load in 8bit or 4bit [[load-in-8bit-or-4bit]]
The `bitsandbytes` integration supports 8bit and 4bit precision data types, which are useful for loading large models while saving memory. Add the `load_in_8bit` or `load_in_4bit` parameter to [`~PreTrainedModel.from_pretrained`] and set `device_map="auto"` to distribute the model effectively across your hardware:
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "ybelkada/opt-350m-lora"
model = AutoModelForCausalLM.from_pretrained(peft_model_id, device_map="auto", load_in_8bit=True)
```
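Loading in 4bit precision works the same way; as a minimal variant of the snippet above, only the flag changes:

```py
model = AutoModelForCausalLM.from_pretrained(peft_model_id, device_map="auto", load_in_4bit=True)
```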
## Add a new adapter [[add-a-new-adapter]]
You can use [`~peft.PeftModel.add_adapter`] to add a new adapter to a model with an existing adapter, as long as the new adapter is the same type as the current one. For example, if you have an existing LoRA adapter attached to a model:
```py
from transformers import AutoModelForCausalLM
from peft import LoraConfig

model_id = "facebook/opt-350m"
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA on the attention projections; random init so the adapter's effect is visible
lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj"],
    init_lora_weights=False,
)

model.add_adapter(lora_config, adapter_name="adapter_1")
```
To attach a new adapter:
```py
# attach new adapter with same config
model.add_adapter(lora_config, adapter_name="adapter_2")
```
Now you can use [`~peft.PeftModel.set_adapter`] to set which adapter the model should use:
```py
# use adapter_1; `tokenizer` and `inputs` are assumed to be set up as in the next section
model.set_adapter("adapter_1")
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))

# use adapter_2
model.set_adapter("adapter_2")
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
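If you want to check which adapter is currently in use, recent versions of the PEFT integration expose an `active_adapters` method on the model (a minimal sketch; the method assumes a recent Transformers release):

```py
# returns the name(s) of the currently active adapter(s), e.g. ["adapter_2"]
print(model.active_adapters())
```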
## Enable and disable adapters [[enable-and-disable-adapters]]
λͺ¨λΈμ— μ–΄λŒ‘ν„°λ₯Ό μΆ”κ°€ν•œ ν›„ μ–΄λŒ‘ν„° λͺ¨λ“ˆμ„ ν™œμ„±ν™” λ˜λŠ” λΉ„ν™œμ„±ν™”ν•  수 μžˆμŠ΅λ‹ˆλ‹€. μ–΄λŒ‘ν„° λͺ¨λ“ˆμ„ ν™œμ„±ν™”ν•˜λ €λ©΄:
```py
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig

model_id = "facebook/opt-350m"
adapter_model_id = "ybelkada/opt-350m-lora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
text = "Hello"
inputs = tokenizer(text, return_tensors="pt")

model = AutoModelForCausalLM.from_pretrained(model_id)
peft_config = PeftConfig.from_pretrained(adapter_model_id)

# initialize the LoRA weights randomly
peft_config.init_lora_weights = False
model.add_adapter(peft_config)
model.enable_adapters()
output = model.generate(**inputs)
```
μ–΄λŒ‘ν„° λͺ¨λ“ˆμ„ λΉ„ν™œμ„±ν™”ν•˜λ €λ©΄:
```py
model.disable_adapters()
output = model.generate(**inputs)
```
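Disabling the adapter runs the bare base model again, which makes it easy to compare outputs with and without the adapter; calling `enable_adapters` re-activates it without reloading anything.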
## Train a PEFT adapter [[train-a-peft-adapter]]
PEFT adapters are supported by the [`Trainer`] class, so you can train an adapter for your specific use case by adding just a few lines of code. For example, to train a LoRA adapter:
<Tip>
If you aren't familiar with fine-tuning a model with [`Trainer`], take a look at the [Fine-tune a pretrained model](training) tutorial.
</Tip>
1. Define your adapter configuration with the task type and hyperparameters. See [`~peft.LoraConfig`] for more details about the hyperparameters.
```py
from peft import LoraConfig
peft_config = LoraConfig(
lora_alpha=16,
lora_dropout=0.1,
r=64,
bias="none",
task_type="CAUSAL_LM",
)
```
2. λͺ¨λΈμ— μ–΄λŒ‘ν„°λ₯Ό μΆ”κ°€ν•©λ‹ˆλ‹€.
```py
model.add_adapter(peft_config)
```
3. Now you can pass the model to [`Trainer`]! A fuller sketch follows this list.
```py
trainer = Trainer(model=model, ...)
trainer.train()
```
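For reference, here is a minimal end-to-end sketch of step 3. The `train_dataset` variable is assumed to be a tokenized dataset you provide, and the output directory and hyperparameter values are illustrative only:

```py
from transformers import Trainer, TrainingArguments

# hypothetical training setup; adjust these values for your use case
training_args = TrainingArguments(
    output_dir="opt-350m-lora",
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,                  # the model with the adapter attached
    args=training_args,
    train_dataset=train_dataset,  # assumed: a tokenized training dataset
)
trainer.train()
```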
To save your trained adapter and load it again:
```py
model.save_pretrained(save_dir)
model = AutoModelForCausalLM.from_pretrained(save_dir)
```
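Because only the adapter parameters were trained, the saved checkpoint contains just the adapter weights and configuration rather than the full model, so it stays small and easy to share.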