<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Export to ONNX [[export-to-onnx]]

Deploying πŸ€— Transformers models in production environments often requires, or can benefit from, exporting the models into a serialized format that can be loaded and executed on specialized runtimes and hardware.

πŸ€— Optimum is an extension of Transformers that enables exporting models from PyTorch or TensorFlow to serialized formats such as ONNX and TFLite through its `exporters` module. πŸ€— Optimum also provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency.

This guide demonstrates how to export πŸ€— Transformers models to ONNX with πŸ€— Optimum. For the guide on exporting models to TFLite, please refer to the [Export to TFLite page](tflite).

## Export to ONNX [[export-to-onnx]]

[ONNX (Open Neural Network eXchange)](http://onnx.ai) is an open standard that defines a common set of operators and a common file format to represent deep learning models in a wide variety of frameworks, including PyTorch and TensorFlow. When a model is exported to the ONNX format, these operators are used to construct a computational graph (often called an _intermediate representation_) which represents the flow of data through the neural network.

By exposing a graph with standardized operators and data types, ONNX makes it easy to switch between frameworks. For example, a model trained in PyTorch can be exported to ONNX format and then imported in TensorFlow (and vice versa).

Once exported to the ONNX format, a model can be:
- optimized for inference via techniques such as [graph optimization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization) and [quantization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization).
- run with ONNX Runtime via the [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort), which follow the same `AutoModel` API as the one you are used to in πŸ€— Transformers.
- run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines), which share the same API as the [`pipeline`] function in πŸ€— Transformers.

πŸ€— Optimum leverages configuration objects to support the ONNX export. These configuration objects come ready-made for a number of model architectures and are designed to be easily extendable to other architectures.

For the list of ready-made configurations, please refer to the [πŸ€— Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/overview).

There are two ways to export a πŸ€— Transformers model to ONNX; here we show both:

- export with πŸ€— Optimum via the CLI.
- export with πŸ€— Optimum via `optimum.onnxruntime`.

### Exporting a πŸ€— Transformers model to ONNX with the CLI [[exporting-a-transformers-model-to-onnx-with-cli]]

To export a πŸ€— Transformers model to ONNX, first install the extra dependencies:
```bash
pip install optimum[exporters]
```
To check out all available arguments, refer to the [πŸ€— Optimum docs](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help in the command line:
```bash
optimum-cli export onnx --help
```
To export a model's checkpoint from the πŸ€— Hub, for example `distilbert-base-uncased-distilled-squad`, run the following command:
```bash
optimum-cli export onnx --model distilbert-base-uncased-distilled-squad distilbert_base_uncased_squad_onnx/
```
You should see logs indicating the progress and showing where the resulting `model.onnx` is saved, like the following:
```bash
Validating ONNX model distilbert_base_uncased_squad_onnx/model.onnx...
-[βœ“] ONNX model output names match reference model (start_logits, end_logits)
- Validating ONNX Model output "start_logits":
-[βœ“] (2, 16) matches (2, 16)
-[βœ“] all values close (atol: 0.0001)
- Validating ONNX Model output "end_logits":
-[βœ“] (2, 16) matches (2, 16)
-[βœ“] all values close (atol: 0.0001)
The ONNX export succeeded and the exported model was saved at: distilbert_base_uncased_squad_onnx
```
μœ„μ˜ μ˜ˆμ œλŠ” πŸ€— Hubμ—μ„œ 체크포인트λ₯Ό λ‚΄λ³΄λ‚΄λŠ” 것을 μ„€λͺ…ν•©λ‹ˆλ‹€. 둜컬 λͺ¨λΈμ„ 내보낼 λ•Œμ—λŠ” λͺ¨λΈμ˜ κ°€μ€‘μΉ˜μ™€ ν† ν¬λ‚˜μ΄μ € νŒŒμΌμ„ λ™μΌν•œ 디렉토리(`local_path`)에 μ €μž₯ν–ˆλŠ”μ§€ ν™•μΈν•˜μ„Έμš”. CLIλ₯Ό μ‚¬μš©ν•  λ•Œμ—λŠ” πŸ€— Hub의 체크포인트 이름 λŒ€μ‹  `model` μΈμˆ˜μ— `local_path`λ₯Ό μ „λ‹¬ν•˜κ³  `--task` 인수λ₯Ό μ œκ³΅ν•˜μ„Έμš”. μ§€μ›λ˜λŠ” μž‘μ—…μ˜ λͺ©λ‘μ€ [πŸ€— Optimum λ¬Έμ„œ](https://huggingface.co/docs/optimum/exporters/task_manager)λ₯Ό μ°Έμ‘°ν•˜μ„Έμš”. `task` μΈμˆ˜κ°€ μ œκ³΅λ˜μ§€ μ•ŠμœΌλ©΄ μž‘μ—…μ— νŠΉν™”λœ ν—€λ“œ 없이 λͺ¨λΈ μ•„ν‚€ν…μ²˜λ‘œ κΈ°λ³Έ μ„€μ •λ©λ‹ˆλ‹€.
```bash
optimum-cli export onnx --model local_path --task question-answering distilbert_base_uncased_squad_onnx/
```
The resulting `model.onnx` file can then be run on one of the [many accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the ONNX standard. For example, you can load and run the model with [ONNX Runtime](https://onnxruntime.ai/) as follows:
```python
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForQuestionAnswering
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> model = ORTModelForQuestionAnswering.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> inputs = tokenizer("What am I using?", "Using DistilBERT with ONNX Runtime!", return_tensors="pt")
>>> outputs = model(**inputs)
```
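The question-answering model above returns `start_logits` and `end_logits`, and the predicted answer span is typically taken from their argmax positions. A minimal post-processing sketch, using dummy logits in place of the real model outputs above:

```python
import numpy as np

# Dummy stand-ins for outputs.start_logits / outputs.end_logits from the
# ORTModelForQuestionAnswering call above (shape: batch_size x sequence_length).
start_logits = np.array([[0.1, 0.2, 3.5, 0.3, 0.1]])
end_logits = np.array([[0.1, 0.2, 0.3, 4.1, 0.1]])

# Most likely start and end token positions of the answer span.
start_index = int(start_logits.argmax(axis=-1)[0])
end_index = int(end_logits.argmax(axis=-1)[0])

# With real inputs you would decode the span with the tokenizer, e.g.:
# tokenizer.decode(inputs["input_ids"][0][start_index : end_index + 1])
print(start_index, end_index)  # 2 3
```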
The process is identical for TensorFlow checkpoints on the Hub. For instance, here's how you would export a pure TensorFlow checkpoint from the [Keras organization](https://huggingface.co/keras-io):
```bash
optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_squad_onnx/
```
### Exporting a πŸ€— Transformers model to ONNX with `optimum.onnxruntime` [[exporting-a-transformers-model-to-onnx-with-optimumonnxruntime]]

Alternatively to the CLI, you can export a πŸ€— Transformers model to ONNX programmatically with `optimum.onnxruntime` like so:
```python
>>> from optimum.onnxruntime import ORTModelForSequenceClassification
>>> from transformers import AutoTokenizer
>>> model_checkpoint = "distilbert_base_uncased_squad"
>>> save_directory = "onnx/"
>>> # Load a model from transformers and export it to ONNX
>>> ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)
>>> tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
>>> # Save the onnx model and tokenizer
>>> ort_model.save_pretrained(save_directory)
>>> tokenizer.save_pretrained(save_directory)
```
### μ§€μ›λ˜μ§€ μ•ŠλŠ” μ•„ν‚€ν…μ²˜μ˜ λͺ¨λΈ 내보내기 [[exporting-a-model-for-an-unsupported-architecture]]
ν˜„μž¬ 내보낼 수 μ—†λŠ” λͺ¨λΈμ„ μ§€μ›ν•˜κΈ° μœ„ν•΄ κΈ°μ—¬ν•˜λ €λ©΄, λ¨Όμ € [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview)μ—μ„œ μ§€μ›λ˜λŠ”μ§€ ν™•μΈν•œ ν›„ μ§€μ›λ˜μ§€ μ•ŠλŠ” κ²½μš°μ—λŠ” [πŸ€— Optimum에 κΈ°μ—¬](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute)ν•˜μ„Έμš”.
### `transformers.onnx`λ₯Ό μ‚¬μš©ν•˜μ—¬ λͺ¨λΈ 내보내기 [[exporting-a-model-with-transformersonnx]]
<Tip warning={true}>
`transformers.onnx` is no longer maintained; please export models with πŸ€— Optimum as described above. This section will be removed in future versions.
</Tip>
To export a πŸ€— Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint using a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert-base-uncased onnx/
```
μ΄λ ‡κ²Œ ν•˜λ©΄ `--model` μΈμˆ˜μ— μ •μ˜λœ 체크포인트의 ONNX κ·Έλž˜ν”„κ°€ λ‚΄λ³΄λ‚΄μ§‘λ‹ˆλ‹€. πŸ€— Hubμ—μ„œ μ œκ³΅ν•˜λŠ” μ²΄ν¬ν¬μΈνŠΈλ‚˜ λ‘œμ»¬μ— μ €μž₯된 체크포인트λ₯Ό 전달할 수 μžˆμŠ΅λ‹ˆλ‹€. 결과둜 μƒμ„±λœ `model.onnx` νŒŒμΌμ€ ONNX ν‘œμ€€μ„ μ§€μ›ν•˜λŠ” λ§Žμ€ 가속기 쀑 ν•˜λ‚˜μ—μ„œ μ‹€ν–‰ν•  수 μžˆμŠ΅λ‹ˆλ‹€. 예λ₯Ό λ“€μ–΄, λ‹€μŒκ³Ό 같이 ONNX Runtime을 μ‚¬μš©ν•˜μ—¬ λͺ¨λΈμ„ λ‘œλ“œν•˜κ³  μ‹€ν–‰ν•  수 μžˆμŠ΅λ‹ˆλ‹€:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
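`session.run` returns a plain list of NumPy arrays, one per requested output name; `last_hidden_state` has shape `(batch_size, sequence_length, hidden_size)` (768 for DistilBERT). A common next step is mean pooling into a fixed-size sentence embedding — a sketch with dummy arrays standing in for the real session outputs:

```python
import numpy as np

# Dummy stand-ins for outputs[0] (last_hidden_state) and inputs["attention_mask"]
# from the InferenceSession example above; DistilBERT's hidden size is 768.
last_hidden_state = np.random.rand(1, 9, 768).astype(np.float32)
attention_mask = np.ones((1, 9), dtype=np.int64)

# Mean-pool over non-padded tokens to obtain one embedding per sequence.
mask = attention_mask[..., None].astype(np.float32)  # (batch, seq, 1)
embedding = (last_hidden_state * mask).sum(axis=1) / mask.sum(axis=1)
print(embedding.shape)  # (1, 768)
```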
The required output names (like `["last_hidden_state"]`) can be obtained by taking a look at the ONNX configuration of each model. For example, for DistilBERT we have:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
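The same configuration object also lists the input names (and their dynamic axes) that the exported graph expects. A short self-contained check, using the same deprecated `transformers.onnx` configuration classes as above:

```python
from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig

# Build the ONNX export configuration for a default DistilBERT architecture.
config = DistilBertConfig()
onnx_config = DistilBertOnnxConfig(config)

# DistilBERT has no token_type_ids, so the graph takes only these two inputs.
print(list(onnx_config.inputs.keys()))  # ['input_ids', 'attention_mask']
```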
The process is identical for TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint like so:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
λ‘œμ»¬μ— μ €μž₯된 λͺ¨λΈμ„ 내보내렀면 λͺ¨λΈμ˜ κ°€μ€‘μΉ˜ 파일과 ν† ν¬λ‚˜μ΄μ € νŒŒμΌμ„ λ™μΌν•œ 디렉토리에 μ €μž₯ν•œ λ‹€μŒ, transformers.onnx νŒ¨ν‚€μ§€μ˜ --model 인수λ₯Ό μ›ν•˜λŠ” λ””λ ‰ν† λ¦¬λ‘œ μ§€μ •ν•˜μ—¬ ONNX둜 λ‚΄λ³΄λƒ…λ‹ˆλ‹€:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```