<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Export to ONNX [[export-to-onnx]]

Deploying 🤗 Transformers models in production environments often requires, or can benefit from, exporting the model into a serialized format that can be loaded and executed on specialized runtimes and hardware.

🤗 Optimum is an extension of Transformers that, through its `exporters` module, enables exporting models from PyTorch or TensorFlow to serialized formats such as ONNX and TFLite. 🤗 Optimum also provides a set of performance optimization tools so that models can be trained and run on targeted hardware with maximum efficiency.

This guide demonstrates how to export 🤗 Transformers models to ONNX with 🤗 Optimum. For the guide on exporting models to TFLite, refer to the [Export to TFLite page](tflite).

## Export to ONNX [[export-to-onnx]]

[ONNX (Open Neural Network eXchange)](http://onnx.ai) is an open standard that defines a common set of operators and a common file format to represent deep learning models in a wide variety of frameworks, including PyTorch and TensorFlow. When a model is exported to the ONNX format, these operators are used to construct a computational graph (often called an _intermediate representation_) which represents the flow of data through the neural network.

By exposing a graph with standardized operators and data types, ONNX makes it easy to switch between frameworks. For example, a model trained in PyTorch can be exported to ONNX format and then imported in TensorFlow (and vice versa).
Once exported to ONNX format, a model can be:

- optimized for inference via techniques such as [graph optimization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization) and [quantization](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization).
- run with ONNX Runtime via [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort), which follow the same `AutoModel` API as the one you are used to in 🤗 Transformers.
- run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines), which have the same API as the [`pipeline`] function in 🤗 Transformers.

🤗 Optimum supports the ONNX export by leveraging configuration objects. These configuration objects come ready-made for a number of model architectures, and are designed to be easily extendable to other architectures.

For the list of ready-made configurations, refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/overview).

There are two ways to export a 🤗 Transformers model to ONNX; here we show both:

- export with 🤗 Optimum via the CLI.
- export with 🤗 Optimum via `optimum.onnxruntime`.
### Exporting a 🤗 Transformers model to ONNX with CLI [[exporting-a-transformers-model-to-onnx-with-cli]]

To export a 🤗 Transformers model to ONNX, first install an extra dependency:

```bash
pip install optimum[exporters]
```

To check out all available arguments, refer to the [🤗 Optimum docs](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help in the command line:

```bash
optimum-cli export onnx --help
```

To export a model's checkpoint from the 🤗 Hub, for example `distilbert-base-uncased-distilled-squad`, run the following command:

```bash
optimum-cli export onnx --model distilbert-base-uncased-distilled-squad distilbert_base_uncased_squad_onnx/
```

You should see logs indicating the progress and showing where the resulting `model.onnx` is saved, like this:
```bash
Validating ONNX model distilbert_base_uncased_squad_onnx/model.onnx...
	-[✓] ONNX model output names match reference model (start_logits, end_logits)
	- Validating ONNX Model output "start_logits":
		-[✓] (2, 16) matches (2, 16)
		-[✓] all values close (atol: 0.0001)
	- Validating ONNX Model output "end_logits":
		-[✓] (2, 16) matches (2, 16)
		-[✓] all values close (atol: 0.0001)
The ONNX export succeeded and the exported model was saved at: distilbert_base_uncased_squad_onnx
```
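The shape and "all values close" checks in this log can be reproduced conceptually with plain NumPy. The following is an illustrative sketch, not Optimum's actual validation code: the exporter feeds the same dummy input to both the reference PyTorch model and the exported ONNX model, then compares the outputs within the reported absolute tolerance.

```python
import numpy as np

atol = 1e-4  # the "atol: 0.0001" shown in the log above

# Illustrative stand-ins for the two sets of outputs being compared;
# the real validation uses the model's actual logits.
reference = np.array([[1.20, -0.35, 0.07],
                      [0.55, 2.10, -1.40]])
exported = reference + 5e-6  # tiny numerical drift from the export is expected

shapes_match = exported.shape == reference.shape            # the "(2, 16) matches (2, 16)" check
values_close = np.allclose(reference, exported, atol=atol)  # the "all values close" check
print(shapes_match, values_close)  # True True
```

If either check fails, the export is reported as unsuccessful rather than silently producing a model with diverging outputs.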
The example above illustrates exporting a checkpoint from the 🤗 Hub. When exporting a local model, first make sure that you saved both the model's weights and tokenizer files in the same directory (`local_path`). When using the CLI, pass the `local_path` to the `model` argument instead of the checkpoint name on the 🤗 Hub, and provide the `--task` argument. You can review the list of supported tasks in the [🤗 Optimum docs](https://huggingface.co/docs/optimum/exporters/task_manager). If the `task` argument is not provided, it will default to the model architecture without any task-specific head.

```bash
optimum-cli export onnx --model local_path --task question-answering distilbert_base_uncased_squad_onnx/
```

The resulting `model.onnx` file can then be run on one of the [many accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the ONNX standard. For example, we can load and run the model with [ONNX Runtime](https://onnxruntime.ai/) as follows:
```python
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForQuestionAnswering

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> model = ORTModelForQuestionAnswering.from_pretrained("distilbert_base_uncased_squad_onnx")
>>> inputs = tokenizer("What am I using?", "Using DistilBERT with ONNX Runtime!", return_tensors="pt")
>>> outputs = model(**inputs)
```
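To turn the `start_logits` and `end_logits` in a question-answering output like the one above into an answer, you take the argmax of each and slice the input tokens. The tokens and logit values below are made up for illustration; real values come from the tokenizer and model outputs.

```python
import numpy as np

# Hypothetical tokens and logits standing in for real model outputs
tokens = ["using", "distil", "##bert", "with", "onnx", "runtime", "!"]
start_logits = np.array([0.1, 3.2, 0.4, 0.2, 0.3, 0.1, 0.0])
end_logits = np.array([0.0, 0.2, 2.9, 0.1, 0.5, 0.3, 0.1])

start = int(np.argmax(start_logits))  # most likely answer start position
end = int(np.argmax(end_logits))      # most likely answer end position
answer_tokens = tokens[start : end + 1]
print(answer_tokens)  # ['distil', '##bert']
```

In practice you would use the tokenizer's `decode` method on the corresponding input IDs rather than joining token strings by hand.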
The process is identical for TensorFlow checkpoints on the Hub. For example, here's how you would export a pure TensorFlow checkpoint from the [Keras organization](https://huggingface.co/keras-io):

```bash
optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_squad_onnx/
```
### Exporting a 🤗 Transformers model to ONNX with `optimum.onnxruntime` [[exporting-a-transformers-model-to-onnx-with-optimumonnxruntime]]

As an alternative to the CLI, you can export a 🤗 Transformers model to ONNX programmatically with `optimum.onnxruntime` like so:

```python
>>> from optimum.onnxruntime import ORTModelForSequenceClassification
>>> from transformers import AutoTokenizer

>>> model_checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
>>> save_directory = "onnx/"

>>> # Load a model from transformers and export it to ONNX
>>> ort_model = ORTModelForSequenceClassification.from_pretrained(model_checkpoint, export=True)
>>> tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

>>> # Save the onnx model and tokenizer
>>> ort_model.save_pretrained(save_directory)
>>> tokenizer.save_pretrained(save_directory)
```
### Exporting a model for an unsupported architecture [[exporting-a-model-for-an-unsupported-architecture]]

If you wish to contribute by adding support for a model that cannot currently be exported, first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview), and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.

### Exporting a model with `transformers.onnx` [[exporting-a-model-with-transformersonnx]]

<Tip warning={true}>

`transformers.onnx` is no longer maintained. Please export models with 🤗 Optimum as described above. This section will be removed in a future version.

</Tip>

To export a 🤗 Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:

```bash
pip install transformers[onnx]
```

Use the `transformers.onnx` package as a Python module to export a checkpoint using a ready-made configuration:

```bash
python -m transformers.onnx --model=distilbert-base-uncased onnx/
```

This exports an ONNX graph of the checkpoint defined by the `--model` argument. You can pass any checkpoint on the 🤗 Hub or one that's stored locally. The resulting `model.onnx` file can then be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```

The required output names (like `["last_hidden_state"]`) can be obtained by taking a look at the ONNX configuration of each model. For example, for DistilBERT we have:

```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig

>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```

The process is identical for TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint like so:

```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```

To export a model that's stored locally, save the model's weights and tokenizer files in the same directory, then export it to ONNX by pointing the `--model` argument of the `transformers.onnx` package to the desired directory:

```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```