parinitarahi committed
Commit a08bc95 · verified · Parent: 8f74cff

Update README.md

Files changed (1): README.md (+44 -0)

README.md CHANGED

Here are some of the optimized configurations we have added:

1. ONNX model for int4 CPU and Mobile: ONNX model for CPU and mobile using int4 quantization via RTN.
2. ONNX model for int4 GPU: ONNX model for CUDA and DML GPU devices using int4 quantization via RTN.
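
Configurations like these can be generated from the base Hugging Face weights with the ONNX Runtime GenAI model builder. A minimal sketch for the int4 (RTN) CPU variant, assuming the builder supports Phi-4-mini; the output directory name is illustrative:

```bash
# Sketch: build an int4 (RTN-quantized) CPU model from the base weights.
# The output folder name below is illustrative, not part of this repo.
pip install onnxruntime-genai
python -m onnxruntime_genai.models.builder \
    -m microsoft/Phi-4-mini-instruct \
    -o ./phi-4-mini-instruct-int4-cpu \
    -p int4 \
    -e cpu
```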
 
## Model Run

You can see how to run examples with ORT GenAI [here](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-3-tutorial.md).
 
For CPU:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include cpu_and_mobile/* --local-dir .

# Install the CPU package of ONNX Runtime GenAI
pip install onnxruntime-genai

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4 -e cpu
```
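
Under the hood, phi3-qa.py drives the model through the onnxruntime-genai Python API. A minimal sketch of that loop, assuming onnxruntime-genai >= 0.5 (earlier versions pass input ids via GeneratorParams) and the CPU model path from above; the prompt template is an assumption based on the Phi chat format:

```python
import onnxruntime_genai as og

# Load the int4 CPU model downloaded above
model = og.Model("cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4")
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

# Phi-style chat template (an assumption; check the model's tokenizer config for the exact format)
prompt = "<|user|>What is an ONNX model?<|end|><|assistant|>"

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))

# Stream tokens to stdout as they are generated
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```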

For CUDA:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include gpu/* --local-dir .

# Install the CUDA package of ONNX Runtime GenAI
pip install onnxruntime-genai-cuda

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e cuda
```

For DirectML:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download microsoft/Phi-4-mini-instruct-onnx --include gpu/* --local-dir .

# Install the DirectML package of ONNX Runtime GenAI
pip install onnxruntime-genai-directml

# Please adjust the model directory (-m) accordingly
curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e dml
```
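
If a GPU run fails or silently falls back to CPU, it can help to confirm that the expected execution provider is actually available. A quick check, assuming the standalone onnxruntime package is installed in the same environment:

```bash
# List the execution providers exposed by the installed ONNX Runtime build
python -c "import onnxruntime; print(onnxruntime.get_available_providers())"
```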
66
+
67
+
68
  ## Model Description
69
  - Developed by: Microsoft
70
  - Model type: ONNX