Update README.md
README.md
CHANGED
@@ -23,7 +23,7 @@ With Red Hat AI you can,
 - Leverage quantized variants of the leading open source models such as Llama, Mistral, Granite, DeepSeek, Qwen, Gemma, Phi, and many more.
 - Tune smaller, purpose-built models with your own data.
 - Quantize your models with [LLM Compressor](https://github.com/vllm-project/llm-compressor) or use our pre-optimized models on HuggingFace.
-- Optimize inference with [vLLM](https://github.com/vllm-project/vllm) across any hardware and deployment
+- Optimize inference with [vLLM](https://github.com/vllm-project/vllm) across any hardware and deployment scenarios.
 
 We provide accurate model checkpoints compressed with SOTA methods ready to run in vLLM such as W4A16, W8A16, W8A8 (int8 and fp8), and many more!
 If you would like help quantizing a model or have a request for us to add a checkpoint, please open an issue in https://github.com/vllm-project/llm-compressor.
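
For context on the checkpoints the README refers to, here is a minimal sketch of serving one of the pre-quantized models with vLLM. The model name is illustrative only; any compressed checkpoint published on the Hugging Face hub can be substituted.

```python
from vllm import LLM, SamplingParams

# Illustrative checkpoint name (an assumption, not from the diff); substitute
# any pre-optimized W4A16 / W8A16 / W8A8 model from the Hugging Face hub.
llm = LLM(model="RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8")

# Generate a short completion to confirm the compressed weights load and run.
sampling = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What does weight quantization change at inference time?"], sampling)
print(outputs[0].outputs[0].text)
```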
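For the "quantize your models with LLM Compressor" bullet, a hedged sketch of a one-shot W4A16 quantization follows. Import paths and argument names vary between LLM Compressor releases, so treat the dataset, output directory, and calibration settings below as assumptions rather than the project's canonical recipe.

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot  # newer releases may expose oneshot from llmcompressor directly

# One-shot GPTQ quantization to W4A16 (4-bit weights, 16-bit activations),
# skipping the lm_head; model, dataset, and sizes here are illustrative.
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    dataset="open_platypus",
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-W4A16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

The resulting directory can then be passed directly to vLLM's `LLM(model=...)` as in the previous sketch.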