IQ3 quants of NousResearch/Nous-Hermes-2-Yi-34B

Created with llama.cpp (9e359a4f), using the default settings of both convert.py and quantize, together with the imatrix provided by ikawrakow.
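
The two-step workflow described above (convert.py, then quantize with an imatrix) can be sketched as a small Python helper that only builds the command lines and does not run them. This is a rough sketch, not the exact commands used for these files: the file names, the `build_commands` helper, and the quantization type string passed in are all hypothetical, and it assumes `convert.py` accepts `--outfile` and `quantize` accepts `--imatrix` as they did in llama.cpp builds from around this period.

```python
from pathlib import Path


def build_commands(model_dir, imatrix_path, quant_type, out_dir="."):
    """Build (but do not run) the convert and quantize command lines.

    All paths and output file names here are hypothetical placeholders.
    """
    out = Path(out_dir)
    # Intermediate GGUF written by convert.py with its default settings.
    unquantized = out / "model-unquantized.gguf"
    # Final quantized file; the naming scheme is an assumption.
    quantized = out / f"model-{quant_type}.gguf"

    convert_cmd = [
        "python", "convert.py", str(model_dir),
        "--outfile", str(unquantized),
    ]
    quantize_cmd = [
        "./quantize",
        "--imatrix", str(imatrix_path),  # importance matrix file
        str(unquantized),
        str(quantized),
        quant_type,                      # e.g. an IQ3 type name
    ]
    return convert_cmd, quantize_cmd


if __name__ == "__main__":
    convert_cmd, quantize_cmd = build_commands(
        "Nous-Hermes-2-Yi-34B", "imatrix.dat", "IQ3_S"
    )
    print(" ".join(convert_cmd))
    print(" ".join(quantize_cmd))
```

The resulting lists can be passed to `subprocess.run(..., check=True)` from inside a llama.cpp checkout. Which IQ3 type name to pass depends on the file being produced; `IQ3_S` above is only an illustration.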

See https://github.com/ggerganov/llama.cpp/pull/5676 for information on the IQ3 quantization.
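
To get a rough feel for what 3-bit quantization means for a model of this size, the arithmetic can be sketched in Python. This is a back-of-the-envelope estimate only: it uses the 34.4B parameter count listed for this model, treats every weight as stored at the same average bits per weight, and ignores metadata and tensors kept at higher precision. The 3.5 bits-per-weight figure is an illustrative assumption, not a value taken from the pull request; the actual IQ3 types may differ.

```python
def estimate_gguf_size_gb(n_params, bits_per_weight):
    """Approximate file size in GB (10^9 bytes) for a uniform bit width."""
    total_bits = n_params * bits_per_weight
    return total_bits / 8 / 1e9


N_PARAMS = 34.4e9  # parameter count listed for this model

if __name__ == "__main__":
    # 3.0 is the nominal "3-bit" figure; 3.5 is a hypothetical average
    # that allows for per-block scales and other overhead.
    for label, bpw in [("nominal 3.0 bpw", 3.0), ("assumed 3.5 bpw", 3.5)]:
        size = estimate_gguf_size_gb(N_PARAMS, bpw)
        print(f"{label}: about {size:.1f} GB")
```

Real files are usually somewhat larger than the nominal estimate, since some tensors are typically stored at higher precision.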

Format: GGUF. Model size: 34.4B params. Architecture: llama. Quantization: 3-bit.


Model tree for patf82/Nous-Hermes-2-Yi-34B-IQ3-imatrix-GGUF

Base model: 01-ai/Yi-34B. This model is one of 8 quantized versions listed in the tree.
