This is a quantized GGUF of mistralai/Mistral-Nemo-Instruct-2407. Running inference requires a llama.cpp build newer than commit 50e0535 (7/22/2024).
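
As a rough illustration of running the model, the sketch below assembles a one-shot generation command for llama.cpp's `llama-cli` binary using its `-m` (model), `-p` (prompt), and `-n` (tokens to generate) options. The GGUF file name, the prompt, and the helper `build_llama_cli_command` are assumptions made for this example, not names taken from this repository, and the binary is assumed to be on your `PATH`.

```python
import shlex
import subprocess
from pathlib import Path


def build_llama_cli_command(model_path, prompt, n_predict=128, binary="llama-cli"):
    """Return the argument list for a one-shot llama.cpp generation.

    `binary` is assumed to be a llama.cpp build newer than commit 50e0535.
    """
    return [
        binary,
        "-m", str(model_path),   # path to the GGUF file
        "-p", prompt,            # prompt text
        "-n", str(n_predict),    # maximum number of tokens to generate
    ]


if __name__ == "__main__":
    # Hypothetical file name: substitute the GGUF you actually downloaded.
    model = Path("mistral-nemo-instruct-q5_k.gguf")
    cmd = build_llama_cli_command(model, "Write a haiku about quantization.")
    print(shlex.join(cmd))
    # Only invoke llama.cpp if the model file is present locally.
    if model.exists():
        subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes.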

Currently, we only have a Q5_K quantization, which comes in at 8.73 GB. If you're interested in other quantizations, just ping me @iamlemec on Twitter.
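
As a quick sanity check on that file size, the sketch below estimates the average number of bits stored per weight. It assumes that "GB" means 10^9 bytes and uses the 12.2B parameter count listed for this model. The result is only an average, since the file also holds metadata and the quantization may keep some tensors at a different precision.

```python
def bits_per_weight(file_size_bytes, n_params):
    """Average bits stored per parameter, ignoring metadata overhead."""
    return file_size_bytes * 8 / n_params


# Assumptions: 8.73 GB read as 8.73e9 bytes, 12.2B read as 12.2e9 parameters.
bpw = bits_per_weight(8.73e9, 12.2e9)
print(f"{bpw:.2f} bits per weight")
```

This works out to a bit under 6 bits per weight, in line with a 5-bit K-quant that carries some per-block scale overhead.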

GGUF model size: 12.2B params. Architecture: llama.