Add vision evals
README.md
CHANGED
@@ -334,6 +334,26 @@ Non-coding tasks were evaluated with [lm-evaluation-harness](https://github.com/
   --batch_size auto
 ```
 
+**MMMU**
+```
+lm_eval \
+  --model vllm \
+  --model_args pretrained="RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8",dtype=auto,gpu_memory_utilization=0.9,max_images=8,enable_chunked_prefill=True,tensor_parallel_size=2 \
+  --tasks mmmu \
+  --apply_chat_template \
+  --batch_size auto
+```
+
+**ChartQA**
+```
+lm_eval \
+  --model vllm \
+  --model_args pretrained="RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8",dtype=auto,gpu_memory_utilization=0.9,max_images=8,enable_chunked_prefill=True,tensor_parallel_size=2 \
+  --tasks chartqa \
+  --apply_chat_template \
+  --batch_size auto
+```
+
 **Coding**
 
 The commands below can be used for mbpp by simply replacing the dataset name.
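
Both vision benchmarks can also be scored in one pass by giving lm_eval a comma-separated task list. The sketch below simply merges the two commands above; the model arguments are taken unchanged from them and nothing else is assumed.

```
# Sketch: run MMMU and ChartQA together (same model_args as the single-task commands above)
lm_eval \
  --model vllm \
  --model_args pretrained="RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8",dtype=auto,gpu_memory_utilization=0.9,max_images=8,enable_chunked_prefill=True,tensor_parallel_size=2 \
  --tasks mmmu,chartqa \
  --apply_chat_template \
  --batch_size auto
```
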
@@ -366,7 +386,6 @@ evalplus.evaluate \
 
 ### Accuracy
 
-#### Open LLM Leaderboard evaluation scores
 <table>
 <tr>
 <th>Category
@@ -526,5 +545,27 @@ evalplus.evaluate \
 <td>100.7%
 </td>
 </tr>
+<tr>
+<td rowspan="2" ><strong>Vision</strong>
+</td>
+<td>MMMU (0-shot)
+</td>
+<td>52.11
+</td>
+<td>53.11
+</td>
+<td>101.9%
+</td>
+</tr>
+<tr>
+<td>ChartQA (0-shot)
+</td>
+<td>81.36
+</td>
+<td>82.36
+</td>
+<td>101.2%
+</td>
+</tr>
 </table>
 
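In the accuracy table, the Recovery column is the second score divided by the first, i.e. the quantized model's score relative to the unquantized baseline (which column is which is the assumption here; the arithmetic itself matches the rows added above). A quick check of the two vision rows:

```
# Recovery = quantized score / baseline score (column order assumed: baseline, then quantized)
echo "scale=1; 100 * 53.11 / 52.11" | bc   # MMMU    -> 101.9 (%)
echo "scale=1; 100 * 82.36 / 81.36" | bc   # ChartQA -> 101.2 (%)
```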