OliveiraJLT, leaderboard-pr-bot committed
Commit 637c43e · verified · 1 Parent(s): 3a28835

Adding Evaluation Results (#2)

- Adding Evaluation Results (21c294477b1e5f5c528cd3b6f3149885fae673a3)


Co-authored-by: Open LLM Leaderboard PR Bot <[email protected]>

Files changed (1)
  1. README.md +106 -0
README.md CHANGED
@@ -156,6 +156,98 @@ model-index:
      source:
        url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
        name: Open Portuguese LLM Leaderboard
  ---
 
  # Sagui-7B-Instruct-v0.1
@@ -271,3 +363,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-
  |PT Hate Speech Binary | 30.38|
  |tweetSentBR | 18.34|
 
      source:
        url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
        name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 28.92
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 5.04
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 0.38
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 0.0
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 10.61
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 5.39
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=OliveiraJLT/Sagui-7B-Instruct-v0.1
+       name: Open LLM Leaderboard
  ---
 
  # Sagui-7B-Instruct-v0.1
 
  |PT Hate Speech Binary | 30.38|
  |tweetSentBR | 18.34|
 
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_OliveiraJLT__Sagui-7B-Instruct-v0.1)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               | 8.39|
+ |IFEval (0-Shot)    |28.92|
+ |BBH (3-Shot)       | 5.04|
+ |MATH Lvl 5 (4-Shot)| 0.38|
+ |GPQA (0-shot)      | 0.00|
+ |MuSR (0-shot)      |10.61|
+ |MMLU-PRO (5-shot)  | 5.39|
+
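
Reviewer note: the `Avg.` row in the added table can be sanity-checked against the six benchmark rows. The sketch below assumes the average is a plain unweighted mean rounded to two decimals; the diff does not state how the leaderboard derives it, so treat that as an assumption. The function name `leaderboard_average` is hypothetical and used only for illustration.

```python
# Scores copied from the results table added in this commit.
scores = {
    "IFEval (0-Shot)": 28.92,
    "BBH (3-Shot)": 5.04,
    "MATH Lvl 5 (4-Shot)": 0.38,
    "GPQA (0-shot)": 0.00,
    "MuSR (0-shot)": 10.61,
    "MMLU-PRO (5-shot)": 5.39,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean rounded to two decimals (assumed formula)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")
```

Under that assumption the six values sum to 50.34, and dividing by 6 reproduces the 8.39 reported in the `Avg.` row.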