---
datasets:
- natolambert/skywork-preferences-80k-v0.1-cleaned
- allenai/preference-test-sets
---

# MetaMetrics-RM-v1.0 (ICLR 2025)

+ **Authors**: [Genta Indra Winata](https://gentawinata.com/), [David Anugraha](https://davidanugraha.github.io/), [Lucky Susanto](https://luckysusanto.github.io/), [Garry Kuwanto](https://gkuwanto.github.io/), [Derry Tanti Wijaya](https://derrywijaya.github.io/)
+ **arXiv Paper**: https://arxiv.org/abs/2410.02381
+ **ICLR Paper**: https://openreview.net/forum?id=slO3xTt4CG
+ **Model**: [meta-metrics/MetaMetrics-RM-v1.0](https://huggingface.co/meta-metrics/MetaMetrics-RM-v1.0)
+ **Datasets**:
  - [natolambert/skywork-preferences-80k-v0.1-cleaned](https://huggingface.co/datasets/natolambert/skywork-preferences-80k-v0.1-cleaned)
  - [allenai/preference-test-sets](https://huggingface.co/datasets/allenai/preference-test-sets)
+ **Code Repository**: https://github.com/meta-metrics/metametrics

## RewardBench Leaderboard

| Model | Score | Chat | Chat Hard | Safety | Reasoning |
|:------|:------|:-----|:----------|:-------|:----------|
| nvidia/Llama-3.1-Nemotron-70B-Reward | **94.1** | 97.5 | 85.7 | **95.1** | 98.1 |
| meta-metrics/MetaMetrics-RM-v1.0 | 93.5 | **98.9** | 86.2 | 90.7 | **98.2** |
| SF-Foundation/TextEval-Llama3.1-70B | 93.5 | 94.1 | **90.1** | 93.2 | 96.4 |
| RLHFlow/ArmoRM-Llama3-8B-v0.1 | 90.4 | 96.9 | 76.8 | 90.5 | 97.3 |
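
The Score column in the table above appears to match the unweighted mean of the four category columns, up to one-decimal rounding. The snippet below is a minimal sketch of that check, assuming this aggregation; it is inferred from the numbers in the table rather than stated in this card, and the `rows` dictionary and `category_mean` helper are illustrative names only.

```python
# Values copied from the leaderboard table:
# model -> (reported Score, (Chat, Chat Hard, Safety, Reasoning))
rows = {
    "nvidia/Llama-3.1-Nemotron-70B-Reward": (94.1, (97.5, 85.7, 95.1, 98.1)),
    "meta-metrics/MetaMetrics-RM-v1.0": (93.5, (98.9, 86.2, 90.7, 98.2)),
    "SF-Foundation/TextEval-Llama3.1-70B": (93.5, (94.1, 90.1, 93.2, 96.4)),
    "RLHFlow/ArmoRM-Llama3-8B-v0.1": (90.4, (96.9, 76.8, 90.5, 97.3)),
}


def category_mean(scores):
    """Unweighted mean of the per-category scores (assumed aggregation)."""
    return sum(scores) / len(scores)


# Recompute the mean for each model and compare it with the reported Score.
means = {name: category_mean(cats) for name, (_, cats) in rows.items()}

for name, (reported, _) in rows.items():
    diff = abs(means[name] - reported)
    print(f"{name}: reported={reported}, mean={means[name]:.3f}, diff={diff:.3f}")
```

Under this assumption, each recomputed mean differs from the reported Score by no more than the rounding error of a one-decimal value.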

## Citation

If you find this work useful for your research, please consider citing:
```
@inproceedings{
  winata2025metametrics,
  title={MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences},
  author={Genta Indra Winata and David Anugraha and Lucky Susanto and Garry Kuwanto and Derry Tanti Wijaya},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=slO3xTt4CG}
}
```