DogeRM - a miulab Collection

Updated Oct 8, 2024

Models trained or used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging" (https://arxiv.org/abs/2407.01470)

miulab/llama2-7b-oss-instruct
Text Generation • Updated Oct 3, 2024 • 2

miulab/llama2-7b-alpaca-sft-10k
Text Generation • Updated Oct 3, 2024 • 4

miulab/llama2-7b-magicoder-evol-instruct
Text Generation • Updated Oct 3, 2024 • 2

miulab/llama2-7b-ultrafeedback-rm
Text Classification • Updated Oct 3, 2024 • 1

TIGER-Lab/MAmmoTH-7B
Text Generation • Updated Dec 5, 2023 • 148 • 8

meta-math/MetaMath-7B-V1.0
Text Generation • Updated Dec 21, 2023 • 359 • 27

Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback
Text Classification • Updated Feb 5 • 1.24k • 11

TIGER-Lab/MAmmoTH2-7B-Plus
Text Generation • Updated Nov 26, 2024 • 5.43k • 7