rrjin's Collections
LLMs Align


updated Aug 19, 2024

  • Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

    Paper • 2312.09244 • Published Dec 14, 2023