HuggingFaceH4's Collections

Zephyr ORPO

updated Apr 12, 2024

Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook


  • ORPO: Monolithic Preference Optimization without Reference Model

    Paper • arXiv:2403.07691 • Published Mar 12, 2024 • 65 upvotes

  • HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1

    Text Generation • Updated Apr 18, 2024 • 95 downloads • 267 likes

  • argilla/distilabel-capybara-dpo-7k-binarized

    Viewer • Updated Jul 16, 2024 • 7.56k rows • 1.64k downloads • 180 likes