Trangle's Collections

RLHF

updated Jun 7, 2024

  • Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

    Paper • 2401.08417 • Published Jan 16, 2024 • 37

  • PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs

    Paper • 2406.02886 • Published Jun 5, 2024 • 11