Models: 4,935

Active filters: dpo, trl

AmberYifan/Qwen2.5-7B-sft-ultrachat-SPIN-Qwen2.5-72B-Instruct

Text Generation • Updated Mar 20 • 6 • 1

mradermacher/Qwen2.5-7B-sft-ultrachat-all-pool-GGUF

Updated Mar 21 • 46 • 1

mradermacher/Qwen2.5-7B-sft-ultrachat-SPIN-Qwen2.5-72B-Instruct-GGUF

Updated Mar 21 • 44 • 1

mradermacher/Qwen2-7B-sft-ultrachat-all-pool-GGUF

Updated Mar 21 • 44 • 1

AmberYifan/Qwen2.5-7B-sft-ultrachat-SPIN-gpt4o

Text Generation • Updated Mar 21 • 9 • 1

mradermacher/Qwen2.5-7B-sft-ultrachat-SPIN-gpt4o-GGUF

Updated Mar 22 • 564 • 1

mradermacher/Llama-3.1-8B-sft-ultrachat-peers-pool-GGUF

Updated Mar 22 • 30 • 1

mradermacher/Llama-3.1-8B-Magpie-Align-v0.2-GGUF

Updated Mar 30 • 247 • 1

mradermacher/Llama-3.1-8B-Magpie-Align-v0.2-i1-GGUF

Updated Mar 30 • 503 • 1

LuyiCui/DeepSeek-R1-Distill-Qwen-1.5B-DPO

Text Generation • Updated 7 days ago • 4 • 1

codelion/Qwen3-0.6B-PTS-DPO

Text Generation • Updated 1 day ago • 18 • 1

lewtun/zephyr-7b-dpo-full

Text Generation • Updated Jan 5, 2024 • 5

alignment-handbook/zephyr-7b-dpo-full

Text Generation • Updated Jan 10, 2024 • 103 • 3

alignment-handbook/zephyr-7b-dpo-qlora

Updated Jan 9, 2024 • 13 • 9

amirabdullah19852020/gpt-neo-125m_hh_reward

Text Generation • Updated Apr 27, 2024 • 3

lewtun/zephyr-7b-dpo-qlora

Updated Jan 9, 2024 • 1

sambar/zephyr-7b-ipo-lora

Text Generation • Updated Jan 5, 2024 • 4

nlee282/moai-dpo-1.0

Updated Jan 5, 2024

nikkoyabut/merged_model_dpo

Updated Jan 5, 2024

sambar/zephyr-7b-ipo-lora-5ep

Text Generation • Updated Jan 6, 2024 • 4

alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2-dpo

Text Generation • Updated Jan 7, 2024 • 9 • 2

AlbelTec/mistral-dpo-old

Updated Jan 7, 2024 • 2

Yaxin1992/mixtral-dpo-1000

Updated Jan 9, 2024 • 2

adhi29/openhermes-mistral-dpo-gptq

Updated Jan 10, 2024

ybelkada/test-tags-model

Text Generation • Updated Jan 9, 2024 • 5

ybelkada/test-tags-model-2

Text Generation • Updated Jan 9, 2024 • 10

justinj92/dpoplatypus-phi2

Text Generation • Updated Jan 10, 2024

Belred/mistral-dpo

Updated Jan 9, 2024

lewtun/zephyr-7b-dpo-qlora-8e0975a

Updated Jan 10, 2024

mecoaoge2/results

Updated Jan 10, 2024