Active filters: dpo, trl
AmberYifan/Qwen2.5-7B-sft-ultrachat-SPIN-Qwen2.5-72B-Instruct
Text Generation • Updated • 6 • 1
mradermacher/Qwen2.5-7B-sft-ultrachat-all-pool-GGUF
mradermacher/Qwen2.5-7B-sft-ultrachat-SPIN-Qwen2.5-72B-Instruct-GGUF
mradermacher/Qwen2-7B-sft-ultrachat-all-pool-GGUF
AmberYifan/Qwen2.5-7B-sft-ultrachat-SPIN-gpt4o
Text Generation • Updated • 9 • 1
mradermacher/Qwen2.5-7B-sft-ultrachat-SPIN-gpt4o-GGUF
mradermacher/Llama-3.1-8B-sft-ultrachat-peers-pool-GGUF
mradermacher/Llama-3.1-8B-Magpie-Align-v0.2-GGUF
mradermacher/Llama-3.1-8B-Magpie-Align-v0.2-i1-GGUF
LuyiCui/DeepSeek-R1-Distill-Qwen-1.5B-DPO
Text Generation • Updated • 4 • 1
codelion/Qwen3-0.6B-PTS-DPO
Text Generation • Updated • 18 • 1
lewtun/zephyr-7b-dpo-full
Text Generation • Updated • 5
alignment-handbook/zephyr-7b-dpo-full
Text Generation • Updated • 103 • 3
alignment-handbook/zephyr-7b-dpo-qlora
Updated • 13 • 9
amirabdullah19852020/gpt-neo-125m_hh_reward
Text Generation • Updated • 3
lewtun/zephyr-7b-dpo-qlora
sambar/zephyr-7b-ipo-lora
Text Generation • Updated • 4
nlee282/moai-dpo-1.0
Updated
nikkoyabut/merged_model_dpo
Updated
sambar/zephyr-7b-ipo-lora-5ep
Text Generation • Updated • 4
alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2-dpo
Text Generation • Updated • 9 • 2
AlbelTec/mistral-dpo-old
Yaxin1992/mixtral-dpo-1000
adhi29/openhermes-mistral-dpo-gptq
Updated
ybelkada/test-tags-model
Text Generation • Updated • 5
ybelkada/test-tags-model-2
Text Generation • Updated • 10
justinj92/dpoplatypus-phi2
Text Generation • Updated
Belred/mistral-dpo
Updated
lewtun/zephyr-7b-dpo-qlora-8e0975a
Updated
mecoaoge2/results
Updated