Search-R1 - a PeterJinGo Collection

Search-R1

updated Apr 7

Preliminary checkpoints with outcome-only RL.

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 30
PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-em-ppo

Updated Mar 12 • 212
PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-em-grpo

Updated Mar 12 • 2
PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-it-em-ppo

Updated Mar 12 • 14
PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-it-em-grpo

Updated Mar 12 • 10
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo

Updated Mar 12 • 136
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo

Updated Mar 12 • 26
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-ppo

Updated Mar 12 • 23
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-grpo

Updated Mar 12 • 35
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo

Updated Mar 21 • 4.08k
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-ppo

Updated Mar 12 • 375
PeterJinGo/wiki-18-corpus

Updated Feb 26 • 1.76k
PeterJinGo/wiki-18-e5-index

Updated Feb 26 • 2.4k
PeterJinGo/nq_hotpotqa_train

Viewer • Updated Mar 13 • 221k • 422 • 2
