Hugging Face
Nan Jiang
nanjiang
1 follower · 0 following
AI & ML interests
None yet
Recent Activity
authored a paper 1 day ago
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
authored a paper 2 months ago
Self-rewarding correction for mathematical reasoning
authored a paper 12 months ago
RLHF Workflow: From Reward Modeling to Online RLHF
Organizations
None yet
Papers
6
arxiv:2505.02391
arxiv:2502.19613
arxiv:2405.07863
arxiv:2401.09681
models: 0 (none public yet)
datasets: 0 (none public yet)