QUANG HUY CHU

cqhofsns

AI & ML interests

Deep Reinforcement Learning --- Natural Language Processing

Organizations

None yet

cqhofsns's activity

liked a model 2 days ago

facebook/PE-Core-G14-448

Zero-Shot Image Classification • Updated 8 days ago • 14.1k • 13

upvoted an article 19 days ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Dec 9, 2022 • 247

liked a model 27 days ago

google/gemma-3-27b-it

Image-Text-to-Text • Updated Mar 21 • 401k • 1.33k

liked a model 29 days ago

OpenGVLab/InternVL2_5-78B-MPO

Image-Text-to-Text • Updated Mar 25 • 27.8k • 54

New activity in meta-llama/Llama-4-Scout-17B-16E-Instruct 29 days ago

[Issue report] missing keys in the json files

#45 opened about 1 month ago by ShervinGhasemlou

liked a dataset 29 days ago

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 548k • 718

liked a model about 1 month ago

meta-llama/Llama-4-Maverick-17B-128E-Instruct

Image-Text-to-Text • Updated 30 days ago • 85.2k • 321

liked a model about 2 months ago

Qwen/Qwen2.5-32B-Instruct

Text Generation • Updated Sep 25, 2024 • 473k • 271

commented on DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge 3 months ago

Thank you for this post. Very clear explanation and nice example ;)

upvoted 2 articles 3 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7 • 129

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28 • 852

liked a model 3 months ago

mistralai/Mistral-7B-Instruct-v0.3

Text Generation • Updated Aug 21, 2024 • 783k • 1.68k

upvoted a collection 3 months ago

DeepSeek-R1

Collection

8 items • Updated Jan 21 • 627

liked 2 models 3 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 467k • 841

nlpaueb/legal-bert-small-uncased

Fill-Mask • Updated Apr 28, 2022 • 6.94k • 21

liked 2 models 4 months ago

meta-llama/Meta-Llama-3-70B-Instruct

Text Generation • Updated Dec 15, 2024 • 213k • 1.47k

google/gemma-2-9b-it

Text Generation • Updated Aug 27, 2024 • 355k • 708

upvoted 3 collections 4 months ago