Somshubra Majumdar

smajumdar94

Recent Activity

published a model about 22 hours ago

nvidia/OpenCodeReasoning-Nemotron-32B-IOI

published a model about 22 hours ago

nvidia/OpenCodeReasoning-Nemotron-32B

published a model about 22 hours ago

nvidia/OpenCodeReasoning-Nemotron-14B

smajumdar94's activity

upvoted 2 collections 10 days ago

OpenMathReasoning

Collection

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 3 days ago • 35

OpenCodeReasoning

Collection

Reasoning data for supervised finetuning of LLMs to advance data distillation for competitive coding • 7 items • Updated 1 day ago • 15

upvoted an article 13 days ago

Tiny Agents: a MCP-powered agent in 50 lines of code

Article • 14 days ago • 221

upvoted a paper 21 days ago

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published 24 days ago • 84

upvoted an article 22 days ago

Introducing smolagents: simple agents that write actions in code.

Article • Dec 31, 2024 • 1.01k

upvoted 6 papers about 1 month ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3 • 54

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62

upvoted a paper about 2 months ago

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20 • 48

upvoted an article about 2 months ago

NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets

Article • Mar 18 • 35

upvoted a paper about 2 months ago

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Paper • 2503.10460 • Published Mar 13 • 28

upvoted 3 papers 3 months ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 37

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 87

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13 • 36

upvoted an article 3 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.24k

upvoted a paper 3 months ago

Kolmogorov-Arnold Transformer

Paper • 2409.10594 • Published Sep 16, 2024 • 46