siyeng feng

siyengfeng

AI & ML interests

None yet

Recent Activity

liked a model about 18 hours ago

SWE-bench/SWE-agent-LM-32B

liked a model about 22 hours ago

XiaomiMiMo/MiMo-7B-RL

liked a model about 22 hours ago

microsoft/Phi-4-reasoning-plus

Organizations

None yet

siyengfeng's activity

upvoted 5 papers 3 days ago

Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs

Paper • 2504.20406 • Published 9 days ago • 6

AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

Paper • 2504.21659 • Published 8 days ago • 9

LLMs for Engineering: Teaching Models to Design High Powered Rockets

Paper • 2504.19394 • Published 11 days ago • 12

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Paper • 2505.00234 • Published 7 days ago • 21

DeepCritic: Deliberate Critique with Large Language Models

Paper • 2505.00662 • Published 7 days ago • 48

upvoted a paper 6 days ago

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published 8 days ago • 41

upvoted a collection 10 days ago

OpenMathReasoning

Collection

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 3 days ago • 35

upvoted 13 papers 11 days ago

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published 17 days ago • 73

BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

Paper • 2504.14538 • Published 18 days ago • 27

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published 17 days ago • 42

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published 16 days ago • 102

Causal-Copilot: An Autonomous Causal Analysis Agent

Paper • 2504.13263 • Published 21 days ago • 6

CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation

Paper • 2504.15254 • Published 17 days ago • 6

Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading

Paper • 2504.11919 • Published 22 days ago • 12

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Paper • 2504.15585 • Published 16 days ago • 13

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

Paper • 2504.16891 • Published 15 days ago • 18

Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Paper • 2504.15843 • Published 16 days ago • 18