Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant Paper • 2504.18373 • Published 13 days ago • 2
SWE-smith: Scaling Data for Software Engineering Agents Paper • 2504.21798 • Published 8 days ago • 6
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems Paper • 2505.00212 • Published 8 days ago • 2
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model Paper • 2505.03739 • Published 2 days ago • 6
Multi-Agent System for Comprehensive Soccer Understanding Paper • 2505.03735 • Published 2 days ago • 14
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference Paper • 2505.02922 • Published 3 days ago • 19
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published 3 days ago • 23
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 2 days ago • 71
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published 2 days ago • 78
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning Paper • 2505.01441 • Published 10 days ago • 30
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL Paper • 2505.02391 • Published 3 days ago • 21
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction Paper • 2505.02471 • Published 3 days ago • 11
Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities Paper • 2505.01043 • Published 6 days ago • 9
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents Paper • 2505.02156 • Published 4 days ago • 17
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning Paper • 2505.02835 • Published 3 days ago • 20
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published 8 days ago • 9