Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey • Paper • 2505.03418 • Published 2 days ago • 4
OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation • Paper • 2505.03912 • Published 2 days ago • 6
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving • Paper • 2505.04528 • Published 1 day ago • 7
PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer • Paper • 2505.04622 • Published 1 day ago • 13
Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models • Paper • 2505.03821 • Published 6 days ago • 18
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation • Paper • 2505.04512 • Published 1 day ago • 13
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities • Paper • 2505.02567 • Published 3 days ago • 50
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems • Paper • 2505.00212 • Published 8 days ago • 2
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation • Paper • 2505.02836 • Published 3 days ago • 6
SWE-smith: Scaling Data for Software Engineering Agents • Paper • 2504.21798 • Published 8 days ago • 6
ZeroSearch: Incentivize the Search Capability of LLMs without Searching • Paper • 2505.04588 • Published 1 day ago • 28
Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data • Paper • 2505.02130 • Published 4 days ago • 3
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing • Paper • 2505.02823 • Published 3 days ago • 4
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation • Paper • 2504.21650 • Published 8 days ago • 11
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model • Paper • 2505.03739 • Published 2 days ago • 6
InfoVids: Reimagining the Viewer Experience with Alternative Visualization-Presenter Relationships • Paper • 2505.03164 • Published 3 days ago • 5
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering • Paper • 2505.02311 • Published 4 days ago • 2
Multi-Agent System for Comprehensive Soccer Understanding • Paper • 2505.03735 • Published 2 days ago • 15
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference • Paper • 2505.02922 • Published 3 days ago • 21