OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents • Paper • 2505.03570 • Published 2 days ago • 4
OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution • Paper • 2505.04606 • Published 1 day ago • 5
OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation • Paper • 2505.03912 • Published 2 days ago • 6
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation • Paper • 2505.04512 • Published 1 day ago • 13
PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer • Paper • 2505.04622 • Published 1 day ago • 13
ZeroSearch: Incentivize the Search Capability of LLMs without Searching • Paper • 2505.04588 • Published 1 day ago • 28