Improving Editability in Image Generation with Layer-wise Memory Paper • 2505.01079 • Published 6 days ago • 25
PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published 9 days ago • 38
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing Paper • 2505.02370 • Published 3 days ago • 11
SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations Paper • 2505.02094 • Published 4 days ago • 16
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis Paper • 2505.02625 • Published 3 days ago • 16
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents Paper • 2505.02156 • Published 4 days ago • 17
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning Paper • 2505.02835 • Published 3 days ago • 20
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning Paper • 2505.01441 • Published 10 days ago • 30
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published 3 days ago • 69
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation Paper • 2504.21650 • Published 8 days ago • 10
Geospatial Mechanistic Interpretability of Large Language Models Paper • 2505.03368 • Published 2 days ago • 8
Multi-Agent System for Comprehensive Soccer Understanding Paper • 2505.03735 • Published 2 days ago • 15
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 2 days ago • 73
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published 2 days ago • 78
MediAug: Exploring Visual Augmentation in Medical Imaging Paper • 2504.18983 • Published 12 days ago • 6
UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation Paper • 2504.21336 • Published 8 days ago • 4
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions Paper • 2504.19056 • Published 12 days ago • 14
WebThinker: Empowering Large Reasoning Models with Deep Research Capability Paper • 2504.21776 • Published 8 days ago • 41
YoChameleon: Personalized Vision and Language Generation Paper • 2504.20998 • Published 9 days ago • 11