VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published 24 days ago • 11
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 29 days ago • 73
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 30 days ago • 159
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages Paper • 2503.23542 • Published Mar 30 • 10
Large Language Model Agent: A Survey on Methodology, Applications and Challenges Paper • 2503.21460 • Published Mar 27 • 77
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content Paper • 2503.16031 • Published Mar 20 • 3
Running on Zero • 931 • InfiniteYou-FLUX: Flexible Photo Recrafting While Preserving Your Identity
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Paper • 2503.12963 • Published Mar 17 • 7
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published Mar 18 • 47
API Agents vs. GUI Agents: Divergence and Convergence Paper • 2503.11069 • Published Mar 14 • 37