V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published 30 days ago • 13 • 2
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published Mar 26 • 4 • 3
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published Mar 26 • 14 • 3
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published Feb 11 • 45 • 2