PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published 9 days ago • 38
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer Paper • 2504.20690 • Published 9 days ago • 17
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published 17 days ago • 65
Personalized Text-to-Image Generation with Auto-Regressive Models Paper • 2504.13162 • Published 21 days ago • 19
Describe Anything: Detailed Localized Image and Video Captioning Paper • 2504.16072 • Published 16 days ago • 60
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning Paper • 2504.14509 • Published 18 days ago • 50
Step1X-Edit: A Practical Framework for General Image Editing Paper • 2504.17761 • Published 14 days ago • 86
Subject-driven Video Generation via Disentangled Identity and Motion Paper • 2504.17816 • Published 15 days ago • 11
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework Paper • 2504.12395 • Published 22 days ago • 17
Cobra: Efficient Line Art COlorization with BRoAder References Paper • 2504.12240 • Published 22 days ago • 27
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation Paper • 2504.07405 • Published 28 days ago • 12
Compass Control: Multi Object Orientation Control for Text-to-Image Generation Paper • 2504.06752 • Published 29 days ago • 10
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published 28 days ago • 46
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published 28 days ago • 31