Submitted by akhaliq 65 Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context · 671 authors 5
Submitted by akhaliq 46 ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment · 6 authors 2
Submitted by akhaliq 26 Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks · 14 authors 1
Submitted by akhaliq 25 CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion · 9 authors 3
Submitted by akhaliq 23 CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model · 9 authors 2
Submitted by akhaliq 22 VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models · 8 authors 1