ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14, 2025 • 140
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens Paper • 2501.07730 • Published Jan 13, 2025 • 17
CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models Paper • 2412.10117 • Published Dec 13, 2024 • 3
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing Paper • 2402.13185 • Published Feb 20, 2024
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published Dec 10, 2024 • 56
UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators Paper • 2401.12596 • Published Jan 23, 2024
Searching Priors Makes Text-to-Video Synthesis Better Paper • 2406.03215 • Published Jun 5, 2024 • 14
LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models Paper • 2403.11627 • Published Mar 18, 2024
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation Paper • 2411.17048 • Published Nov 26, 2024
B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests Paper • 2409.08692 • Published Sep 13, 2024 • 28
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters Paper • 2408.17253 • Published Aug 30, 2024 • 40
Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields Paper • 2311.11845 • Published Nov 20, 2023 • 1
PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting Paper • 2405.19957 • Published May 30, 2024 • 10
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT Paper • 2307.08674 • Published Jul 17, 2023 • 48
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts Paper • 2307.07218 • Published Jul 14, 2023 • 27
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias Paper • 2306.03509 • Published Jun 6, 2023 • 5