- coqui/XTTS-v2
  Text-to-Speech • Updated • 1.61M • 2.65k
- deepseek-ai/DeepSeek-V3-0324
  Text Generation • Updated • 366k • 2.86k
- openai/whisper-large-v3
  Automatic Speech Recognition • Updated • 7.12M • 4.36k
- Distilling an End-to-End Voice Assistant Without Instruction Training Data
  Paper • 2410.02678 • Published • 23
Collections
Collections including paper arxiv:2302.05543
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 52
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 30
- High-Resolution Image Synthesis with Latent Diffusion Models
  Paper • 2112.10752 • Published • 13

- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 52
- Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
  Paper • 2408.00760 • Published • 7
- MagicQuill: An Intelligent Interactive Image Editing System
  Paper • 2411.09703 • Published • 77
- BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
  Paper • 2403.06976 • Published • 2