Convert vocals to match reference audio
All paper summaries read by Merve
Generate descriptions of uploaded images or videos
Generate descriptions for images using text prompts