Amphion Voice Conversion (VC) Recipe

Quick Start

We provide a beginner recipe that demonstrates how to train a cutting-edge VC model. It is the official implementation of the paper "NORO: A Noise-Robust One-Shot Voice Conversion System with Hidden Speaker Representation Capabilities".

Supported Model Architectures

Amphion currently supports a noise-robust VC model with the following architecture:



It has the following features:

  1. Noise-Robust Voice Conversion: Uses a dual-branch reference encoding module and a noise-agnostic contrastive speaker loss to maintain high-quality voice conversion in noisy environments.
  2. One-Shot Voice Conversion: Converts timbre using only a single reference speech sample.
  3. Speaker Representation Learning: Explores the potential of the reference encoder as a self-supervised speaker encoder.
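
To give a feel for what a contrastive speaker loss does, here is a minimal sketch in plain Python. It is not the formulation used in the NORO paper or in the Amphion code. It assumes a generic InfoNCE-style objective in which the embedding of a clean reference and the embedding of a noisy version of the same utterance form a positive pair, and the other utterances in the batch act as negatives. The names `cosine`, `contrastive_speaker_loss`, and `temperature` are hypothetical and chosen for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_speaker_loss(clean_embs, noisy_embs, temperature=0.1):
    """InfoNCE-style loss (illustrative assumption, not the paper's exact loss).

    clean_embs[i] and noisy_embs[i] are assumed to come from the same speaker;
    every noisy_embs[j] with j != i is treated as a negative for clean_embs[i].
    """
    if len(clean_embs) != len(noisy_embs):
        raise ValueError("clean and noisy batches must have the same size")
    total = 0.0
    for i, anchor in enumerate(clean_embs):
        logits = [cosine(anchor, other) / temperature for other in noisy_embs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        # Negative log-probability of picking the matching noisy embedding.
        total += log_denom - logits[i]
    return total / len(clean_embs)


# Toy 2-D "speaker embeddings": two speakers, clean and noisy views.
clean = [[1.0, 0.0], [0.0, 1.0]]
aligned_noisy = [[0.9, 0.1], [0.1, 0.9]]  # noisy views stay close to their speaker
swapped_noisy = [[0.1, 0.9], [0.9, 0.1]]  # noisy views resemble the wrong speaker

loss_aligned = contrastive_speaker_loss(clean, aligned_noisy)
loss_swapped = contrastive_speaker_loss(clean, swapped_noisy)
```

In this toy setup the loss is small when each noisy embedding stays close to its clean counterpart and large when it drifts toward another speaker, which is the behavior a noise-agnostic speaker objective is meant to encourage. A real training setup would compute this on batched tensors from the reference encoder rather than on Python lists.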