slavashe committed on
Commit
f11347d
·
1 Parent(s): dc73522

update README

Files changed (1)
  1. README.md +61 -0
README.md CHANGED
@@ -1,3 +1,64 @@
  ---
  license: cdla-permissive-2.0
  ---
+
+ ## Model Summary
+ [DAC auto-encoder models](https://github.com/descriptinc/descript-audio-codec) provide a compact discrete tokenization of speech and audio signals. The tokens facilitate signal generation by cascaded generative AI models (e.g., multi-modal generative AI models) and allow high-quality reconstruction of the original signals. [The current models](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) improve upon the [original DAC models](https://github.com/descriptinc/descript-audio-codec) by offering a more compact representation of speech-only signals while retaining high-quality signal reconstruction.
+
+ ## Usage
+ Follow the [DAC](https://github.com/descriptinc/descript-audio-codec) installation instructions, then download the model weights from this repo (e.g., *weights_24khz_1.5kbps_v1.0*).
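
As a rough illustration of what the weights name suggests, the sketch below compares a 1.5 kbps code stream with uncompressed PCM at the same sampling rate. It assumes that `24khz` in the name is the sampling rate, that `1.5kbps` is the bitrate of the codes, and that the reference is 16-bit mono PCM. The helpers `pcm_bitrate_bps` and `compression_ratio` are hypothetical and are not part of the `dac` package.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> int:
    """Bitrate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(
    sample_rate_hz: int,
    codec_bitrate_bps: float,
    bits_per_sample: int = 16,
    channels: int = 1,
) -> float:
    """How many times smaller the code stream is than the PCM reference."""
    return pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels) / codec_bitrate_bps


if __name__ == "__main__":
    # Assumed reading of "weights_24khz_1.5kbps": 24 kHz audio, 1.5 kbps codes.
    pcm = pcm_bitrate_bps(24_000)
    ratio = compression_ratio(24_000, 1_500)
    print(f"PCM reference: {pcm} bps, compression ratio: {ratio:.0f}x")
```

Under these assumptions the PCM reference is 384 kbps, so the codes are about 256 times smaller.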
+ ### Compress audio
+ ```
+ python3 -m dac encode /path/to/input --output /path/to/output/codes --weights_path /path/to/weights_24khz_1.5kbps_v1.0
+ ```
+
+ This command creates `.dac` files with the same names as the input files. It also preserves the directory structure relative to the input root and re-creates it in the output directory. Run `python3 -m dac encode --help` for more options.
+
+ ### Reconstruct audio from compressed codes
+ ```
+ python3 -m dac decode /path/to/output/codes --output /path/to/reconstructed_input --weights_path /path/to/weights_24khz_1.5kbps_v1.0
+ ```
+
+ This command creates `.wav` files with the same names as the input files. It also preserves the directory structure relative to the input root and re-creates it in the output directory. Run `python3 -m dac decode --help` for more options.
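
The directory mirroring described for the encode and decode commands can be sketched with `pathlib`. This only illustrates the documented behaviour and is not how the `dac` CLI is implemented; `mirrored_output_path` is a hypothetical helper, and the file names are made up for the example.

```python
from pathlib import Path


def mirrored_output_path(input_file, input_root, output_root, suffix):
    """Place input_file at the same relative location under output_root,
    replacing its extension with suffix (e.g. ".dac" or ".wav")."""
    # Path of the file relative to the input root, e.g. "spk1/utt1.wav".
    relative = Path(input_file).relative_to(input_root)
    # Same relative path under the output root, with the new extension.
    return Path(output_root) / relative.with_suffix(suffix)


if __name__ == "__main__":
    encoded = mirrored_output_path(
        "/path/to/input/spk1/utt1.wav", "/path/to/input", "/path/to/output/codes", ".dac"
    )
    print(encoded.as_posix())
```

Calling the same helper with `suffix=".wav"` gives the corresponding mapping for the decode direction.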
24
+
25
+ ### Programmatic Usage
26
+ ```py
27
+ import dac
28
+ from audiotools import AudioSignal
29
+
30
+ # Download a model
31
+ model_path = /path/to/weights_24khz_1.5kbps_v1.0
32
+ model = dac.DAC.load(model_path)
33
+
34
+ model.to('cuda')
35
+
36
+ # Load audio signal file
37
+ signal = AudioSignal('input.wav')
38
+
39
+ # Encode audio signal as one long file
40
+ # (may run out of GPU memory on long files)
41
+ signal.to(model.device)
42
+
43
+ x = model.preprocess(signal.audio_data, signal.sample_rate)
44
+ z, codes, latents, _, _ = model.encode(x)
45
+
46
+ # Decode audio signal
47
+ y = model.decode(z)
48
+
49
+ # Alternatively, use the `compress` and `decompress` functions
50
+ # to compress long files.
51
+
52
+ signal = signal.cpu()
53
+ x = model.compress(signal)
54
+
55
+ # Save and load to and from disk
56
+ x.save("compressed.dac")
57
+ x = dac.DACFile.load("compressed.dac")
58
+
59
+ # Decompress it back to an AudioSignal
60
+ y = model.decompress(x)
61
+
62
+ # Write to file
63
+ y.write('output.wav')
64
+ ```