samarth-ht committed
Commit 9b210a9 · 1 Parent(s): a1f5dc0
This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. LICENSE +0 -201
  2. app.py +1 -2
  3. data_processing_pipeline.sh +0 -9
  4. eval/eval_sync_conf.py +1 -1
  5. eval/eval_syncnet_acc.py +2 -2
  6. inference.sh +0 -9
  7. latentsync/utils/mask.png +0 -0
  8. preprocess/affine_transform.py +2 -2
  9. preprocess/filter_high_resolution.py +1 -1
  10. preprocess/remove_broken_videos.py +2 -2
  11. preprocess/remove_incorrect_affined.py +1 -1
  12. scripts/inference.py +3 -3
  13. scripts/train_syncnet.py +5 -5
  14. scripts/train_unet.py +8 -8
  15. setup_env.sh +0 -23
  16. {latentsync → soundimage}/data/syncnet_dataset.py +0 -0
  17. {latentsync → soundimage}/data/unet_dataset.py +0 -0
  18. {latentsync → soundimage}/models/attention.py +0 -0
  19. {latentsync → soundimage}/models/motion_module.py +0 -0
  20. {latentsync → soundimage}/models/resnet.py +0 -0
  21. {latentsync → soundimage}/models/syncnet.py +0 -0
  22. {latentsync → soundimage}/models/syncnet_wav2lip.py +0 -0
  23. {latentsync → soundimage}/models/unet.py +0 -0
  24. {latentsync → soundimage}/models/unet_blocks.py +0 -0
  25. {latentsync → soundimage}/models/utils.py +0 -0
  26. {latentsync → soundimage}/pipelines/lipsync_pipeline.py +0 -0
  27. {latentsync → soundimage}/trepa/__init__.py +0 -0
  28. {latentsync → soundimage}/trepa/third_party/VideoMAEv2/__init__.py +0 -0
  29. {latentsync → soundimage}/trepa/third_party/VideoMAEv2/utils.py +0 -0
  30. {latentsync → soundimage}/trepa/third_party/VideoMAEv2/videomaev2_finetune.py +0 -0
  31. {latentsync → soundimage}/trepa/third_party/VideoMAEv2/videomaev2_pretrain.py +0 -0
  32. {latentsync → soundimage}/trepa/third_party/__init__.py +0 -0
  33. {latentsync → soundimage}/trepa/utils/__init__.py +0 -0
  34. {latentsync → soundimage}/trepa/utils/data_utils.py +0 -0
  35. {latentsync → soundimage}/trepa/utils/metric_utils.py +0 -0
  36. {latentsync → soundimage}/utils/affine_transform.py +0 -0
  37. {latentsync → soundimage}/utils/audio.py +0 -0
  38. {latentsync → soundimage}/utils/av_reader.py +0 -0
  39. {latentsync → soundimage}/utils/image_processor.py +0 -0
  40. {latentsync → soundimage}/utils/util.py +0 -0
  41. {latentsync → soundimage}/whisper/audio2feature.py +0 -0
  42. {latentsync → soundimage}/whisper/whisper/__init__.py +0 -0
  43. {latentsync → soundimage}/whisper/whisper/__main__.py +0 -0
  44. {latentsync → soundimage}/whisper/whisper/assets/gpt2/merges.txt +0 -0
  45. {latentsync → soundimage}/whisper/whisper/assets/gpt2/special_tokens_map.json +0 -0
  46. {latentsync → soundimage}/whisper/whisper/assets/gpt2/tokenizer_config.json +0 -0
  47. {latentsync → soundimage}/whisper/whisper/assets/gpt2/vocab.json +0 -0
  48. {latentsync → soundimage}/whisper/whisper/assets/mel_filters.npz +0 -0
  49. {latentsync → soundimage}/whisper/whisper/assets/multilingual/added_tokens.json +0 -0
  50. {latentsync → soundimage}/whisper/whisper/assets/multilingual/merges.txt +0 -0
LICENSE DELETED
@@ -1,201 +0,0 @@
- Apache License
- Version 2.0, January 2004
- http://www.apache.org/licenses/
-
- TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
- 1. Definitions.
-
- "License" shall mean the terms and conditions for use, reproduction,
- and distribution as defined by Sections 1 through 9 of this document.
-
- "Licensor" shall mean the copyright owner or entity authorized by
- the copyright owner that is granting the License.
-
- "Legal Entity" shall mean the union of the acting entity and all
- other entities that control, are controlled by, or are under common
- control with that entity. For the purposes of this definition,
- "control" means (i) the power, direct or indirect, to cause the
- direction or management of such entity, whether by contract or
- otherwise, or (ii) ownership of fifty percent (50%) or more of the
- outstanding shares, or (iii) beneficial ownership of such entity.
-
- "You" (or "Your") shall mean an individual or Legal Entity
- exercising permissions granted by this License.
-
- "Source" form shall mean the preferred form for making modifications,
- including but not limited to software source code, documentation
- source, and configuration files.
-
- "Object" form shall mean any form resulting from mechanical
- transformation or translation of a Source form, including but
- not limited to compiled object code, generated documentation,
- and conversions to other media types.
-
- "Work" shall mean the work of authorship, whether in Source or
- Object form, made available under the License, as indicated by a
- copyright notice that is included in or attached to the work
- (an example is provided in the Appendix below).
-
- "Derivative Works" shall mean any work, whether in Source or Object
- form, that is based on (or derived from) the Work and for which the
- editorial revisions, annotations, elaborations, or other modifications
- represent, as a whole, an original work of authorship. For the purposes
- of this License, Derivative Works shall not include works that remain
- separable from, or merely link (or bind by name) to the interfaces of,
- the Work and Derivative Works thereof.
-
- "Contribution" shall mean any work of authorship, including
- the original version of the Work and any modifications or additions
- to that Work or Derivative Works thereof, that is intentionally
- submitted to Licensor for inclusion in the Work by the copyright owner
- or by an individual or Legal Entity authorized to submit on behalf of
- the copyright owner. For the purposes of this definition, "submitted"
- means any form of electronic, verbal, or written communication sent
- to the Licensor or its representatives, including but not limited to
- communication on electronic mailing lists, source code control systems,
- and issue tracking systems that are managed by, or on behalf of, the
- Licensor for the purpose of discussing and improving the Work, but
- excluding communication that is conspicuously marked or otherwise
- designated in writing by the copyright owner as "Not a Contribution."
-
- "Contributor" shall mean Licensor and any individual or Legal Entity
- on behalf of whom a Contribution has been received by Licensor and
- subsequently incorporated within the Work.
-
- 2. Grant of Copyright License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- copyright license to reproduce, prepare Derivative Works of,
- publicly display, publicly perform, sublicense, and distribute the
- Work and such Derivative Works in Source or Object form.
-
- 3. Grant of Patent License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- (except as stated in this section) patent license to make, have made,
- use, offer to sell, sell, import, and otherwise transfer the Work,
- where such license applies only to those patent claims licensable
- by such Contributor that are necessarily infringed by their
- Contribution(s) alone or by combination of their Contribution(s)
- with the Work to which such Contribution(s) was submitted. If You
- institute patent litigation against any entity (including a
- cross-claim or counterclaim in a lawsuit) alleging that the Work
- or a Contribution incorporated within the Work constitutes direct
- or contributory patent infringement, then any patent licenses
- granted to You under this License for that Work shall terminate
- as of the date such litigation is filed.
-
- 4. Redistribution. You may reproduce and distribute copies of the
- Work or Derivative Works thereof in any medium, with or without
- modifications, and in Source or Object form, provided that You
- meet the following conditions:
-
- (a) You must give any other recipients of the Work or
- Derivative Works a copy of this License; and
-
- (b) You must cause any modified files to carry prominent notices
- stating that You changed the files; and
-
- (c) You must retain, in the Source form of any Derivative Works
- that You distribute, all copyright, patent, trademark, and
- attribution notices from the Source form of the Work,
- excluding those notices that do not pertain to any part of
- the Derivative Works; and
-
- (d) If the Work includes a "NOTICE" text file as part of its
- distribution, then any Derivative Works that You distribute must
- include a readable copy of the attribution notices contained
- within such NOTICE file, excluding those notices that do not
- pertain to any part of the Derivative Works, in at least one
- of the following places: within a NOTICE text file distributed
- as part of the Derivative Works; within the Source form or
- documentation, if provided along with the Derivative Works; or,
- within a display generated by the Derivative Works, if and
- wherever such third-party notices normally appear. The contents
- of the NOTICE file are for informational purposes only and
- do not modify the License. You may add Your own attribution
- notices within Derivative Works that You distribute, alongside
- or as an addendum to the NOTICE text from the Work, provided
- that such additional attribution notices cannot be construed
- as modifying the License.
-
- You may add Your own copyright statement to Your modifications and
- may provide additional or different license terms and conditions
- for use, reproduction, or distribution of Your modifications, or
- for any such Derivative Works as a whole, provided Your use,
- reproduction, and distribution of the Work otherwise complies with
- the conditions stated in this License.
-
- 5. Submission of Contributions. Unless You explicitly state otherwise,
- any Contribution intentionally submitted for inclusion in the Work
- by You to the Licensor shall be under the terms and conditions of
- this License, without any additional terms or conditions.
- Notwithstanding the above, nothing herein shall supersede or modify
- the terms of any separate license agreement you may have executed
- with Licensor regarding such Contributions.
-
- 6. Trademarks. This License does not grant permission to use the trade
- names, trademarks, service marks, or product names of the Licensor,
- except as required for reasonable and customary use in describing the
- origin of the Work and reproducing the content of the NOTICE file.
-
- 7. Disclaimer of Warranty. Unless required by applicable law or
- agreed to in writing, Licensor provides the Work (and each
- Contributor provides its Contributions) on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied, including, without limitation, any warranties or conditions
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
- PARTICULAR PURPOSE. You are solely responsible for determining the
- appropriateness of using or redistributing the Work and assume any
- risks associated with Your exercise of permissions under this License.
-
- 8. Limitation of Liability. In no event and under no legal theory,
- whether in tort (including negligence), contract, or otherwise,
- unless required by applicable law (such as deliberate and grossly
- negligent acts) or agreed to in writing, shall any Contributor be
- liable to You for damages, including any direct, indirect, special,
- incidental, or consequential damages of any character arising as a
- result of this License or out of the use or inability to use the
- Work (including but not limited to damages for loss of goodwill,
- work stoppage, computer failure or malfunction, or any and all
- other commercial damages or losses), even if such Contributor
- has been advised of the possibility of such damages.
-
- 9. Accepting Warranty or Additional Liability. While redistributing
- the Work or Derivative Works thereof, You may choose to offer,
- and charge a fee for, acceptance of support, warranty, indemnity,
- or other liability obligations and/or rights consistent with this
- License. However, in accepting such obligations, You may act only
- on Your own behalf and on Your sole responsibility, not on behalf
- of any other Contributor, and only if You agree to indemnify,
- defend, and hold each Contributor harmless for any liability
- incurred by, or claims asserted against, such Contributor by reason
- of your accepting any such warranty or additional liability.
-
- END OF TERMS AND CONDITIONS
-
- APPENDIX: How to apply the Apache License to your work.
-
- To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "[]"
- replaced with your own identifying information. (Don't include
- the brackets!) The text should be enclosed in the appropriate
- comment syntax for the file format. We also recommend that a
- file or class name and description of purpose be included on the
- same "printed page" as the copyright notice for easier
- identification within third-party archives.
-
- Copyright [yyyy] [name of copyright owner]
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
app.py CHANGED
@@ -10,8 +10,7 @@ import os
  CONFIG_PATH = Path("configs/unet/second_stage.yaml")
  CHECKPOINT_PATH = Path("checkpoints/latentsync_unet.pt")

- # subprocess.run(["huggingface-cli", "download", "Hyathi/LatentSync", "--local-dir", "checkpoints", "--exclude", "*.git*", "README.md"])
- subprocess.run(["huggingface-cli", "download", "Hyathi/LatentSync", "--local-dir", "checkpoints", "--exclude", "*.git*", "README.md", "--token", os.environ["HF_TOKEN"]])
+ subprocess.run(["huggingface-cli", "download", "Hyathi/SoundImage-LipSync", "--local-dir", "checkpoints", "--exclude", "*.git*", "README.md", "--token", os.environ["HF_TOKEN"]])

  def process_video(
      video_path,
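Review note on the app.py change: the new call indexes os.environ["HF_TOKEN"] directly, so an unset variable raises a bare KeyError, and subprocess.run without check=True does not raise when the download command exits with an error. The sketch below shows one more defensive variant. It assumes the same CLI arguments that appear in the diff; the helper names build_download_command and download_checkpoints are hypothetical and are not part of the repository.

```python
import os
import subprocess


def build_download_command(repo_id, local_dir, token):
    """Return the huggingface-cli argument list, mirroring the call in app.py."""
    return [
        "huggingface-cli", "download", repo_id,
        "--local-dir", local_dir,
        "--exclude", "*.git*", "README.md",
        "--token", token,
    ]


def download_checkpoints(repo_id="Hyathi/SoundImage-LipSync", local_dir="checkpoints"):
    """Download the checkpoints, failing loudly on a missing token or a CLI error."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        # Clearer than the KeyError raised by os.environ["HF_TOKEN"].
        raise RuntimeError("HF_TOKEN is not set; it is needed to download the checkpoints")
    # check=True turns a non-zero exit status into CalledProcessError.
    subprocess.run(build_download_command(repo_id, local_dir, token), check=True)
```

Passing the command as a list, as the original does, avoids shell quoting issues with the "*.git*" pattern.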
data_processing_pipeline.sh DELETED
@@ -1,9 +0,0 @@
- #!/bin/bash
-
- python -m preprocess.data_processing_pipeline \
-     --total_num_workers 20 \
-     --per_gpu_num_workers 10 \
-     --resolution 256 \
-     --sync_conf_threshold 3 \
-     --temp_dir temp \
-     --input_dir /mnt/bn/maliva-gen-ai-v2/chunyu.li/VoxCeleb2/raw
eval/eval_sync_conf.py CHANGED
@@ -18,7 +18,7 @@ import tqdm
  from statistics import fmean
  from eval.syncnet import SyncNetEval
  from eval.syncnet_detect import SyncNetDetector
- from latentsync.utils.util import red_text
+ from soundimage.utils.util import red_text
  import torch
eval/eval_syncnet_acc.py CHANGED
@@ -17,8 +17,8 @@ from tqdm.auto import tqdm
  import torch
  import torch.nn as nn
  from einops import rearrange
- from latentsync.models.syncnet import SyncNet
- from latentsync.data.syncnet_dataset import SyncNetDataset
+ from soundimage.models.syncnet import SyncNet
+ from soundimage.data.syncnet_dataset import SyncNetDataset
  from diffusers import AutoencoderKL
  from omegaconf import OmegaConf
  from accelerate.utils import set_seed
inference.sh DELETED
@@ -1,9 +0,0 @@
- #!/bin/bash
-
- python -m scripts.inference \
-     --unet_config_path "configs/unet/second_stage.yaml" \
-     --inference_ckpt_path "checkpoints/latentsync_unet.pt" \
-     --guidance_scale 1.0 \
-     --video_path "assets/demo1_video.mp4" \
-     --audio_path "assets/demo1_audio.wav" \
-     --video_out_path "video_out.mp4"
latentsync/utils/mask.png DELETED
Binary file (1.87 kB)
 
preprocess/affine_transform.py CHANGED
@@ -12,8 +12,8 @@
  # See the License for the specific language governing permissions and
  # limitations under the License.

- from latentsync.utils.util import read_video, write_video
- from latentsync.utils.image_processor import ImageProcessor
+ from soundimage.utils.util import read_video, write_video
+ from soundimage.utils.image_processor import ImageProcessor
  import torch
  from einops import rearrange
  import os
preprocess/filter_high_resolution.py CHANGED
@@ -13,7 +13,7 @@
  # limitations under the License.

  import mediapipe as mp
- from latentsync.utils.util import read_video
+ from soundimage.utils.util import read_video
  import os
  import tqdm
  import shutil
preprocess/remove_broken_videos.py CHANGED
@@ -16,8 +16,8 @@ import os
  from multiprocessing import Pool
  import tqdm

- from latentsync.utils.av_reader import AVReader
- from latentsync.utils.util import gather_video_paths_recursively
+ from soundimage.utils.av_reader import AVReader
+ from soundimage.utils.util import gather_video_paths_recursively


  def remove_broken_video(video_path):
preprocess/remove_incorrect_affined.py CHANGED
@@ -13,7 +13,7 @@
  # limitations under the License.

  import mediapipe as mp
- from latentsync.utils.util import read_video, gather_video_paths_recursively
+ from soundimage.utils.util import read_video, gather_video_paths_recursively
  import os
  import tqdm
  from multiprocessing import Pool
scripts/inference.py CHANGED
@@ -16,11 +16,11 @@ import argparse
  from omegaconf import OmegaConf
  import torch
  from diffusers import AutoencoderKL, DDIMScheduler
- from latentsync.models.unet import UNet3DConditionModel
- from latentsync.pipelines.lipsync_pipeline import LipsyncPipeline
+ from soundimage.models.unet import UNet3DConditionModel
+ from soundimage.pipelines.lipsync_pipeline import LipsyncPipeline
  from diffusers.utils.import_utils import is_xformers_available
  from accelerate.utils import set_seed
- from latentsync.whisper.audio2feature import Audio2Feature
+ from soundimage.whisper.audio2feature import Audio2Feature


  def main(config, args):
scripts/train_syncnet.py CHANGED
@@ -18,10 +18,10 @@ import logging
  from omegaconf import OmegaConf
  import shutil

- from latentsync.data.syncnet_dataset import SyncNetDataset
- from latentsync.models.syncnet import SyncNet
- from latentsync.models.syncnet_wav2lip import SyncNetWav2Lip
- from latentsync.utils.util import gather_loss, plot_loss_chart
+ from soundimage.data.syncnet_dataset import SyncNetDataset
+ from soundimage.models.syncnet import SyncNet
+ from soundimage.models.syncnet_wav2lip import SyncNetWav2Lip
+ from soundimage.utils.util import gather_loss, plot_loss_chart
  from accelerate.utils import set_seed

  import torch
@@ -31,7 +31,7 @@ from einops import rearrange
  import torch.distributed as dist
  from torch.nn.parallel import DistributedDataParallel as DDP
  from torch.utils.data.distributed import DistributedSampler
- from latentsync.utils.util import init_dist, cosine_loss
+ from soundimage.utils.util import init_dist, cosine_loss

  logger = get_logger(__name__)
scripts/train_unet.py CHANGED
@@ -36,18 +36,18 @@ from diffusers.optimization import get_scheduler
  from diffusers.utils.import_utils import is_xformers_available
  from accelerate.utils import set_seed

- from latentsync.data.unet_dataset import UNetDataset
- from latentsync.models.unet import UNet3DConditionModel
- from latentsync.models.syncnet import SyncNet
- from latentsync.pipelines.lipsync_pipeline import LipsyncPipeline
- from latentsync.utils.util import (
+ from soundimage.data.unet_dataset import UNetDataset
+ from soundimage.models.unet import UNet3DConditionModel
+ from soundimage.models.syncnet import SyncNet
+ from soundimage.pipelines.lipsync_pipeline import LipsyncPipeline
+ from soundimage.utils.util import (
      init_dist,
      cosine_loss,
      reversed_forward,
  )
- from latentsync.utils.util import plot_loss_chart, gather_loss
- from latentsync.whisper.audio2feature import Audio2Feature
- from latentsync.trepa import TREPALoss
+ from soundimage.utils.util import plot_loss_chart, gather_loss
+ from soundimage.whisper.audio2feature import Audio2Feature
+ from soundimage.trepa import TREPALoss
  from eval.syncnet import SyncNetEval
  from eval.syncnet_detect import SyncNetDetector
  from eval.eval_sync_conf import syncnet_eval
setup_env.sh DELETED
@@ -1,23 +0,0 @@
- #!/bin/bash
-
- # Create a new conda environment
- conda create -y -n latentsync python=3.10.13
- conda activate latentsync
-
- # Install ffmpeg
- conda install -y -c conda-forge ffmpeg
-
- # Python dependencies
- pip install -r requirements.txt
-
- # OpenCV dependencies
- sudo apt -y install libgl1
-
- # Download all the checkpoints from HuggingFace
- huggingface-cli download Hyathi/LatentSync --local-dir checkpoints --exclude "*.git*" "README.md"
-
- # Soft links for the auxiliary models
- mkdir -p ~/.cache/torch/hub/checkpoints
- ln -s $(pwd)/checkpoints/auxiliary/2DFAN4-cd938726ad.zip ~/.cache/torch/hub/checkpoints/2DFAN4-cd938726ad.zip
- ln -s $(pwd)/checkpoints/auxiliary/s3fd-619a316812.pth ~/.cache/torch/hub/checkpoints/s3fd-619a316812.pth
- ln -s $(pwd)/checkpoints/auxiliary/vgg16-397923af.pth ~/.cache/torch/hub/checkpoints/vgg16-397923af.pth
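Review note on the setup_env.sh deletion: the removed script also linked three auxiliary checkpoints into the torch hub cache. If that step is still required after this commit (the diff does not say), it could be reproduced in Python roughly as follows. This is a sketch under assumptions: the function name link_auxiliary_checkpoints is hypothetical, and the file names and directories are copied from the deleted script.

```python
from pathlib import Path

# File names taken from the deleted setup_env.sh.
AUXILIARY_FILES = [
    "2DFAN4-cd938726ad.zip",
    "s3fd-619a316812.pth",
    "vgg16-397923af.pth",
]


def link_auxiliary_checkpoints(source_dir, cache_dir):
    """Symlink each auxiliary checkpoint from source_dir into cache_dir.

    Returns the list of links that were newly created.
    """
    source_dir = Path(source_dir).resolve()
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    created = []
    for name in AUXILIARY_FILES:
        link = cache_dir / name
        if link.exists() or link.is_symlink():
            continue  # unlike a plain `ln -s`, skip existing entries instead of failing
        link.symlink_to(source_dir / name)  # equivalent of `ln -s <source> <link>`
        created.append(link)
    return created
```

To mirror the original script, the call would be link_auxiliary_checkpoints("checkpoints/auxiliary", "~/.cache/torch/hub/checkpoints"), run from the repository root.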
{latentsync → soundimage}/data/syncnet_dataset.py RENAMED
File without changes
{latentsync → soundimage}/data/unet_dataset.py RENAMED
File without changes
{latentsync → soundimage}/models/attention.py RENAMED
File without changes
{latentsync → soundimage}/models/motion_module.py RENAMED
File without changes
{latentsync → soundimage}/models/resnet.py RENAMED
File without changes
{latentsync → soundimage}/models/syncnet.py RENAMED
File without changes
{latentsync → soundimage}/models/syncnet_wav2lip.py RENAMED
File without changes
{latentsync → soundimage}/models/unet.py RENAMED
File without changes
{latentsync → soundimage}/models/unet_blocks.py RENAMED
File without changes
{latentsync → soundimage}/models/utils.py RENAMED
File without changes
{latentsync → soundimage}/pipelines/lipsync_pipeline.py RENAMED
File without changes
{latentsync → soundimage}/trepa/__init__.py RENAMED
File without changes
{latentsync → soundimage}/trepa/third_party/VideoMAEv2/__init__.py RENAMED
File without changes
{latentsync → soundimage}/trepa/third_party/VideoMAEv2/utils.py RENAMED
File without changes
{latentsync → soundimage}/trepa/third_party/VideoMAEv2/videomaev2_finetune.py RENAMED
File without changes
{latentsync → soundimage}/trepa/third_party/VideoMAEv2/videomaev2_pretrain.py RENAMED
File without changes
{latentsync → soundimage}/trepa/third_party/__init__.py RENAMED
File without changes
{latentsync → soundimage}/trepa/utils/__init__.py RENAMED
File without changes
{latentsync → soundimage}/trepa/utils/data_utils.py RENAMED
File without changes
{latentsync → soundimage}/trepa/utils/metric_utils.py RENAMED
File without changes
{latentsync → soundimage}/utils/affine_transform.py RENAMED
File without changes
{latentsync → soundimage}/utils/audio.py RENAMED
File without changes
{latentsync → soundimage}/utils/av_reader.py RENAMED
File without changes
{latentsync → soundimage}/utils/image_processor.py RENAMED
File without changes
{latentsync → soundimage}/utils/util.py RENAMED
File without changes
{latentsync → soundimage}/whisper/audio2feature.py RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/__init__.py RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/__main__.py RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/gpt2/merges.txt RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/gpt2/special_tokens_map.json RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/gpt2/tokenizer_config.json RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/gpt2/vocab.json RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/mel_filters.npz RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/multilingual/added_tokens.json RENAMED
File without changes
{latentsync → soundimage}/whisper/whisper/assets/multilingual/merges.txt RENAMED
File without changes
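Review note on the package rename: because this view is truncated to 50 files, it cannot confirm that every import of the old latentsync package was updated. A small check along the following lines could list any that remain. It is a sketch, not part of the commit: the function name find_stale_imports is hypothetical, and it only looks at `import` / `from` statements in .py files, not at strings such as the checkpoints/latentsync_unet.pt path.

```python
import re
from pathlib import Path

# Matches "import latentsync" or "from latentsync..." at the start of a line.
OLD_IMPORT = re.compile(r"^[ \t]*(?:from|import)[ \t]+latentsync(?:\.|\s|$)", re.MULTILINE)


def find_stale_imports(root):
    """Return (path, line_number) pairs for imports of the old package name."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in OLD_IMPORT.finditer(text):
            # Line number = newlines before the match, plus one.
            line_no = text.count("\n", 0, match.start()) + 1
            hits.append((str(path), line_no))
    return hits
```

Running find_stale_imports(".") from the repository root and getting an empty list would suggest that the import side of the rename is complete.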