fixing merge conflicts.
This view is limited to 50 files because it contains too many changes.
- .gitattributes +3 -0
- MakeItTalk/animated.py +0 -277
- MakeItTalk/marlene_test.ipynb +0 -0
- MakeItTalk/thirdparty/AdaptiveWingLoss/core/models.py +228 -228
- download.py +17 -17
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/.gitignore +8 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/LICENSE +201 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/README.md +82 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__init__.py +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__pycache__/__init__.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__pycache__/__init__.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/ckpt/.gitkeep +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__init__.py +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/__init__.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/__init__.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/coord_conv.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/coord_conv.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/models.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/models.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/coord_conv.py +157 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/dataloader.py +368 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/evaler.py +151 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/models.py +228 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/eval.py +77 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw.png +3 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw_table.png +3 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/requirements.txt +12 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/scripts/eval_wflw.sh +10 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__init__.py +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/__init__.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/__init__.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/utils.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/utils.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/utils.py +354 -0
- marlenezw/audio-driven-animations/MakeItTalk/__init__.py +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/__pycache__/__init__.cpython-37.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/__pycache__/__init__.cpython-39.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/CODEOWNERS +1 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/LICENCE.txt +21 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/README.md +98 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__init__.py +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__init__.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/__init__.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/data_loading_functions.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/deep_heatmaps_model_fusion_net.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/deformation_functions.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/logging_functions.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/menpo_functions.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/ops.cpython-36.pyc +0 -0
- marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/pdm_clm_functions.cpython-36.pyc +0 -0
.gitattributes
CHANGED
@@ -34,3 +34,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 marlenezw/audio-driven-animations/MakeItTalk/examples/ckpt filter=lfs diff=lfs merge=lfs -text
 MakeItTalk/examples/ckpt filter=lfs diff=lfs merge=lfs -text
+marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw.png filter=lfs diff=lfs merge=lfs -text
+marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw_table.png filter=lfs diff=lfs merge=lfs -text
+marlenezw/audio-driven-animations/MakeItTalk/face_of_art/old/teaser.png filter=lfs diff=lfs merge=lfs -text
MakeItTalk/animated.py
DELETED
@@ -1,277 +0,0 @@

# To add a new cell, type '# %%'
# To add a new markdown cell, type '# %% [markdown]'
# %%
import torch

# this ensures that the current MacOS version is at least 12.3+
print(torch.backends.mps.is_available())
# this ensures that the current current PyTorch installation was built with MPS activated.
print(torch.backends.mps.is_built())


# %%
import ipywidgets as widgets
import glob
import matplotlib.pyplot as plt
print("Choose the image name to animate: (saved in folder 'MakeItTalk/examples/')")
img_list = glob.glob1('MakeItTalk/examples', '*.jpg')
img_list.sort()
img_list = [item.split('.')[0] for item in img_list]
default_head_name = widgets.Dropdown(options=img_list, value='marlene_v2')
def on_change(change):
    if change['type'] == 'change' and change['name'] == 'value':
        plt.imshow(plt.imread('MakeItTalk/examples/{}.jpg'.format(default_head_name.value)))
        plt.axis('off')
        plt.show()
default_head_name.observe(on_change)
display(default_head_name)
plt.imshow(plt.imread('MakeItTalk/examples/{}.jpg'.format(default_head_name.value)))
plt.axis('off')
plt.show()


# %%
#@markdown # Animation Controllers
#@markdown Amplify the lip motion in horizontal direction
AMP_LIP_SHAPE_X = 2 #@param {type:"slider", min:0.5, max:5.0, step:0.1}

#@markdown Amplify the lip motion in vertical direction
AMP_LIP_SHAPE_Y = 2 #@param {type:"slider", min:0.5, max:5.0, step:0.1}

#@markdown Amplify the head pose motion (usually smaller than 1.0, put it to 0. for a static head pose)
AMP_HEAD_POSE_MOTION = 0.35 #@param {type:"slider", min:0.0, max:1.0, step:0.05}

#@markdown Add naive eye blink
ADD_NAIVE_EYE = True #@param ["False", "True"] {type:"raw"}

#@markdown If your image has an opened mouth, put this as True, else False
CLOSE_INPUT_FACE_MOUTH = True #@param ["False", "True"] {type:"raw"}


#@markdown # Landmark Adjustment

#@markdown Adjust upper lip thickness (postive value means thicker)
UPPER_LIP_ADJUST = -1 #@param {type:"slider", min:-3.0, max:3.0, step:1.0}

#@markdown Adjust lower lip thickness (postive value means thicker)
LOWER_LIP_ADJUST = -1 #@param {type:"slider", min:-3.0, max:3.0, step:1.0}

#@markdown Adjust static lip width (in multipication)
LIP_WIDTH_ADJUST = 1.0 #@param {type:"slider", min:0.8, max:1.2, step:0.01}


# %%
import sys
sys.path.append("thirdparty/AdaptiveWingLoss")
import os, glob
import numpy as np
import cv2
import argparse
from src.approaches.train_image_translation import Image_translation_block
import torch
import pickle
import face_alignment
from face_alignment import face_alignment
from src.autovc.AutoVC_mel_Convertor_retrain_version import AutoVC_mel_Convertor
import shutil
import time
import util.utils as util
from scipy.signal import savgol_filter
from src.approaches.train_audio2landmark import Audio2landmark_model


# %%
sys.stdout = open(os.devnull, 'a')

parser = argparse.ArgumentParser()
parser.add_argument('--jpg', type=str, default='{}.jpg'.format(default_head_name.value))
parser.add_argument('--close_input_face_mouth', default=CLOSE_INPUT_FACE_MOUTH, action='store_true')
parser.add_argument('--load_AUTOVC_name', type=str, default='MakeItTalk/examples/ckpt/ckpt_autovc.pth')
parser.add_argument('--load_a2l_G_name', type=str, default='MakeItTalk/examples/ckpt/ckpt_speaker_branch.pth')
parser.add_argument('--load_a2l_C_name', type=str, default='MakeItTalk/examples/ckpt/ckpt_content_branch.pth') #ckpt_audio2landmark_c.pth')
parser.add_argument('--load_G_name', type=str, default='MakeItTalk/examples/ckpt/ckpt_116_i2i_comb.pth') #ckpt_image2image.pth') #ckpt_i2i_finetune_150.pth') #c
parser.add_argument('--amp_lip_x', type=float, default=AMP_LIP_SHAPE_X)
parser.add_argument('--amp_lip_y', type=float, default=AMP_LIP_SHAPE_Y)
parser.add_argument('--amp_pos', type=float, default=AMP_HEAD_POSE_MOTION)
parser.add_argument('--reuse_train_emb_list', type=str, nargs='+', default=[]) # ['iWeklsXc0H8']) #['45hn7-LXDX8']) #['E_kmpT-EfOg']) #'iWeklsXc0H8', '29k8RtSUjE0', '45hn7-LXDX8',
parser.add_argument('--add_audio_in', default=False, action='store_true')
parser.add_argument('--comb_fan_awing', default=False, action='store_true')
parser.add_argument('--output_folder', type=str, default='MakeItTalk/examples')
parser.add_argument('--test_end2end', default=True, action='store_true')
parser.add_argument('--dump_dir', type=str, default='', help='')
parser.add_argument('--pos_dim', default=7, type=int)
parser.add_argument('--use_prior_net', default=True, action='store_true')
parser.add_argument('--transformer_d_model', default=32, type=int)
parser.add_argument('--transformer_N', default=2, type=int)
parser.add_argument('--transformer_heads', default=2, type=int)
parser.add_argument('--spk_emb_enc_size', default=16, type=int)
parser.add_argument('--init_content_encoder', type=str, default='')
parser.add_argument('--lr', type=float, default=1e-3, help='learning rate')
parser.add_argument('--reg_lr', type=float, default=1e-6, help='weight decay')
parser.add_argument('--write', default=False, action='store_true')
parser.add_argument('--segment_batch_size', type=int, default=1, help='batch size')
parser.add_argument('--emb_coef', default=3.0, type=float)
parser.add_argument('--lambda_laplacian_smooth_loss', default=1.0, type=float)
parser.add_argument('--use_11spk_only', default=False, action='store_true')
parser.add_argument('-f')
opt_parser = parser.parse_args()


# %%
img = cv2.imread('MakeItTalk/examples/' + opt_parser.jpg)
plt.imshow(img)


# %%
predictor = face_alignment.FaceAlignment(face_alignment.LandmarksType._3D, device='mps', flip_input=True)
shapes = predictor.get_landmarks(img)
if (not shapes or len(shapes) != 1):
    print('Cannot detect face landmarks. Exit.')
    exit(-1)
shape_3d = shapes[0]


# %%
if(opt_parser.close_input_face_mouth):
    util.close_input_face_mouth(shape_3d)
shape_3d[48:, 0] = (shape_3d[48:, 0] - np.mean(shape_3d[48:, 0])) * LIP_WIDTH_ADJUST + np.mean(shape_3d[48:, 0]) # wider lips
shape_3d[49:54, 1] -= UPPER_LIP_ADJUST # thinner upper lip
shape_3d[55:60, 1] += LOWER_LIP_ADJUST # thinner lower lip
shape_3d[[37,38,43,44], 1] -=2. # larger eyes
shape_3d[[40,41,46,47], 1] +=2. # larger eyes
shape_3d, scale, shift = util.norm_input_face(shape_3d)

print("Loaded Image...", file=sys.stderr)


# %%
au_data = []
au_emb = []
ains = glob.glob1('MakeItTalk/examples', '*.wav')
ains = [item for item in ains if item != 'tmp.wav']
ains.sort()
for ain in ains:
    os.system('ffmpeg -y -loglevel error -i MakeItTalk/examples/{} -ar 16000 MakeItTalk/examples/tmp.wav'.format(ain))
    shutil.copyfile('MakeItTalk/examples/tmp.wav', 'MakeItTalk/examples/{}'.format(ain))

    # au embedding
    from thirdparty.resemblyer_util.speaker_emb import get_spk_emb
    me, ae = get_spk_emb('MakeItTalk/examples/{}'.format(ain))
    au_emb.append(me.reshape(-1))

    print('Processing audio file', ain)
    c = AutoVC_mel_Convertor('MakeItTalk/examples')

    au_data_i = c.convert_single_wav_to_autovc_input(audio_filename=os.path.join('MakeItTalk/examples', ain),
                                                     autovc_model_path=opt_parser.load_AUTOVC_name)
    au_data += au_data_i
if(os.path.isfile('MakeItTalk/examples/tmp.wav')):
    os.remove('MakeItTalk/examples/tmp.wav')

print("Loaded audio...", file=sys.stderr)



# %%
# landmark fake placeholder
fl_data = []
rot_tran, rot_quat, anchor_t_shape = [], [], []
for au, info in au_data:
    au_length = au.shape[0]
    fl = np.zeros(shape=(au_length, 68 * 3))
    fl_data.append((fl, info))
    rot_tran.append(np.zeros(shape=(au_length, 3, 4)))
    rot_quat.append(np.zeros(shape=(au_length, 4)))
    anchor_t_shape.append(np.zeros(shape=(au_length, 68 * 3)))

if(os.path.exists(os.path.join('MakeItTalk/examples', 'dump', 'random_val_fl.pickle'))):
    os.remove(os.path.join('MakeItTalk/examples', 'dump', 'random_val_fl.pickle'))
if(os.path.exists(os.path.join('MakeItTalk/examples', 'dump', 'random_val_fl_interp.pickle'))):
    os.remove(os.path.join('MakeItTalk/examples', 'dump', 'random_val_fl_interp.pickle'))
if(os.path.exists(os.path.join('MakeItTalk/examples', 'dump', 'random_val_au.pickle'))):
    os.remove(os.path.join('MakeItTalk/examples', 'dump', 'random_val_au.pickle'))
if (os.path.exists(os.path.join('MakeItTalk/examples', 'dump', 'random_val_gaze.pickle'))):
    os.remove(os.path.join('MakeItTalk/examples', 'dump', 'random_val_gaze.pickle'))

with open(os.path.join('MakeItTalk/examples', 'dump', 'random_val_fl.pickle'), 'wb') as fp:
    pickle.dump(fl_data, fp)
with open(os.path.join('MakeItTalk/examples', 'dump', 'random_val_au.pickle'), 'wb') as fp:
    pickle.dump(au_data, fp)
with open(os.path.join('MakeItTalk/examples', 'dump', 'random_val_gaze.pickle'), 'wb') as fp:
    gaze = {'rot_trans':rot_tran, 'rot_quat':rot_quat, 'anchor_t_shape':anchor_t_shape}
    pickle.dump(gaze, fp)


# %%
model = Audio2landmark_model(opt_parser, jpg_shape=shape_3d)
if(len(opt_parser.reuse_train_emb_list) == 0):
    model.test(au_emb=au_emb)
else:
    model.test(au_emb=None)

print("Audio->Landmark...", file=sys.stderr)


# %%
fls = glob.glob1('MakeItTalk/examples', 'pred_fls_*.txt')
fls.sort()

for i in range(0,len(fls)):
    fl = np.loadtxt(os.path.join('MakeItTalk/examples', fls[i])).reshape((-1, 68,3))
    print(fls[i])
    fl[:, :, 0:2] = -fl[:, :, 0:2]
    fl[:, :, 0:2] = fl[:, :, 0:2] / scale - shift

    if (ADD_NAIVE_EYE):
        fl = util.add_naive_eye(fl)

    # additional smooth
    fl = fl.reshape((-1, 204))
    fl[:, :48 * 3] = savgol_filter(fl[:, :48 * 3], 15, 3, axis=0)
    fl[:, 48*3:] = savgol_filter(fl[:, 48*3:], 5, 3, axis=0)
    fl = fl.reshape((-1, 68, 3))

    ''' STEP 6: Imag2image translation '''
    model = Image_translation_block(opt_parser, single_test=True)
    with torch.no_grad():
        model.single_test(jpg=img, fls=fl, filename=fls[i], prefix=opt_parser.jpg.split('.')[0])
        print('finish image2image gen')
    os.remove(os.path.join('MakeItTalk/examples', fls[i]))

    print("{} / {}: Landmark->Face...".format(i+1, len(fls)), file=sys.stderr)
print("Done!", file=sys.stderr)

# %% [markdown]
# # Generated video from image and sound clip

# %%
from IPython.display import Video

Video("MakeItTalk/examples/marlenes_v1.mp4")


# %%



# %%
from IPython.display import HTML
from base64 import b64encode

for ain in ains:
    OUTPUT_MP4_NAME = '{}_pred_fls_{}_audio_embed.mp4'.format(
        opt_parser.jpg.split('.')[0],
        ain.split('.')[0]
        )
    mp4 = open('MakeItTalk/examples/{}'.format(OUTPUT_MP4_NAME),'rb').read()
    data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

    print('Display animation: MakeItTalk/examples/{}'.format(OUTPUT_MP4_NAME), file=sys.stderr)
    display(HTML("""
    <video width=600 controls>
          <source src="%s" type="video/mp4">
    </video>
    """ % data_url))
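The "additional smooth" step in the deleted notebook above flattens each frame's 68 x 3 landmarks into a (frames, 204) array and runs a Savitzky-Golay filter along the time axis, with a wider window for the first 48 contour points than for the mouth points. A minimal, self-contained sketch of just that step on made-up dummy landmarks (the frame count and data are illustrative, not taken from the diff):

import numpy as np
from scipy.signal import savgol_filter

# Dummy per-frame 3D landmarks standing in for the predicted ones.
frames = 100
fl = np.random.randn(frames, 68, 3)

fl = fl.reshape((-1, 204))                                      # (frames, 68 * 3)
fl[:, :48 * 3] = savgol_filter(fl[:, :48 * 3], 15, 3, axis=0)   # contour points: window 15, order 3
fl[:, 48 * 3:] = savgol_filter(fl[:, 48 * 3:], 5, 3, axis=0)    # mouth points: window 5, order 3
fl = fl.reshape((-1, 68, 3))

print(fl.shape)  # (100, 68, 3)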
MakeItTalk/marlene_test.ipynb
DELETED
The diff for this file is too large to render.
MakeItTalk/thirdparty/AdaptiveWingLoss/core/models.py
CHANGED
@@ -1,228 +1,228 @@
(All 228 lines are shown as removed and re-added with identical content; the file reads as follows.)

import torch
import torch.nn as nn
import torch.nn.functional as F
import math
from thirdparty.AdaptiveWingLoss.core.coord_conv import CoordConvTh


def conv3x3(in_planes, out_planes, strd=1, padding=1,
            bias=False,dilation=1):
    "3x3 convolution with padding"
    return nn.Conv2d(in_planes, out_planes, kernel_size=3,
                     stride=strd, padding=padding, bias=bias,
                     dilation=dilation)

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        # self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        # self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        # out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        # out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out

class ConvBlock(nn.Module):
    def __init__(self, in_planes, out_planes):
        super(ConvBlock, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = conv3x3(in_planes, int(out_planes / 2))
        self.bn2 = nn.BatchNorm2d(int(out_planes / 2))
        self.conv2 = conv3x3(int(out_planes / 2), int(out_planes / 4),
                             padding=1, dilation=1)
        self.bn3 = nn.BatchNorm2d(int(out_planes / 4))
        self.conv3 = conv3x3(int(out_planes / 4), int(out_planes / 4),
                             padding=1, dilation=1)

        if in_planes != out_planes:
            self.downsample = nn.Sequential(
                nn.BatchNorm2d(in_planes),
                nn.ReLU(True),
                nn.Conv2d(in_planes, out_planes,
                          kernel_size=1, stride=1, bias=False),
            )
        else:
            self.downsample = None

    def forward(self, x):
        residual = x

        out1 = self.bn1(x)
        out1 = F.relu(out1, True)
        out1 = self.conv1(out1)

        out2 = self.bn2(out1)
        out2 = F.relu(out2, True)
        out2 = self.conv2(out2)

        out3 = self.bn3(out2)
        out3 = F.relu(out3, True)
        out3 = self.conv3(out3)

        out3 = torch.cat((out1, out2, out3), 1)

        if self.downsample is not None:
            residual = self.downsample(residual)

        out3 += residual

        return out3

class HourGlass(nn.Module):
    def __init__(self, num_modules, depth, num_features, first_one=False):
        super(HourGlass, self).__init__()
        self.num_modules = num_modules
        self.depth = depth
        self.features = num_features
        self.coordconv = CoordConvTh(x_dim=64, y_dim=64,
                                     with_r=True, with_boundary=True,
                                     in_channels=256, first_one=first_one,
                                     out_channels=256,
                                     kernel_size=1,
                                     stride=1, padding=0)
        self._generate_network(self.depth)

    def _generate_network(self, level):
        self.add_module('b1_' + str(level), ConvBlock(256, 256))

        self.add_module('b2_' + str(level), ConvBlock(256, 256))

        if level > 1:
            self._generate_network(level - 1)
        else:
            self.add_module('b2_plus_' + str(level), ConvBlock(256, 256))

        self.add_module('b3_' + str(level), ConvBlock(256, 256))

    def _forward(self, level, inp):
        # Upper branch
        up1 = inp
        up1 = self._modules['b1_' + str(level)](up1)

        # Lower branch
        low1 = F.avg_pool2d(inp, 2, stride=2)
        low1 = self._modules['b2_' + str(level)](low1)

        if level > 1:
            low2 = self._forward(level - 1, low1)
        else:
            low2 = low1
            low2 = self._modules['b2_plus_' + str(level)](low2)

        low3 = low2
        low3 = self._modules['b3_' + str(level)](low3)

        up2 = F.upsample(low3, scale_factor=2, mode='nearest')

        return up1 + up2

    def forward(self, x, heatmap):
        x, last_channel = self.coordconv(x, heatmap)
        return self._forward(self.depth, x), last_channel

class FAN(nn.Module):

    def __init__(self, num_modules=1, end_relu=False, gray_scale=False,
                 num_landmarks=68):
        super(FAN, self).__init__()
        self.num_modules = num_modules
        self.gray_scale = gray_scale
        self.end_relu = end_relu
        self.num_landmarks = num_landmarks

        # Base part
        if self.gray_scale:
            self.conv1 = CoordConvTh(x_dim=256, y_dim=256,
                                     with_r=True, with_boundary=False,
                                     in_channels=3, out_channels=64,
                                     kernel_size=7,
                                     stride=2, padding=3)
        else:
            self.conv1 = CoordConvTh(x_dim=256, y_dim=256,
                                     with_r=True, with_boundary=False,
                                     in_channels=3, out_channels=64,
                                     kernel_size=7,
                                     stride=2, padding=3)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv2 = ConvBlock(64, 128)
        self.conv3 = ConvBlock(128, 128)
        self.conv4 = ConvBlock(128, 256)

        # Stacking part
        for hg_module in range(self.num_modules):
            if hg_module == 0:
                first_one = True
            else:
                first_one = False
            self.add_module('m' + str(hg_module), HourGlass(1, 4, 256,
                                                            first_one))
            self.add_module('top_m_' + str(hg_module), ConvBlock(256, 256))
            self.add_module('conv_last' + str(hg_module),
                            nn.Conv2d(256, 256, kernel_size=1, stride=1, padding=0))
            self.add_module('bn_end' + str(hg_module), nn.BatchNorm2d(256))
            self.add_module('l' + str(hg_module), nn.Conv2d(256,
                            num_landmarks+1, kernel_size=1, stride=1, padding=0))

            if hg_module < self.num_modules - 1:
                self.add_module(
                    'bl' + str(hg_module), nn.Conv2d(256, 256, kernel_size=1, stride=1, padding=0))
                self.add_module('al' + str(hg_module), nn.Conv2d(num_landmarks+1,
                                256, kernel_size=1, stride=1, padding=0))

    def forward(self, x):
        x, _ = self.conv1(x)
        x = F.relu(self.bn1(x), True)
        # x = F.relu(self.bn1(self.conv1(x)), True)
        x = F.avg_pool2d(self.conv2(x), 2, stride=2)
        x = self.conv3(x)
        x = self.conv4(x)

        previous = x

        outputs = []
        boundary_channels = []
        tmp_out = None
        for i in range(self.num_modules):
            hg, boundary_channel = self._modules['m' + str(i)](previous,
                                                               tmp_out)

            ll = hg
            ll = self._modules['top_m_' + str(i)](ll)

            ll = F.relu(self._modules['bn_end' + str(i)]
                        (self._modules['conv_last' + str(i)](ll)), True)

            # Predict heatmaps
            tmp_out = self._modules['l' + str(i)](ll)
            if self.end_relu:
                tmp_out = F.relu(tmp_out)  # HACK: Added relu
            outputs.append(tmp_out)
            boundary_channels.append(boundary_channel)

            if i < self.num_modules - 1:
                ll = self._modules['bl' + str(i)](ll)
                tmp_out_ = self._modules['al' + str(i)](tmp_out)
                previous = previous + ll + tmp_out_

        return outputs, boundary_channels
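models.py above defines the stacked-hourglass FAN landmark detector; its forward pass returns one heatmap tensor per stacked module (num_landmarks + 1 channels at a quarter of the input resolution) plus the per-module boundary channels. A minimal sketch of running it on a dummy image, assuming the repository layout above so that the thirdparty.AdaptiveWingLoss imports resolve:

import torch
from thirdparty.AdaptiveWingLoss.core.models import FAN

# One hourglass module, 68 landmarks (the defaults used above).
model = FAN(num_modules=1, end_relu=False, num_landmarks=68).eval()

x = torch.randn(1, 3, 256, 256)  # dummy RGB face crop
with torch.no_grad():
    outputs, boundary_channels = model(x)

# 68 landmark heatmaps + 1 boundary channel, at 64x64 (input resolution / 4).
print(len(outputs), outputs[-1].shape)  # 1 torch.Size([1, 69, 64, 64])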
download.py
CHANGED
@@ -1,18 +1,18 @@
(All 18 lines are shown as removed and re-added with identical content; the file reads as follows.)

from huggingface_hub import hf_hub_download
from huggingface_hub import snapshot_download

#download files
def download_file(repo_name, filename, repo_type):

    file_location = hf_hub_download(repo_id=repo_name, filename=filename,repo_type=repo_type)
    return file_location

#download a folder
def download_folder(repo_name, revision='main'):

    folder_location = snapshot_download(repo_id=repo_name, revision=revision)

    return folder_location
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/.gitignore
ADDED
@@ -0,0 +1,8 @@
# Python generated files
*.pyc

# Project related files
ckpt/*.pth
dataset/*
!dataset/!.py
experiments/*
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/LICENSE
ADDED
@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/README.md
ADDED
@@ -0,0 +1,82 @@
# AdaptiveWingLoss
## [arXiv](https://arxiv.org/abs/1904.07399)
Pytorch Implementation of Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression.

<img src='images/wflw.png' width="1000px">

## Update Logs:
### October 28, 2019
* Pretrained Model and evaluation code on WFLW dataset is released.

## Installation
#### Note: Code was originally developed under Python2.X and Pytorch 0.4. This released version was revisioned from original code and was tested on Python3.5.7 and Pytorch 1.3.0.

Install system requirements:
```
sudo apt-get install python3-dev python3-pip python3-tk libglib2.0-0
```

Install python dependencies:
```
pip3 install -r requirements.txt
```

## Run Evaluation on WFLW dataset
1. Download and process WFLW dataset
    * Download WFLW dataset and annotation from [Here](https://wywu.github.io/projects/LAB/WFLW.html).
    * Unzip WFLW dataset and annotations and move files into ```./dataset``` directory. Your directory should look like this:
```
AdaptiveWingLoss
└───dataset
   │
   └───WFLW_annotations
   │       └───list_98pt_rect_attr_train_test
   │       │
   │       └───list_98pt_test
   │
   └───WFLW_images
           └───0--Parade
           │
           └───...
```
    * Inside ```./dataset``` directory, run:
```
python convert_WFLW.py
```
A new directory ```./dataset/WFLW_test``` should be generated with 2500 processed testing images and corresponding landmarks.

2. Download pretrained model from [Google Drive](https://drive.google.com/file/d/1HZaSjLoorQ4QCEx7PRTxOmg0bBPYSqhH/view?usp=sharing) and put it in ```./ckpt``` directory.

3. Within ```./Scripts``` directory, run following command:
```
sh eval_wflw.sh
```

<img src='images/wflw_table.png' width="800px">
*GTBbox indicates the ground truth landmarks are used as bounding box to crop faces.

## Future Plans
- [x] Release evaluation code and pretrained model on WFLW dataset.

- [ ] Release training code on WFLW dataset.

- [ ] Release pretrained model and code on 300W, AFLW and COFW dataset.

- [ ] Replease facial landmark detection API


## Citation
If you find this useful for your research, please cite the following paper.

```
@InProceedings{Wang_2019_ICCV,
author = {Wang, Xinyao and Bo, Liefeng and Fuxin, Li},
title = {Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}
```

## Acknowledgments
This repository borrows or partially modifies hourglass model and data processing code from [face alignment](https://github.com/1adrianb/face-alignment) and [pose-hg-train](https://github.com/princeton-vl/pose-hg-train).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__init__.py
ADDED
File without changes
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__pycache__/__init__.cpython-37.pyc
ADDED
Binary file (164 Bytes).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/__pycache__/__init__.cpython-39.pyc
ADDED
Binary file (179 Bytes).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/ckpt/.gitkeep
ADDED
File without changes
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__init__.py
ADDED
File without changes
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/__init__.cpython-37.pyc
ADDED
Binary file (169 Bytes).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/__init__.cpython-39.pyc
ADDED
Binary file (184 Bytes).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/coord_conv.cpython-37.pyc
ADDED
Binary file (4.33 kB).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/coord_conv.cpython-39.pyc
ADDED
Binary file (4.38 kB).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/models.cpython-37.pyc
ADDED
Binary file (5.77 kB).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/__pycache__/models.cpython-39.pyc
ADDED
Binary file (5.83 kB).
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/coord_conv.py
ADDED
@@ -0,0 +1,157 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
import torch
import torch.nn as nn


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class AddCoordsTh(nn.Module):
    def __init__(self, x_dim=64, y_dim=64, with_r=False, with_boundary=False):
        super(AddCoordsTh, self).__init__()
        self.x_dim = x_dim
        self.y_dim = y_dim
        self.with_r = with_r
        self.with_boundary = with_boundary

    def forward(self, input_tensor, heatmap=None):
        """
        input_tensor: (batch, c, x_dim, y_dim)
        """
        batch_size_tensor = input_tensor.shape[0]

        xx_ones = torch.ones([1, self.y_dim], dtype=torch.int32).to(device)
        xx_ones = xx_ones.unsqueeze(-1)

        xx_range = torch.arange(self.x_dim, dtype=torch.int32).unsqueeze(0).to(device)
        xx_range = xx_range.unsqueeze(1)

        xx_channel = torch.matmul(xx_ones.float(), xx_range.float())
        xx_channel = xx_channel.unsqueeze(-1)

        yy_ones = torch.ones([1, self.x_dim], dtype=torch.int32).to(device)
        yy_ones = yy_ones.unsqueeze(1)

        yy_range = torch.arange(self.y_dim, dtype=torch.int32).unsqueeze(0).to(device)
        yy_range = yy_range.unsqueeze(-1)

        yy_channel = torch.matmul(yy_range.float(), yy_ones.float())
        yy_channel = yy_channel.unsqueeze(-1)

        xx_channel = xx_channel.permute(0, 3, 2, 1)
        yy_channel = yy_channel.permute(0, 3, 2, 1)

        xx_channel = xx_channel / (self.x_dim - 1)
        yy_channel = yy_channel / (self.y_dim - 1)

        xx_channel = xx_channel * 2 - 1
        yy_channel = yy_channel * 2 - 1

        xx_channel = xx_channel.repeat(batch_size_tensor, 1, 1, 1)
        yy_channel = yy_channel.repeat(batch_size_tensor, 1, 1, 1)

        if self.with_boundary and heatmap is not None:
            boundary_channel = torch.clamp(heatmap[:, -1:, :, :],
                                           0.0, 1.0)

            zero_tensor = torch.zeros_like(xx_channel)
            xx_boundary_channel = torch.where(boundary_channel > 0.05,
                                              xx_channel, zero_tensor)
            yy_boundary_channel = torch.where(boundary_channel > 0.05,
                                              yy_channel, zero_tensor)
        if self.with_boundary and heatmap is not None:
            xx_boundary_channel = xx_boundary_channel.to(device)
            yy_boundary_channel = yy_boundary_channel.to(device)

        ret = torch.cat([input_tensor, xx_channel, yy_channel], dim=1)

        if self.with_r:
            rr = torch.sqrt(torch.pow(xx_channel, 2) + torch.pow(yy_channel, 2))
            rr = rr / torch.max(rr)
            ret = torch.cat([ret, rr], dim=1)

        if self.with_boundary and heatmap is not None:
            ret = torch.cat([ret, xx_boundary_channel,
                             yy_boundary_channel], dim=1)
        return ret


class CoordConvTh(nn.Module):
    """CoordConv layer as in the paper."""
    def __init__(self, x_dim, y_dim, with_r, with_boundary,
                 in_channels, first_one=False, *args, **kwargs):
        super(CoordConvTh, self).__init__()
        self.addcoords = AddCoordsTh(x_dim=x_dim, y_dim=y_dim, with_r=with_r,
                                     with_boundary=with_boundary)
        in_channels += 2
        if with_r:
            in_channels += 1
        if with_boundary and not first_one:
            in_channels += 2
        self.conv = nn.Conv2d(in_channels=in_channels, *args, **kwargs)

    def forward(self, input_tensor, heatmap=None):
        ret = self.addcoords(input_tensor, heatmap)
        last_channel = ret[:, -2:, :, :]
        ret = self.conv(ret)
        return ret, last_channel


'''
An alternative implementation for PyTorch with auto-inferring the x-y dimensions.
'''
class AddCoords(nn.Module):

    def __init__(self, with_r=False):
        super().__init__()
        self.with_r = with_r

    def forward(self, input_tensor):
        """
        Args:
            input_tensor: shape(batch, channel, x_dim, y_dim)
        """
        batch_size, _, x_dim, y_dim = input_tensor.size()

        xx_channel = torch.arange(x_dim).repeat(1, y_dim, 1)
        yy_channel = torch.arange(y_dim).repeat(1, x_dim, 1).transpose(1, 2)

        xx_channel = xx_channel / (x_dim - 1)
        yy_channel = yy_channel / (y_dim - 1)

        xx_channel = xx_channel * 2 - 1
        yy_channel = yy_channel * 2 - 1

        xx_channel = xx_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)
        yy_channel = yy_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)

        if input_tensor.is_cuda:
            xx_channel = xx_channel.to(device)
            yy_channel = yy_channel.to(device)

        ret = torch.cat([
            input_tensor,
            xx_channel.type_as(input_tensor),
            yy_channel.type_as(input_tensor)], dim=1)

        if self.with_r:
            rr = torch.sqrt(torch.pow(xx_channel - 0.5, 2) + torch.pow(yy_channel - 0.5, 2))
            if input_tensor.is_cuda:
                rr = rr.to(device)
            ret = torch.cat([ret, rr], dim=1)

        return ret


class CoordConv(nn.Module):

    def __init__(self, in_channels, out_channels, with_r=False, **kwargs):
        super().__init__()
        self.addcoords = AddCoords(with_r=with_r)
        self.conv = nn.Conv2d(in_channels + 2, out_channels, **kwargs)

    def forward(self, x):
        ret = self.addcoords(x)
        ret = self.conv(ret)
        return ret
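
As a quick shape sanity check for the layers above, here is a minimal sketch. It is my own illustration, not part of the repository: it assumes PyTorch is installed and that the script is run from the `AdaptiveWingLoss/` directory so that `core.coord_conv` imports, and the tensor sizes are arbitrary.

```python
import torch

from core.coord_conv import AddCoords, CoordConv  # assumes AdaptiveWingLoss/ is the working directory

# Illustrative input: batch of 2 three-channel feature maps at 64x64.
x = torch.randn(2, 3, 64, 64)

# AddCoords appends normalized x/y coordinate channels, plus a radius channel when with_r=True.
with_coords = AddCoords(with_r=True)(x)
print(with_coords.shape)  # torch.Size([2, 6, 64, 64]) -> 3 input + x + y + r

# CoordConv = AddCoords followed by a regular Conv2d over the augmented channels.
layer = CoordConv(in_channels=3, out_channels=8, with_r=False, kernel_size=3, padding=1)
print(layer(x).shape)     # torch.Size([2, 8, 64, 64])
```

Note that `CoordConv` hard-codes `in_channels + 2` for its convolution, so passing `with_r=True` to that wrapper would mismatch the channel count; the radius channel is only usable through `AddCoords` / `AddCoordsTh` directly, or through `CoordConvTh`, which accounts for it.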
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/dataloader.py
ADDED
@@ -0,0 +1,368 @@
1 |
+
import sys
|
2 |
+
import os
|
3 |
+
import random
|
4 |
+
import glob
|
5 |
+
import torch
|
6 |
+
from skimage import io
|
7 |
+
from skimage import transform as ski_transform
|
8 |
+
from skimage.color import rgb2gray
|
9 |
+
import scipy.io as sio
|
10 |
+
from scipy import interpolate
|
11 |
+
import numpy as np
|
12 |
+
import matplotlib.pyplot as plt
|
13 |
+
from torch.utils.data import Dataset, DataLoader
|
14 |
+
from torchvision import transforms, utils
|
15 |
+
from torchvision.transforms import Lambda, Compose
|
16 |
+
from torchvision.transforms.functional import adjust_brightness, adjust_contrast, adjust_saturation, adjust_hue
|
17 |
+
from utils.utils import cv_crop, cv_rotate, draw_gaussian, transform, power_transform, shuffle_lr, fig2data, generate_weight_map
|
18 |
+
from PIL import Image
|
19 |
+
import cv2
|
20 |
+
import copy
|
21 |
+
import math
|
22 |
+
from imgaug import augmenters as iaa
|
23 |
+
|
24 |
+
|
25 |
+
class AddBoundary(object):
|
26 |
+
def __init__(self, num_landmarks=68):
|
27 |
+
self.num_landmarks = num_landmarks
|
28 |
+
|
29 |
+
def __call__(self, sample):
|
30 |
+
landmarks_64 = np.floor(sample['landmarks'] / 4.0)
|
31 |
+
if self.num_landmarks == 68:
|
32 |
+
boundaries = {}
|
33 |
+
boundaries['cheek'] = landmarks_64[0:17]
|
34 |
+
boundaries['left_eyebrow'] = landmarks_64[17:22]
|
35 |
+
boundaries['right_eyebrow'] = landmarks_64[22:27]
|
36 |
+
boundaries['uper_left_eyelid'] = landmarks_64[36:40]
|
37 |
+
boundaries['lower_left_eyelid'] = np.array([landmarks_64[i] for i in [36, 41, 40, 39]])
|
38 |
+
boundaries['upper_right_eyelid'] = landmarks_64[42:46]
|
39 |
+
boundaries['lower_right_eyelid'] = np.array([landmarks_64[i] for i in [42, 47, 46, 45]])
|
40 |
+
boundaries['noise'] = landmarks_64[27:31]
|
41 |
+
boundaries['noise_bot'] = landmarks_64[31:36]
|
42 |
+
boundaries['upper_outer_lip'] = landmarks_64[48:55]
|
43 |
+
boundaries['upper_inner_lip'] = np.array([landmarks_64[i] for i in [60, 61, 62, 63, 64]])
|
44 |
+
boundaries['lower_outer_lip'] = np.array([landmarks_64[i] for i in [48, 59, 58, 57, 56, 55, 54]])
|
45 |
+
boundaries['lower_inner_lip'] = np.array([landmarks_64[i] for i in [60, 67, 66, 65, 64]])
|
46 |
+
elif self.num_landmarks == 98:
|
47 |
+
boundaries = {}
|
48 |
+
boundaries['cheek'] = landmarks_64[0:33]
|
49 |
+
boundaries['left_eyebrow'] = landmarks_64[33:38]
|
50 |
+
boundaries['right_eyebrow'] = landmarks_64[42:47]
|
51 |
+
boundaries['uper_left_eyelid'] = landmarks_64[60:65]
|
52 |
+
boundaries['lower_left_eyelid'] = np.array([landmarks_64[i] for i in [60, 67, 66, 65, 64]])
|
53 |
+
boundaries['upper_right_eyelid'] = landmarks_64[68:73]
|
54 |
+
boundaries['lower_right_eyelid'] = np.array([landmarks_64[i] for i in [68, 75, 74, 73, 72]])
|
55 |
+
boundaries['noise'] = landmarks_64[51:55]
|
56 |
+
boundaries['noise_bot'] = landmarks_64[55:60]
|
57 |
+
boundaries['upper_outer_lip'] = landmarks_64[76:83]
|
58 |
+
boundaries['upper_inner_lip'] = np.array([landmarks_64[i] for i in [88, 89, 90, 91, 92]])
|
59 |
+
boundaries['lower_outer_lip'] = np.array([landmarks_64[i] for i in [76, 87, 86, 85, 84, 83, 82]])
|
60 |
+
boundaries['lower_inner_lip'] = np.array([landmarks_64[i] for i in [88, 95, 94, 93, 92]])
|
61 |
+
elif self.num_landmarks == 19:
|
62 |
+
boundaries = {}
|
63 |
+
boundaries['left_eyebrow'] = landmarks_64[0:3]
|
64 |
+
boundaries['right_eyebrow'] = landmarks_64[3:5]
|
65 |
+
boundaries['left_eye'] = landmarks_64[6:9]
|
66 |
+
boundaries['right_eye'] = landmarks_64[9:12]
|
67 |
+
boundaries['noise'] = landmarks_64[12:15]
|
68 |
+
|
69 |
+
elif self.num_landmarks == 29:
|
70 |
+
boundaries = {}
|
71 |
+
boundaries['upper_left_eyebrow'] = np.stack([
|
72 |
+
landmarks_64[0],
|
73 |
+
landmarks_64[4],
|
74 |
+
landmarks_64[2]
|
75 |
+
], axis=0)
|
76 |
+
boundaries['lower_left_eyebrow'] = np.stack([
|
77 |
+
landmarks_64[0],
|
78 |
+
landmarks_64[5],
|
79 |
+
landmarks_64[2]
|
80 |
+
], axis=0)
|
81 |
+
boundaries['upper_right_eyebrow'] = np.stack([
|
82 |
+
landmarks_64[1],
|
83 |
+
landmarks_64[6],
|
84 |
+
landmarks_64[3]
|
85 |
+
], axis=0)
|
86 |
+
boundaries['lower_right_eyebrow'] = np.stack([
|
87 |
+
landmarks_64[1],
|
88 |
+
landmarks_64[7],
|
89 |
+
landmarks_64[3]
|
90 |
+
], axis=0)
|
91 |
+
boundaries['upper_left_eye'] = np.stack([
|
92 |
+
landmarks_64[8],
|
93 |
+
landmarks_64[12],
|
94 |
+
landmarks_64[10]
|
95 |
+
], axis=0)
|
96 |
+
boundaries['lower_left_eye'] = np.stack([
|
97 |
+
landmarks_64[8],
|
98 |
+
landmarks_64[13],
|
99 |
+
landmarks_64[10]
|
100 |
+
], axis=0)
|
101 |
+
boundaries['upper_right_eye'] = np.stack([
|
102 |
+
landmarks_64[9],
|
103 |
+
landmarks_64[14],
|
104 |
+
landmarks_64[11]
|
105 |
+
], axis=0)
|
106 |
+
boundaries['lower_right_eye'] = np.stack([
|
107 |
+
landmarks_64[9],
|
108 |
+
landmarks_64[15],
|
109 |
+
landmarks_64[11]
|
110 |
+
], axis=0)
|
111 |
+
boundaries['noise'] = np.stack([
|
112 |
+
landmarks_64[18],
|
113 |
+
landmarks_64[21],
|
114 |
+
landmarks_64[19]
|
115 |
+
], axis=0)
|
116 |
+
boundaries['outer_upper_lip'] = np.stack([
|
117 |
+
landmarks_64[22],
|
118 |
+
landmarks_64[24],
|
119 |
+
landmarks_64[23]
|
120 |
+
], axis=0)
|
121 |
+
boundaries['inner_upper_lip'] = np.stack([
|
122 |
+
landmarks_64[22],
|
123 |
+
landmarks_64[25],
|
124 |
+
landmarks_64[23]
|
125 |
+
], axis=0)
|
126 |
+
boundaries['outer_lower_lip'] = np.stack([
|
127 |
+
landmarks_64[22],
|
128 |
+
landmarks_64[26],
|
129 |
+
landmarks_64[23]
|
130 |
+
], axis=0)
|
131 |
+
boundaries['inner_lower_lip'] = np.stack([
|
132 |
+
landmarks_64[22],
|
133 |
+
landmarks_64[27],
|
134 |
+
landmarks_64[23]
|
135 |
+
], axis=0)
|
136 |
+
functions = {}
|
137 |
+
|
138 |
+
for key, points in boundaries.items():
|
139 |
+
temp = points[0]
|
140 |
+
new_points = points[0:1, :]
|
141 |
+
for point in points[1:]:
|
142 |
+
if point[0] == temp[0] and point[1] == temp[1]:
|
143 |
+
continue
|
144 |
+
else:
|
145 |
+
new_points = np.concatenate((new_points, np.expand_dims(point, 0)), axis=0)
|
146 |
+
temp = point
|
147 |
+
points = new_points
|
148 |
+
if points.shape[0] == 1:
|
149 |
+
points = np.concatenate((points, points+0.001), axis=0)
|
150 |
+
k = min(4, points.shape[0])
|
151 |
+
functions[key] = interpolate.splprep([points[:, 0], points[:, 1]], k=k-1,s=0)
|
152 |
+
|
153 |
+
boundary_map = np.zeros((64, 64))
|
154 |
+
|
155 |
+
fig = plt.figure(figsize=[64/96.0, 64/96.0], dpi=96)
|
156 |
+
|
157 |
+
ax = fig.add_axes([0, 0, 1, 1])
|
158 |
+
|
159 |
+
ax.axis('off')
|
160 |
+
|
161 |
+
ax.imshow(boundary_map, interpolation='nearest', cmap='gray')
|
162 |
+
#ax.scatter(landmarks[:, 0], landmarks[:, 1], s=1, marker=',', c='w')
|
163 |
+
|
164 |
+
for key in functions.keys():
|
165 |
+
xnew = np.arange(0, 1, 0.01)
|
166 |
+
out = interpolate.splev(xnew, functions[key][0], der=0)
|
167 |
+
plt.plot(out[0], out[1], ',', linewidth=1, color='w')
|
168 |
+
|
169 |
+
img = fig2data(fig)
|
170 |
+
|
171 |
+
plt.close()
|
172 |
+
|
173 |
+
sigma = 1
|
174 |
+
temp = 255-img[:,:,1]
|
175 |
+
temp = cv2.distanceTransform(temp, cv2.DIST_L2, cv2.DIST_MASK_PRECISE)
|
176 |
+
temp = temp.astype(np.float32)
|
177 |
+
temp = np.where(temp < 3*sigma, np.exp(-(temp*temp)/(2*sigma*sigma)), 0 )
|
178 |
+
|
179 |
+
fig = plt.figure(figsize=[64/96.0, 64/96.0], dpi=96)
|
180 |
+
|
181 |
+
ax = fig.add_axes([0, 0, 1, 1])
|
182 |
+
|
183 |
+
ax.axis('off')
|
184 |
+
ax.imshow(temp, cmap='gray')
|
185 |
+
plt.close()
|
186 |
+
|
187 |
+
boundary_map = fig2data(fig)
|
188 |
+
|
189 |
+
sample['boundary'] = boundary_map[:, :, 0]
|
190 |
+
|
191 |
+
return sample
|
192 |
+
|
193 |
+
class AddWeightMap(object):
|
194 |
+
def __call__(self, sample):
|
195 |
+
heatmap= sample['heatmap']
|
196 |
+
boundary = sample['boundary']
|
197 |
+
heatmap = np.concatenate((heatmap, np.expand_dims(boundary, axis=0)), 0)
|
198 |
+
weight_map = np.zeros_like(heatmap)
|
199 |
+
for i in range(heatmap.shape[0]):
|
200 |
+
weight_map[i] = generate_weight_map(weight_map[i],
|
201 |
+
heatmap[i])
|
202 |
+
sample['weight_map'] = weight_map
|
203 |
+
return sample
|
204 |
+
|
205 |
+
class ToTensor(object):
|
206 |
+
"""Convert ndarrays in sample to Tensors."""
|
207 |
+
|
208 |
+
def __call__(self, sample):
|
209 |
+
image, heatmap, landmarks, boundary, weight_map= sample['image'], sample['heatmap'], sample['landmarks'], sample['boundary'], sample['weight_map']
|
210 |
+
|
211 |
+
# swap color axis because
|
212 |
+
# numpy image: H x W x C
|
213 |
+
# torch image: C X H X W
|
214 |
+
if len(image.shape) == 2:
|
215 |
+
image = np.expand_dims(image, axis=2)
|
216 |
+
image_small = np.expand_dims(image_small, axis=2)
|
217 |
+
image = image.transpose((2, 0, 1))
|
218 |
+
boundary = np.expand_dims(boundary, axis=2)
|
219 |
+
boundary = boundary.transpose((2, 0, 1))
|
220 |
+
return {'image': torch.from_numpy(image).float().div(255.0),
|
221 |
+
'heatmap': torch.from_numpy(heatmap).float(),
|
222 |
+
'landmarks': torch.from_numpy(landmarks).float(),
|
223 |
+
'boundary': torch.from_numpy(boundary).float().div(255.0),
|
224 |
+
'weight_map': torch.from_numpy(weight_map).float()}
|
225 |
+
|
226 |
+
class FaceLandmarksDataset(Dataset):
|
227 |
+
"""Face Landmarks dataset."""
|
228 |
+
|
229 |
+
def __init__(self, img_dir, landmarks_dir, num_landmarks=68, gray_scale=False,
|
230 |
+
detect_face=False, enhance=False, center_shift=0,
|
231 |
+
transform=None,):
|
232 |
+
"""
|
233 |
+
Args:
|
234 |
+
landmark_dir (string): Path to the mat file with landmarks saved.
|
235 |
+
img_dir (string): Directory with all the images.
|
236 |
+
transform (callable, optional): Optional transform to be applied
|
237 |
+
on a sample.
|
238 |
+
"""
|
239 |
+
self.img_dir = img_dir
|
240 |
+
self.landmarks_dir = landmarks_dir
|
241 |
+
self.num_lanmdkars = num_landmarks
|
242 |
+
self.transform = transform
|
243 |
+
self.img_names = glob.glob(self.img_dir+'*.jpg') + \
|
244 |
+
glob.glob(self.img_dir+'*.png')
|
245 |
+
self.gray_scale = gray_scale
|
246 |
+
self.detect_face = detect_face
|
247 |
+
self.enhance = enhance
|
248 |
+
self.center_shift = center_shift
|
249 |
+
if self.detect_face:
|
250 |
+
self.face_detector = MTCNN(thresh=[0.5, 0.6, 0.7])
|
251 |
+
def __len__(self):
|
252 |
+
return len(self.img_names)
|
253 |
+
|
254 |
+
def __getitem__(self, idx):
|
255 |
+
img_name = self.img_names[idx]
|
256 |
+
pil_image = Image.open(img_name)
|
257 |
+
if pil_image.mode != "RGB":
|
258 |
+
# if input is grayscale image, convert it to 3 channel image
|
259 |
+
if self.enhance:
|
260 |
+
pil_image = power_transform(pil_image, 0.5)
|
261 |
+
temp_image = Image.new('RGB', pil_image.size)
|
262 |
+
temp_image.paste(pil_image)
|
263 |
+
pil_image = temp_image
|
264 |
+
image = np.array(pil_image)
|
265 |
+
if self.gray_scale:
|
266 |
+
image = rgb2gray(image)
|
267 |
+
image = np.expand_dims(image, axis=2)
|
268 |
+
image = np.concatenate((image, image, image), axis=2)
|
269 |
+
image = image * 255.0
|
270 |
+
image = image.astype(np.uint8)
|
271 |
+
if not self.detect_face:
|
272 |
+
center = [450//2, 450//2+0]
|
273 |
+
if self.center_shift != 0:
|
274 |
+
center[0] += int(np.random.uniform(-self.center_shift,
|
275 |
+
self.center_shift))
|
276 |
+
center[1] += int(np.random.uniform(-self.center_shift,
|
277 |
+
self.center_shift))
|
278 |
+
scale = 1.8
|
279 |
+
else:
|
280 |
+
detected_faces = self.face_detector.detect_image(image)
|
281 |
+
if len(detected_faces) > 0:
|
282 |
+
box = detected_faces[0]
|
283 |
+
left, top, right, bottom, _ = box
|
284 |
+
center = [right - (right - left) / 2.0,
|
285 |
+
bottom - (bottom - top) / 2.0]
|
286 |
+
center[1] = center[1] - (bottom - top) * 0.12
|
287 |
+
scale = (right - left + bottom - top) / 195.0
|
288 |
+
else:
|
289 |
+
center = [450//2, 450//2+0]
|
290 |
+
scale = 1.8
|
291 |
+
if self.center_shift != 0:
|
292 |
+
shift = self.center * self.center_shift / 450
|
293 |
+
center[0] += int(np.random.uniform(-shift, shift))
|
294 |
+
center[1] += int(np.random.uniform(-shift, shift))
|
295 |
+
base_name = os.path.basename(img_name)
|
296 |
+
landmarks_base_name = base_name[:-4] + '_pts.mat'
|
297 |
+
landmarks_name = os.path.join(self.landmarks_dir, landmarks_base_name)
|
298 |
+
if os.path.isfile(landmarks_name):
|
299 |
+
mat_data = sio.loadmat(landmarks_name)
|
300 |
+
landmarks = mat_data['pts_2d']
|
301 |
+
elif os.path.isfile(landmarks_name[:-8] + '.pts.npy'):
|
302 |
+
landmarks = np.load(landmarks_name[:-8] + '.pts.npy')
|
303 |
+
else:
|
304 |
+
landmarks = []
|
305 |
+
heatmap = []
|
306 |
+
|
307 |
+
if landmarks != []:
|
308 |
+
new_image, new_landmarks = cv_crop(image, landmarks, center,
|
309 |
+
scale, 256, self.center_shift)
|
310 |
+
tries = 0
|
311 |
+
while self.center_shift != 0 and tries < 5 and (np.max(new_landmarks) > 240 or np.min(new_landmarks) < 15):
|
312 |
+
center = [450//2, 450//2+0]
|
313 |
+
scale += 0.05
|
314 |
+
center[0] += int(np.random.uniform(-self.center_shift,
|
315 |
+
self.center_shift))
|
316 |
+
center[1] += int(np.random.uniform(-self.center_shift,
|
317 |
+
self.center_shift))
|
318 |
+
|
319 |
+
new_image, new_landmarks = cv_crop(image, landmarks,
|
320 |
+
center, scale, 256,
|
321 |
+
self.center_shift)
|
322 |
+
tries += 1
|
323 |
+
if np.max(new_landmarks) > 250 or np.min(new_landmarks) < 5:
|
324 |
+
center = [450//2, 450//2+0]
|
325 |
+
scale = 2.25
|
326 |
+
new_image, new_landmarks = cv_crop(image, landmarks,
|
327 |
+
center, scale, 256,
|
328 |
+
100)
|
329 |
+
assert (np.min(new_landmarks) > 0 and np.max(new_landmarks) < 256), \
|
330 |
+
"Landmarks out of boundary!"
|
331 |
+
image = new_image
|
332 |
+
landmarks = new_landmarks
|
333 |
+
heatmap = np.zeros((self.num_lanmdkars, 64, 64))
|
334 |
+
for i in range(self.num_lanmdkars):
|
335 |
+
if landmarks[i][0] > 0:
|
336 |
+
heatmap[i] = draw_gaussian(heatmap[i], landmarks[i]/4.0+1, 1)
|
337 |
+
sample = {'image': image, 'heatmap': heatmap, 'landmarks': landmarks}
|
338 |
+
if self.transform:
|
339 |
+
sample = self.transform(sample)
|
340 |
+
|
341 |
+
return sample
|
342 |
+
|
343 |
+
def get_dataset(val_img_dir, val_landmarks_dir, batch_size,
|
344 |
+
num_landmarks=68, rotation=0, scale=0,
|
345 |
+
center_shift=0, random_flip=False,
|
346 |
+
brightness=0, contrast=0, saturation=0,
|
347 |
+
blur=False, noise=False, jpeg_effect=False,
|
348 |
+
random_occlusion=False, gray_scale=False,
|
349 |
+
detect_face=False, enhance=False):
|
350 |
+
val_transforms = transforms.Compose([AddBoundary(num_landmarks),
|
351 |
+
AddWeightMap(),
|
352 |
+
ToTensor()])
|
353 |
+
|
354 |
+
val_dataset = FaceLandmarksDataset(val_img_dir, val_landmarks_dir,
|
355 |
+
num_landmarks=num_landmarks,
|
356 |
+
gray_scale=gray_scale,
|
357 |
+
detect_face=detect_face,
|
358 |
+
enhance=enhance,
|
359 |
+
transform=val_transforms)
|
360 |
+
|
361 |
+
val_dataloader = torch.utils.data.DataLoader(val_dataset,
|
362 |
+
batch_size=batch_size,
|
363 |
+
shuffle=False,
|
364 |
+
num_workers=6)
|
365 |
+
data_loaders = {'val': val_dataloader}
|
366 |
+
dataset_sizes = {}
|
367 |
+
dataset_sizes['val'] = len(val_dataset)
|
368 |
+
return data_loaders, dataset_sizes
|
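
For reference, here is a minimal sketch of how the `get_dataset` helper above might be driven. This is my own example: the directory paths are placeholders for a WFLW-style layout of image crops plus `*_pts.mat` landmark files, and it assumes the working directory is `AdaptiveWingLoss/`.

```python
from core.dataloader import get_dataset

# Placeholder paths: a folder of .jpg/.png crops and the matching *_pts.mat landmark files.
data_loaders, dataset_sizes = get_dataset(
    val_img_dir='dataset/WFLW_test/images/',
    val_landmarks_dir='dataset/WFLW_test/landmarks/',
    batch_size=8,
    num_landmarks=98,
)

print(dataset_sizes['val'])
for sample in data_loaders['val']:
    # Each batch carries the image tensor plus per-landmark heatmaps,
    # the boundary map, and the loss weight map built by the transforms above.
    print(sample['image'].shape, sample['heatmap'].shape, sample['weight_map'].shape)
    break
```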
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/evaler.py
ADDED
@@ -0,0 +1,151 @@
1 |
+
import matplotlib
|
2 |
+
matplotlib.use('Agg')
|
3 |
+
import math
|
4 |
+
import torch
|
5 |
+
import copy
|
6 |
+
import time
|
7 |
+
from torch.autograd import Variable
|
8 |
+
import shutil
|
9 |
+
from skimage import io
|
10 |
+
import numpy as np
|
11 |
+
from utils.utils import fan_NME, show_landmarks, get_preds_fromhm
|
12 |
+
from PIL import Image, ImageDraw
|
13 |
+
import os
|
14 |
+
import sys
|
15 |
+
import cv2
|
16 |
+
import matplotlib.pyplot as plt
|
17 |
+
|
18 |
+
|
19 |
+
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
|
20 |
+
|
21 |
+
def eval_model(model, dataloaders, dataset_sizes,
|
22 |
+
writer, use_gpu=True, epoches=5, dataset='val',
|
23 |
+
save_path='./', num_landmarks=68):
|
24 |
+
global_nme = 0
|
25 |
+
model.eval()
|
26 |
+
for epoch in range(epoches):
|
27 |
+
running_loss = 0
|
28 |
+
step = 0
|
29 |
+
total_nme = 0
|
30 |
+
total_count = 0
|
31 |
+
fail_count = 0
|
32 |
+
nmes = []
|
33 |
+
# running_corrects = 0
|
34 |
+
|
35 |
+
# Iterate over data.
|
36 |
+
with torch.no_grad():
|
37 |
+
for data in dataloaders[dataset]:
|
38 |
+
total_runtime = 0
|
39 |
+
run_count = 0
|
40 |
+
step_start = time.time()
|
41 |
+
step += 1
|
42 |
+
# get the inputs
|
43 |
+
inputs = data['image'].type(torch.FloatTensor)
|
44 |
+
labels_heatmap = data['heatmap'].type(torch.FloatTensor)
|
45 |
+
labels_boundary = data['boundary'].type(torch.FloatTensor)
|
46 |
+
landmarks = data['landmarks'].type(torch.FloatTensor)
|
47 |
+
loss_weight_map = data['weight_map'].type(torch.FloatTensor)
|
48 |
+
# wrap them in Variable
|
49 |
+
if use_gpu:
|
50 |
+
inputs = inputs.to(device)
|
51 |
+
labels_heatmap = labels_heatmap.to(device)
|
52 |
+
labels_boundary = labels_boundary.to(device)
|
53 |
+
loss_weight_map = loss_weight_map.to(device)
|
54 |
+
else:
|
55 |
+
inputs, labels_heatmap = Variable(inputs), Variable(labels_heatmap)
|
56 |
+
labels_boundary = Variable(labels_boundary)
|
57 |
+
labels = torch.cat((labels_heatmap, labels_boundary), 1)
|
58 |
+
single_start = time.time()
|
59 |
+
outputs, boundary_channels = model(inputs)
|
60 |
+
single_end = time.time()
|
61 |
+
total_runtime += time.time() - single_start
|
62 |
+
run_count += 1
|
63 |
+
step_end = time.time()
|
64 |
+
for i in range(inputs.shape[0]):
|
65 |
+
print(inputs.shape)
|
66 |
+
img = inputs[i]
|
67 |
+
img = img.cpu().numpy()
|
68 |
+
img = img.transpose((1, 2, 0)) #*255.0
|
69 |
+
# img = img.astype(np.uint8)
|
70 |
+
# img = Image.fromarray(img)
|
71 |
+
# pred_heatmap = outputs[-1][i].detach().cpu()[:-1, :, :]
|
72 |
+
pred_heatmap = outputs[-1][:, :-1, :, :][i].detach().cpu()
|
73 |
+
pred_landmarks, _ = get_preds_fromhm(pred_heatmap.unsqueeze(0))
|
74 |
+
pred_landmarks = pred_landmarks.squeeze().numpy()
|
75 |
+
|
76 |
+
gt_landmarks = data['landmarks'][i].numpy()
|
77 |
+
print(pred_landmarks, gt_landmarks)
|
78 |
+
import cv2
|
79 |
+
while(True):
|
80 |
+
imgshow = vis_landmark_on_img(cv2.UMat(img), pred_landmarks*4)
|
81 |
+
cv2.imshow('img', imgshow)
|
82 |
+
|
83 |
+
if(cv2.waitKey(10) == ord('q')):
|
84 |
+
break
|
85 |
+
|
86 |
+
|
87 |
+
if num_landmarks == 68:
|
88 |
+
left_eye = np.average(gt_landmarks[36:42], axis=0)
|
89 |
+
right_eye = np.average(gt_landmarks[42:48], axis=0)
|
90 |
+
norm_factor = np.linalg.norm(left_eye - right_eye)
|
91 |
+
# norm_factor = np.linalg.norm(gt_landmarks[36]- gt_landmarks[45])
|
92 |
+
|
93 |
+
elif num_landmarks == 98:
|
94 |
+
norm_factor = np.linalg.norm(gt_landmarks[60]- gt_landmarks[72])
|
95 |
+
elif num_landmarks == 19:
|
96 |
+
left, top = gt_landmarks[-2, :]
|
97 |
+
right, bottom = gt_landmarks[-1, :]
|
98 |
+
norm_factor = math.sqrt(abs(right - left)*abs(top-bottom))
|
99 |
+
gt_landmarks = gt_landmarks[:-2, :]
|
100 |
+
elif num_landmarks == 29:
|
101 |
+
# norm_factor = np.linalg.norm(gt_landmarks[8]- gt_landmarks[9])
|
102 |
+
norm_factor = np.linalg.norm(gt_landmarks[16]- gt_landmarks[17])
|
103 |
+
single_nme = (np.sum(np.linalg.norm(pred_landmarks*4 - gt_landmarks, axis=1)) / pred_landmarks.shape[0]) / norm_factor
|
104 |
+
|
105 |
+
nmes.append(single_nme)
|
106 |
+
total_count += 1
|
107 |
+
if single_nme > 0.1:
|
108 |
+
fail_count += 1
|
109 |
+
if step % 10 == 0:
|
110 |
+
print('Step {} Time: {:.6f} Input Mean: {:.6f} Output Mean: {:.6f}'.format(
|
111 |
+
step, step_end - step_start,
|
112 |
+
torch.mean(labels),
|
113 |
+
torch.mean(outputs[0])))
|
114 |
+
# gt_landmarks = landmarks.numpy()
|
115 |
+
# pred_heatmap = outputs[-1].to('cpu').numpy()
|
116 |
+
gt_landmarks = landmarks
|
117 |
+
batch_nme = fan_NME(outputs[-1][:, :-1, :, :].detach().cpu(), gt_landmarks, num_landmarks)
|
118 |
+
# batch_nme = 0
|
119 |
+
total_nme += batch_nme
|
120 |
+
epoch_nme = total_nme / dataset_sizes['val']
|
121 |
+
global_nme += epoch_nme
|
122 |
+
nme_save_path = os.path.join(save_path, 'nme_log.npy')
|
123 |
+
np.save(nme_save_path, np.array(nmes))
|
124 |
+
print('NME: {:.6f} Failure Rate: {:.6f} Total Count: {:.6f} Fail Count: {:.6f}'.format(epoch_nme, fail_count/total_count, total_count, fail_count))
|
125 |
+
print('Evaluation done! Average NME: {:.6f}'.format(global_nme/epoches))
|
126 |
+
print('Everage runtime for a single batch: {:.6f}'.format(total_runtime/run_count))
|
127 |
+
return model
|
128 |
+
|
129 |
+
|
130 |
+
def vis_landmark_on_img(img, shape, linewidth=2):
|
131 |
+
'''
|
132 |
+
Visualize landmark on images.
|
133 |
+
'''
|
134 |
+
|
135 |
+
def draw_curve(idx_list, color=(0, 255, 0), loop=False, lineWidth=linewidth):
|
136 |
+
for i in idx_list:
|
137 |
+
cv2.line(img, (shape[i, 0], shape[i, 1]), (shape[i + 1, 0], shape[i + 1, 1]), color, lineWidth)
|
138 |
+
if (loop):
|
139 |
+
cv2.line(img, (shape[idx_list[0], 0], shape[idx_list[0], 1]),
|
140 |
+
(shape[idx_list[-1] + 1, 0], shape[idx_list[-1] + 1, 1]), color, lineWidth)
|
141 |
+
|
142 |
+
draw_curve(list(range(0, 32))) # jaw
|
143 |
+
draw_curve(list(range(33, 41)), color=(0, 0, 255), loop=True) # eye brow
|
144 |
+
draw_curve(list(range(42, 50)), color=(0, 0, 255), loop=True)
|
145 |
+
draw_curve(list(range(51, 59))) # nose
|
146 |
+
draw_curve(list(range(60, 67)), loop=True) # eyes
|
147 |
+
draw_curve(list(range(68, 75)), loop=True)
|
148 |
+
draw_curve(list(range(76, 87)), loop=True, color=(0, 255, 255)) # mouth
|
149 |
+
draw_curve(list(range(88, 95)), loop=True, color=(255, 255, 0))
|
150 |
+
|
151 |
+
return img
|
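
`eval_model` above is normally invoked through `eval.py` (added further down); the following is my own rough sketch of the same wiring with placeholder paths. Note that, as written, the per-image `cv2.imshow` debug loop inside `eval_model` needs a display and a key press per sample, so it would typically be commented out before running a batch evaluation.

```python
import torch
from tensorboardX import SummaryWriter
from core.models import FAN
from core.dataloader import get_dataset
from core.evaler import eval_model

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# 4 stacked hourglass modules for the 98-point WFLW annotation.
model = FAN(num_modules=4, num_landmarks=98).to(device)
# model.load_state_dict(torch.load('ckpt/WFLW_4HG.pth')['state_dict'])  # placeholder checkpoint path

loaders, sizes = get_dataset('dataset/WFLW_test/images/',        # placeholder dataset paths
                             'dataset/WFLW_test/landmarks/',
                             batch_size=8, num_landmarks=98)

writer = SummaryWriter('experiments/eval_demo')
eval_model(model, loaders, sizes, writer,
           use_gpu=torch.cuda.is_available(),
           epoches=1, dataset='val',
           save_path='experiments/eval_demo', num_landmarks=98)
```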
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/core/models.py
ADDED
@@ -0,0 +1,228 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
from core.coord_conv import CoordConvTh


def conv3x3(in_planes, out_planes, strd=1, padding=1,
            bias=False, dilation=1):
    "3x3 convolution with padding"
    return nn.Conv2d(in_planes, out_planes, kernel_size=3,
                     stride=strd, padding=padding, bias=bias,
                     dilation=dilation)

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        # self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        # self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        # out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        # out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out

class ConvBlock(nn.Module):
    def __init__(self, in_planes, out_planes):
        super(ConvBlock, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = conv3x3(in_planes, int(out_planes / 2))
        self.bn2 = nn.BatchNorm2d(int(out_planes / 2))
        self.conv2 = conv3x3(int(out_planes / 2), int(out_planes / 4),
                             padding=1, dilation=1)
        self.bn3 = nn.BatchNorm2d(int(out_planes / 4))
        self.conv3 = conv3x3(int(out_planes / 4), int(out_planes / 4),
                             padding=1, dilation=1)

        if in_planes != out_planes:
            self.downsample = nn.Sequential(
                nn.BatchNorm2d(in_planes),
                nn.ReLU(True),
                nn.Conv2d(in_planes, out_planes,
                          kernel_size=1, stride=1, bias=False),
            )
        else:
            self.downsample = None

    def forward(self, x):
        residual = x

        out1 = self.bn1(x)
        out1 = F.relu(out1, True)
        out1 = self.conv1(out1)

        out2 = self.bn2(out1)
        out2 = F.relu(out2, True)
        out2 = self.conv2(out2)

        out3 = self.bn3(out2)
        out3 = F.relu(out3, True)
        out3 = self.conv3(out3)

        out3 = torch.cat((out1, out2, out3), 1)

        if self.downsample is not None:
            residual = self.downsample(residual)

        out3 += residual

        return out3

class HourGlass(nn.Module):
    def __init__(self, num_modules, depth, num_features, first_one=False):
        super(HourGlass, self).__init__()
        self.num_modules = num_modules
        self.depth = depth
        self.features = num_features
        self.coordconv = CoordConvTh(x_dim=64, y_dim=64,
                                     with_r=True, with_boundary=True,
                                     in_channels=256, first_one=first_one,
                                     out_channels=256,
                                     kernel_size=1,
                                     stride=1, padding=0)
        self._generate_network(self.depth)

    def _generate_network(self, level):
        self.add_module('b1_' + str(level), ConvBlock(256, 256))

        self.add_module('b2_' + str(level), ConvBlock(256, 256))

        if level > 1:
            self._generate_network(level - 1)
        else:
            self.add_module('b2_plus_' + str(level), ConvBlock(256, 256))

        self.add_module('b3_' + str(level), ConvBlock(256, 256))

    def _forward(self, level, inp):
        # Upper branch
        up1 = inp
        up1 = self._modules['b1_' + str(level)](up1)

        # Lower branch
        low1 = F.avg_pool2d(inp, 2, stride=2)
        low1 = self._modules['b2_' + str(level)](low1)

        if level > 1:
            low2 = self._forward(level - 1, low1)
        else:
            low2 = low1
            low2 = self._modules['b2_plus_' + str(level)](low2)

        low3 = low2
        low3 = self._modules['b3_' + str(level)](low3)

        up2 = F.upsample(low3, scale_factor=2, mode='nearest')

        return up1 + up2

    def forward(self, x, heatmap):
        x, last_channel = self.coordconv(x, heatmap)
        return self._forward(self.depth, x), last_channel

class FAN(nn.Module):

    def __init__(self, num_modules=1, end_relu=False, gray_scale=False,
                 num_landmarks=68):
        super(FAN, self).__init__()
        self.num_modules = num_modules
        self.gray_scale = gray_scale
        self.end_relu = end_relu
        self.num_landmarks = num_landmarks

        # Base part
        if self.gray_scale:
            self.conv1 = CoordConvTh(x_dim=256, y_dim=256,
                                     with_r=True, with_boundary=False,
                                     in_channels=3, out_channels=64,
                                     kernel_size=7,
                                     stride=2, padding=3)
        else:
            self.conv1 = CoordConvTh(x_dim=256, y_dim=256,
                                     with_r=True, with_boundary=False,
                                     in_channels=3, out_channels=64,
                                     kernel_size=7,
                                     stride=2, padding=3)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv2 = ConvBlock(64, 128)
        self.conv3 = ConvBlock(128, 128)
        self.conv4 = ConvBlock(128, 256)

        # Stacking part
        for hg_module in range(self.num_modules):
            if hg_module == 0:
                first_one = True
            else:
                first_one = False
            self.add_module('m' + str(hg_module), HourGlass(1, 4, 256,
                                                            first_one))
            self.add_module('top_m_' + str(hg_module), ConvBlock(256, 256))
            self.add_module('conv_last' + str(hg_module),
                            nn.Conv2d(256, 256, kernel_size=1, stride=1, padding=0))
            self.add_module('bn_end' + str(hg_module), nn.BatchNorm2d(256))
            self.add_module('l' + str(hg_module), nn.Conv2d(256,
                            num_landmarks+1, kernel_size=1, stride=1, padding=0))

            if hg_module < self.num_modules - 1:
                self.add_module(
                    'bl' + str(hg_module), nn.Conv2d(256, 256, kernel_size=1, stride=1, padding=0))
                self.add_module('al' + str(hg_module), nn.Conv2d(num_landmarks+1,
                                256, kernel_size=1, stride=1, padding=0))

    def forward(self, x):
        x, _ = self.conv1(x)
        x = F.relu(self.bn1(x), True)
        # x = F.relu(self.bn1(self.conv1(x)), True)
        x = F.avg_pool2d(self.conv2(x), 2, stride=2)
        x = self.conv3(x)
        x = self.conv4(x)

        previous = x

        outputs = []
        boundary_channels = []
        tmp_out = None
        for i in range(self.num_modules):
            hg, boundary_channel = self._modules['m' + str(i)](previous,
                                                               tmp_out)

            ll = hg
            ll = self._modules['top_m_' + str(i)](ll)

            ll = F.relu(self._modules['bn_end' + str(i)]
                        (self._modules['conv_last' + str(i)](ll)), True)

            # Predict heatmaps
            tmp_out = self._modules['l' + str(i)](ll)
            if self.end_relu:
                tmp_out = F.relu(tmp_out)  # HACK: Added relu
            outputs.append(tmp_out)
            boundary_channels.append(boundary_channel)

            if i < self.num_modules - 1:
                ll = self._modules['bl' + str(i)](ll)
                tmp_out_ = self._modules['al' + str(i)](tmp_out)
                previous = previous + ll + tmp_out_

        return outputs, boundary_channels
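
To make the tensor flow of the stacked-hourglass `FAN` above concrete, here is a small forward-pass sketch. It is my own example, not repository code; it assumes `AdaptiveWingLoss/` is the working directory and uses arbitrary inputs at the 256x256 resolution the network expects.

```python
import torch
from core.models import FAN

# 4 hourglass modules and 98 landmarks, matching the settings in scripts/eval_wflw.sh below.
model = FAN(num_modules=4, end_relu=False, gray_scale=False, num_landmarks=98)
model.eval()

x = torch.randn(1, 3, 256, 256)            # the base part downsamples 256x256 inputs to 64x64
with torch.no_grad():
    outputs, boundary_channels = model(x)

print(len(outputs))        # one heatmap stack per hourglass module: 4
print(outputs[-1].shape)   # torch.Size([1, 99, 64, 64]): 98 landmark maps + 1 boundary map
```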
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/eval.py
ADDED
@@ -0,0 +1,77 @@
from __future__ import print_function, division
import torch
import argparse
import numpy as np
import torch.nn as nn
import time
import os
from core.evaler import eval_model
from core.dataloader import get_dataset
from core import models
from tensorboardX import SummaryWriter

# Parse arguments
parser = argparse.ArgumentParser()
# Dataset paths
parser.add_argument('--val_img_dir', type=str,
                    help='Validation image directory')
parser.add_argument('--val_landmarks_dir', type=str,
                    help='Validation landmarks directory')
parser.add_argument('--num_landmarks', type=int, default=68,
                    help='Number of landmarks')

# Checkpoint and pretrained weights
parser.add_argument('--ckpt_save_path', type=str,
                    help='a directory to save checkpoint file')
parser.add_argument('--pretrained_weights', type=str,
                    help='path to pretrained weights to load')

# Eval options
parser.add_argument('--batch_size', type=int, default=25,
                    help='batch size used during evaluation')

# Network parameters
parser.add_argument('--hg_blocks', type=int, default=4,
                    help='Number of HG blocks to stack')
parser.add_argument('--gray_scale', type=str, default="False",
                    help='Whether to convert RGB image into gray scale during training')
parser.add_argument('--end_relu', type=str, default="False",
                    help='Whether to add relu at the end of each HG module')

args = parser.parse_args()

VAL_IMG_DIR = args.val_img_dir
VAL_LANDMARKS_DIR = args.val_landmarks_dir
CKPT_SAVE_PATH = args.ckpt_save_path
BATCH_SIZE = args.batch_size
PRETRAINED_WEIGHTS = args.pretrained_weights
GRAY_SCALE = False if args.gray_scale == 'False' else True
HG_BLOCKS = args.hg_blocks
END_RELU = False if args.end_relu == 'False' else True
NUM_LANDMARKS = args.num_landmarks

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

writer = SummaryWriter(CKPT_SAVE_PATH)

dataloaders, dataset_sizes = get_dataset(VAL_IMG_DIR, VAL_LANDMARKS_DIR,
                                         BATCH_SIZE, NUM_LANDMARKS)
use_gpu = torch.cuda.is_available()
model_ft = models.FAN(HG_BLOCKS, END_RELU, GRAY_SCALE, NUM_LANDMARKS)

if PRETRAINED_WEIGHTS != "None":
    checkpoint = torch.load(PRETRAINED_WEIGHTS)
    if 'state_dict' not in checkpoint:
        model_ft.load_state_dict(checkpoint)
    else:
        pretrained_weights = checkpoint['state_dict']
        model_weights = model_ft.state_dict()
        pretrained_weights = {k: v for k, v in pretrained_weights.items()
                              if k in model_weights}
        model_weights.update(pretrained_weights)
        model_ft.load_state_dict(model_weights)

model_ft = model_ft.to(device)

model_ft = eval_model(model_ft, dataloaders, dataset_sizes, writer, use_gpu, 1, 'val', CKPT_SAVE_PATH, NUM_LANDMARKS)
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw.png
ADDED (binary image, tracked with Git LFS)

marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/images/wflw_table.png
ADDED (binary image, tracked with Git LFS)
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/requirements.txt
ADDED
@@ -0,0 +1,12 @@
opencv-python
scipy>=0.17.0
scikit-image
numpy
matplotlib
Pillow>=4.3.0
imgaug
tensorflow
git+https://github.com/lanpa/tensorboardX
joblib
torch==1.3.0
torchvision==0.4.1
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/scripts/eval_wflw.sh
ADDED
@@ -0,0 +1,10 @@
CUDA_VISIBLE_DEVICES=1 python ../eval.py \
    --val_img_dir='../dataset/WFLW_test/images/' \
    --val_landmarks_dir='../dataset/WFLW_test/landmarks/' \
    --ckpt_save_path='../experiments/eval_iccv_0620' \
    --hg_blocks=4 \
    --pretrained_weights='../ckpt/WFLW_4HG.pth' \
    --num_landmarks=98 \
    --end_relu='False' \
    --batch_size=20 \
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__init__.py
ADDED (file without changes)

marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/__init__.cpython-37.pyc
ADDED (binary file, 170 Bytes)

marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/__init__.cpython-39.pyc
ADDED (binary file, 185 Bytes)

marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/utils.cpython-37.pyc
ADDED (binary file, 11.8 kB)

marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/__pycache__/utils.cpython-39.pyc
ADDED (binary file, 11.6 kB)
marlenezw/audio-driven-animations/MakeItTalk/AdaptiveWingLoss/utils/utils.py
ADDED
@@ -0,0 +1,354 @@
1 |
+
from __future__ import print_function, division
|
2 |
+
import os
|
3 |
+
import sys
|
4 |
+
import math
|
5 |
+
import torch
|
6 |
+
import cv2
|
7 |
+
from PIL import Image
|
8 |
+
from skimage import io
|
9 |
+
from skimage import transform as ski_transform
|
10 |
+
from scipy import ndimage
|
11 |
+
import numpy as np
|
12 |
+
import matplotlib
|
13 |
+
import matplotlib.pyplot as plt
|
14 |
+
from torch.utils.data import Dataset, DataLoader
|
15 |
+
from torchvision import transforms, utils
|
16 |
+
|
17 |
+
def _gaussian(
|
18 |
+
size=3, sigma=0.25, amplitude=1, normalize=False, width=None,
|
19 |
+
height=None, sigma_horz=None, sigma_vert=None, mean_horz=0.5,
|
20 |
+
mean_vert=0.5):
|
21 |
+
# handle some defaults
|
22 |
+
if width is None:
|
23 |
+
width = size
|
24 |
+
if height is None:
|
25 |
+
height = size
|
26 |
+
if sigma_horz is None:
|
27 |
+
sigma_horz = sigma
|
28 |
+
if sigma_vert is None:
|
29 |
+
sigma_vert = sigma
|
30 |
+
center_x = mean_horz * width + 0.5
|
31 |
+
center_y = mean_vert * height + 0.5
|
32 |
+
gauss = np.empty((height, width), dtype=np.float32)
|
33 |
+
# generate kernel
|
34 |
+
for i in range(height):
|
35 |
+
for j in range(width):
|
36 |
+
gauss[i][j] = amplitude * math.exp(-(math.pow((j + 1 - center_x) / (
|
37 |
+
sigma_horz * width), 2) / 2.0 + math.pow((i + 1 - center_y) / (sigma_vert * height), 2) / 2.0))
|
38 |
+
if normalize:
|
39 |
+
gauss = gauss / np.sum(gauss)
|
40 |
+
return gauss
|
41 |
+
|
42 |
+
def draw_gaussian(image, point, sigma):
|
43 |
+
# Check if the gaussian is inside
|
44 |
+
ul = [np.floor(np.floor(point[0]) - 3 * sigma),
|
45 |
+
np.floor(np.floor(point[1]) - 3 * sigma)]
|
46 |
+
br = [np.floor(np.floor(point[0]) + 3 * sigma),
|
47 |
+
np.floor(np.floor(point[1]) + 3 * sigma)]
|
48 |
+
if (ul[0] > image.shape[1] or ul[1] >
|
49 |
+
image.shape[0] or br[0] < 1 or br[1] < 1):
|
50 |
+
return image
|
51 |
+
size = 6 * sigma + 1
|
52 |
+
g = _gaussian(size)
|
53 |
+
g_x = [int(max(1, -ul[0])), int(min(br[0], image.shape[1])) -
|
54 |
+
int(max(1, ul[0])) + int(max(1, -ul[0]))]
|
55 |
+
g_y = [int(max(1, -ul[1])), int(min(br[1], image.shape[0])) -
|
56 |
+
int(max(1, ul[1])) + int(max(1, -ul[1]))]
|
57 |
+
img_x = [int(max(1, ul[0])), int(min(br[0], image.shape[1]))]
|
58 |
+
img_y = [int(max(1, ul[1])), int(min(br[1], image.shape[0]))]
|
59 |
+
assert (g_x[0] > 0 and g_y[1] > 0)
|
60 |
+
correct = False
|
61 |
+
while not correct:
|
62 |
+
try:
|
63 |
+
image[img_y[0] - 1:img_y[1], img_x[0] - 1:img_x[1]
|
64 |
+
] = image[img_y[0] - 1:img_y[1], img_x[0] - 1:img_x[1]] + g[g_y[0] - 1:g_y[1], g_x[0] - 1:g_x[1]]
|
65 |
+
correct = True
|
66 |
+
except:
|
67 |
+
print('img_x: {}, img_y: {}, g_x:{}, g_y:{}, point:{}, g_shape:{}, ul:{}, br:{}'.format(img_x, img_y, g_x, g_y, point, g.shape, ul, br))
|
68 |
+
ul = [np.floor(np.floor(point[0]) - 3 * sigma),
|
69 |
+
np.floor(np.floor(point[1]) - 3 * sigma)]
|
70 |
+
br = [np.floor(np.floor(point[0]) + 3 * sigma),
|
71 |
+
np.floor(np.floor(point[1]) + 3 * sigma)]
|
72 |
+
g_x = [int(max(1, -ul[0])), int(min(br[0], image.shape[1])) -
|
73 |
+
int(max(1, ul[0])) + int(max(1, -ul[0]))]
|
74 |
+
g_y = [int(max(1, -ul[1])), int(min(br[1], image.shape[0])) -
|
75 |
+
int(max(1, ul[1])) + int(max(1, -ul[1]))]
|
76 |
+
img_x = [int(max(1, ul[0])), int(min(br[0], image.shape[1]))]
|
77 |
+
img_y = [int(max(1, ul[1])), int(min(br[1], image.shape[0]))]
|
78 |
+
pass
|
79 |
+
image[image > 1] = 1
|
80 |
+
return image
|
81 |
+
|
82 |
+
def transform(point, center, scale, resolution, rotation=0, invert=False):
|
83 |
+
_pt = np.ones(3)
|
84 |
+
_pt[0] = point[0]
|
85 |
+
_pt[1] = point[1]
|
86 |
+
|
87 |
+
h = 200.0 * scale
|
88 |
+
t = np.eye(3)
|
89 |
+
t[0, 0] = resolution / h
|
90 |
+
t[1, 1] = resolution / h
|
91 |
+
t[0, 2] = resolution * (-center[0] / h + 0.5)
|
92 |
+
t[1, 2] = resolution * (-center[1] / h + 0.5)
|
93 |
+
|
94 |
+
if rotation != 0:
|
95 |
+
rotation = -rotation
|
96 |
+
r = np.eye(3)
|
97 |
+
ang = rotation * math.pi / 180.0
|
98 |
+
s = math.sin(ang)
|
99 |
+
c = math.cos(ang)
|
100 |
+
r[0][0] = c
|
101 |
+
r[0][1] = -s
|
102 |
+
r[1][0] = s
|
103 |
+
r[1][1] = c
|
104 |
+
|
105 |
+
t_ = np.eye(3)
|
106 |
+
t_[0][2] = -resolution / 2.0
|
107 |
+
t_[1][2] = -resolution / 2.0
|
108 |
+
t_inv = torch.eye(3)
|
109 |
+
t_inv[0][2] = resolution / 2.0
|
110 |
+
t_inv[1][2] = resolution / 2.0
|
111 |
+
t = reduce(np.matmul, [t_inv, r, t_, t])
|
112 |
+
|
113 |
+
if invert:
|
114 |
+
t = np.linalg.inv(t)
|
115 |
+
new_point = (np.matmul(t, _pt))[0:2]
|
116 |
+
|
117 |
+
return new_point.astype(int)
|
118 |
+
|
119 |
+
def cv_crop(image, landmarks, center, scale, resolution=256, center_shift=0):
|
120 |
+
new_image = cv2.copyMakeBorder(image, center_shift,
|
121 |
+
center_shift,
|
122 |
+
center_shift,
|
123 |
+
center_shift,
|
124 |
+
cv2.BORDER_CONSTANT, value=[0,0,0])
|
125 |
+
new_landmarks = landmarks.copy()
|
126 |
+
if center_shift != 0:
|
127 |
+
center[0] += center_shift
|
128 |
+
center[1] += center_shift
|
129 |
+
new_landmarks = new_landmarks + center_shift
|
130 |
+
length = 200 * scale
|
131 |
+
top = int(center[1] - length // 2)
|
132 |
+
bottom = int(center[1] + length // 2)
|
133 |
+
left = int(center[0] - length // 2)
|
134 |
+
right = int(center[0] + length // 2)
|
135 |
+
y_pad = abs(min(top, new_image.shape[0] - bottom, 0))
|
136 |
+
x_pad = abs(min(left, new_image.shape[1] - right, 0))
|
137 |
+
top, bottom, left, right = top + y_pad, bottom + y_pad, left + x_pad, right + x_pad
|
138 |
+
new_image = cv2.copyMakeBorder(new_image, y_pad,
|
139 |
+
y_pad,
|
140 |
+
x_pad,
|
141 |
+
x_pad,
|
142 |
+
cv2.BORDER_CONSTANT, value=[0,0,0])
|
143 |
+
new_image = new_image[top:bottom, left:right]
|
144 |
+
new_image = cv2.resize(new_image, dsize=(int(resolution), int(resolution)),
|
145 |
+
interpolation=cv2.INTER_LINEAR)
|
146 |
+
new_landmarks[:, 0] = (new_landmarks[:, 0] + x_pad - left) * resolution / length
|
147 |
+
new_landmarks[:, 1] = (new_landmarks[:, 1] + y_pad - top) * resolution / length
|
148 |
+
return new_image, new_landmarks
|
149 |
+
|
150 |
+
def cv_rotate(image, landmarks, heatmap, rot, scale, resolution=256):
|
151 |
+
img_mat = cv2.getRotationMatrix2D((resolution//2, resolution//2), rot, scale)
|
152 |
+
ones = np.ones(shape=(landmarks.shape[0], 1))
|
153 |
+
stacked_landmarks = np.hstack([landmarks, ones])
|
154 |
+
new_landmarks = img_mat.dot(stacked_landmarks.T).T
|
155 |
+
if np.max(new_landmarks) > 255 or np.min(new_landmarks) < 0:
|
156 |
+
return image, landmarks, heatmap
|
157 |
+
else:
|
158 |
+
new_image = cv2.warpAffine(image, img_mat, (resolution, resolution))
|
159 |
+
if heatmap is not None:
|
160 |
+
new_heatmap = np.zeros((heatmap.shape[0], 64, 64))
|
161 |
+
for i in range(heatmap.shape[0]):
|
162 |
+
if new_landmarks[i][0] > 0:
|
163 |
+
new_heatmap[i] = draw_gaussian(new_heatmap[i],
|
164 |
+
new_landmarks[i]/4.0+1, 1)
|
165 |
+
return new_image, new_landmarks, new_heatmap
|
166 |
+
|
167 |
+
def show_landmarks(image, heatmap, gt_landmarks, gt_heatmap):
|
168 |
+
"""Show image with pred_landmarks"""
|
169 |
+
pred_landmarks = []
|
170 |
+
pred_landmarks, _ = get_preds_fromhm(torch.from_numpy(heatmap).unsqueeze(0))
|
171 |
+
pred_landmarks = pred_landmarks.squeeze()*4
|
172 |
+
|
173 |
+
# pred_landmarks2 = get_preds_fromhm2(heatmap)
|
174 |
+
heatmap = np.max(gt_heatmap, axis=0)
|
175 |
+
heatmap = heatmap / np.max(heatmap)
|
176 |
+
# image = ski_transform.resize(image, (64, 64))*255
|
177 |
+
image = image.astype(np.uint8)
|
178 |
+
heatmap = np.max(gt_heatmap, axis=0)
|
179 |
+
heatmap = ski_transform.resize(heatmap, (image.shape[0], image.shape[1]))
|
180 |
+
heatmap *= 255
|
181 |
+
heatmap = heatmap.astype(np.uint8)
|
182 |
+
heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)
|
183 |
+
plt.imshow(image)
|
184 |
+
plt.scatter(gt_landmarks[:, 0], gt_landmarks[:, 1], s=0.5, marker='.', c='g')
|
185 |
+
plt.scatter(pred_landmarks[:, 0], pred_landmarks[:, 1], s=0.5, marker='.', c='r')
|
186 |
+
plt.pause(0.001) # pause a bit so that plots are updated
|
187 |
+
|
188 |
+
def fan_NME(pred_heatmaps, gt_landmarks, num_landmarks=68):
|
189 |
+
'''
|
190 |
+
Calculate total NME for a batch of data
|
    Args:
        pred_heatmaps: torch tensor of size [batch, points, height, width]
        gt_landmarks: torch tensor of size [batch, points, x, y]

    Returns:
        nme: sum of nme for this batch
    '''
    nme = 0
    pred_landmarks, _ = get_preds_fromhm(pred_heatmaps)
    pred_landmarks = pred_landmarks.numpy()
    gt_landmarks = gt_landmarks.numpy()
    for i in range(pred_landmarks.shape[0]):
        pred_landmark = pred_landmarks[i] * 4.0
        gt_landmark = gt_landmarks[i]

        if num_landmarks == 68:
            left_eye = np.average(gt_landmark[36:42], axis=0)
            right_eye = np.average(gt_landmark[42:48], axis=0)
            norm_factor = np.linalg.norm(left_eye - right_eye)
            # norm_factor = np.linalg.norm(gt_landmark[36] - gt_landmark[45])
        elif num_landmarks == 98:
            norm_factor = np.linalg.norm(gt_landmark[60] - gt_landmark[72])
        elif num_landmarks == 19:
            left, top = gt_landmark[-2, :]
            right, bottom = gt_landmark[-1, :]
            norm_factor = math.sqrt(abs(right - left) * abs(top - bottom))
            gt_landmark = gt_landmark[:-2, :]
        elif num_landmarks == 29:
            # norm_factor = np.linalg.norm(gt_landmark[8] - gt_landmark[9])
            norm_factor = np.linalg.norm(gt_landmark[16] - gt_landmark[17])
        nme += (np.sum(np.linalg.norm(pred_landmark - gt_landmark, axis=1)) / pred_landmark.shape[0]) / norm_factor
    return nme


def fan_NME_hm(pred_heatmaps, gt_landmarks, num_landmarks=68):
    '''
    Calculate total NME for a batch of data

    Args:
        pred_heatmaps: torch tensor of size [batch, points, height, width]
        gt_landmarks: torch tensor of size [batch, points, x, y]

    Returns:
        nme: sum of nme for this batch
    '''
    nme = 0
    pred_landmarks = get_index_fromhm(pred_heatmaps)
    pred_landmarks = pred_landmarks.numpy()
    gt_landmarks = gt_landmarks.numpy()
    for i in range(pred_landmarks.shape[0]):
        pred_landmark = pred_landmarks[i] * 4.0
        gt_landmark = gt_landmarks[i]
        if num_landmarks == 68:
            left_eye = np.average(gt_landmark[36:42], axis=0)
            right_eye = np.average(gt_landmark[42:48], axis=0)
            norm_factor = np.linalg.norm(left_eye - right_eye)
        else:
            norm_factor = np.linalg.norm(gt_landmark[60] - gt_landmark[72])
        nme += (np.sum(np.linalg.norm(pred_landmark - gt_landmark, axis=1)) / pred_landmark.shape[0]) / norm_factor
    return nme


def power_transform(img, power):
    img = np.array(img)
    img_new = np.power((img / 255.0), power) * 255.0
    img_new = img_new.astype(np.uint8)
    img_new = Image.fromarray(img_new)
    return img_new


def get_preds_fromhm(hm, center=None, scale=None, rot=None):
    _, idx = torch.max(
        hm.view(hm.size(0), hm.size(1), hm.size(2) * hm.size(3)), 2)
    idx += 1
    preds = idx.view(idx.size(0), idx.size(1), 1).repeat(1, 1, 2).float()
    preds[..., 0].apply_(lambda x: (x - 1) % hm.size(3) + 1)
    preds[..., 1].add_(-1).div_(hm.size(2)).floor_().add_(1)

    for i in range(preds.size(0)):
        for j in range(preds.size(1)):
            hm_ = hm[i, j, :]
            pX, pY = int(preds[i, j, 0]) - 1, int(preds[i, j, 1]) - 1
            if pX > 0 and pX < 63 and pY > 0 and pY < 63:
                diff = torch.FloatTensor(
                    [hm_[pY, pX + 1] - hm_[pY, pX - 1],
                     hm_[pY + 1, pX] - hm_[pY - 1, pX]])
                preds[i, j].add_(diff.sign_().mul_(.25))

    preds.add_(-0.5)

    preds_orig = torch.zeros(preds.size())
    if center is not None and scale is not None:
        for i in range(hm.size(0)):
            for j in range(hm.size(1)):
                preds_orig[i, j] = transform(
                    preds[i, j], center, scale, hm.size(2), rot, True)

    return preds, preds_orig


def get_index_fromhm(hm):
    _, idx = torch.max(
        hm.view(hm.size(0), hm.size(1), hm.size(2) * hm.size(3)), 2)
    preds = idx.view(idx.size(0), idx.size(1), 1).repeat(1, 1, 2).float()
    preds[..., 0].remainder_(hm.size(3))
    preds[..., 1].div_(hm.size(2)).floor_()

    for i in range(preds.size(0)):
        for j in range(preds.size(1)):
            hm_ = hm[i, j, :]
            pX, pY = int(preds[i, j, 0]), int(preds[i, j, 1])
            if pX > 0 and pX < 63 and pY > 0 and pY < 63:
                diff = torch.FloatTensor(
                    [hm_[pY, pX + 1] - hm_[pY, pX - 1],
                     hm_[pY + 1, pX] - hm_[pY - 1, pX]])
                preds[i, j].add_(diff.sign_().mul_(.25))

    return preds


def shuffle_lr(parts, num_landmarks=68, pairs=None):
    if num_landmarks == 68:
        if pairs is None:
            pairs = [[0, 16], [1, 15], [2, 14], [3, 13], [4, 12], [5, 11], [6, 10],
                     [7, 9], [17, 26], [18, 25], [19, 24], [20, 23], [21, 22], [36, 45],
                     [37, 44], [38, 43], [39, 42], [41, 46], [40, 47], [31, 35], [32, 34],
                     [50, 52], [49, 53], [48, 54], [61, 63], [60, 64], [67, 65], [59, 55], [58, 56]]
    elif num_landmarks == 98:
        if pairs is None:
            pairs = [[0, 32], [1, 31], [2, 30], [3, 29], [4, 28], [5, 27], [6, 26], [7, 25], [8, 24], [9, 23], [10, 22], [11, 21], [12, 20], [13, 19], [14, 18], [15, 17], [33, 46], [34, 45], [35, 44], [36, 43], [37, 42], [38, 50], [39, 49], [40, 48], [41, 47], [60, 72], [61, 71], [62, 70], [63, 69], [64, 68], [65, 75], [66, 74], [67, 73], [96, 97], [55, 59], [56, 58], [76, 82], [77, 81], [78, 80], [88, 92], [89, 91], [95, 93], [87, 83], [86, 84]]
    elif num_landmarks == 19:
        if pairs is None:
            pairs = [[0, 5], [1, 4], [2, 3], [6, 11], [7, 10], [8, 9], [12, 14], [15, 17]]
    elif num_landmarks == 29:
        if pairs is None:
            pairs = [[0, 1], [4, 6], [5, 7], [2, 3], [8, 9], [12, 14], [16, 17], [13, 15], [10, 11], [18, 19], [22, 23]]
    for matched_p in pairs:
        idx1, idx2 = matched_p[0], matched_p[1]
        tmp = np.copy(parts[idx1])
        np.copyto(parts[idx1], parts[idx2])
        np.copyto(parts[idx2], tmp)
    return parts


def generate_weight_map(weight_map, heatmap):
    k_size = 3
    dilate = ndimage.grey_dilation(heatmap, size=(k_size, k_size))
    weight_map[np.where(dilate > 0.2)] = 1
    return weight_map


def fig2data(fig):
    """
    @brief Convert a Matplotlib figure to a numpy array with RGB channels and return it
    @param fig a matplotlib figure
    @return a numpy 3D array of RGB values
    """
    # draw the renderer
    fig.canvas.draw()

    # get the RGB buffer from the figure and reshape it to (height, width, 3)
    w, h = fig.canvas.get_width_height()
    buf = np.frombuffer(fig.canvas.tostring_rgb(), dtype=np.uint8)
    buf = buf.reshape(h, w, 3)
    return buf
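As a rough illustration of how the evaluation helpers above fit together, the following sketch builds a random batch and computes a mean NME. It assumes `fan_NME` is defined earlier in this utils module with the signature implied by its docstring (`fan_NME(pred_heatmaps, gt_landmarks, num_landmarks)`), and the import path is hypothetical; this is a minimal example, not part of the repository.

```python
import numpy as np
import torch

# Hypothetical import path; adjust to wherever this utils module lives.
from utils.utils import fan_NME, shuffle_lr

batch, points = 2, 68
pred_heatmaps = torch.rand(batch, points, 64, 64)     # predicted 64x64 heatmaps
gt_landmarks = torch.rand(batch, points, 2) * 255.0   # ground-truth (x, y) coordinates

# fan_NME returns the *sum* of per-image NME values, so divide by the batch size.
mean_nme = fan_NME(pred_heatmaps, gt_landmarks, num_landmarks=points) / batch
print(mean_nme)

# shuffle_lr swaps the left/right landmark index pairs in the given array,
# the bookkeeping step needed after horizontally mirroring an image.
landmarks = np.random.rand(points, 2)
flipped = shuffle_lr(landmarks.copy(), num_landmarks=points)
```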
marlenezw/audio-driven-animations/MakeItTalk/__init__.py
ADDED
File without changes

marlenezw/audio-driven-animations/MakeItTalk/__pycache__/__init__.cpython-37.pyc
ADDED
Binary file (147 Bytes)

marlenezw/audio-driven-animations/MakeItTalk/__pycache__/__init__.cpython-39.pyc
ADDED
Binary file (162 Bytes)
marlenezw/audio-driven-animations/MakeItTalk/face_of_art/CODEOWNERS
ADDED
@@ -0,0 +1 @@
* @papulke
marlenezw/audio-driven-animations/MakeItTalk/face_of_art/LICENCE.txt
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 Jordan Yaniv

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
OR OTHER DEALINGS IN THE SOFTWARE.
marlenezw/audio-driven-animations/MakeItTalk/face_of_art/README.md
ADDED
@@ -0,0 +1,98 @@
# The Face of Art: Landmark Detection and Geometric Style in Portraits

Code for the landmark detection framework described in [The Face of Art: Landmark Detection and Geometric Style in Portraits](http://www.faculty.idc.ac.il/arik/site/foa/face-of-art.asp) (SIGGRAPH 2019)


<sub><sup>Top: landmark detection results on artistic portraits with different styles allow defining the geometric style of an artist. Bottom: results of the style transfer of portraits using various artists' geometric style, including Amedeo Modigliani, Pablo Picasso, Margaret Keane, Fernand Léger, and Tsuguharu Foujita. Top right portrait is from 'Woman with Peanuts,' ©1962, Estate of Roy Lichtenstein.</sup></sub>

## Getting Started

### Requirements

* python
* anaconda

### Download

#### Model
Download model weights from [here](https://www.dropbox.com/sh/hrxcyug1bmbj6cs/AAAxq_zI5eawcLjM8zvUwaXha?dl=0).

#### Datasets
* The datasets used for training and evaluating our model can be found [here](https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/).

* The Artistic-Faces dataset can be found [here](http://www.faculty.idc.ac.il/arik/site/foa/artistic-faces-dataset.asp).

* Training images with texture augmentation can be found [here](https://www.dropbox.com/sh/av2k1i1082z0nie/AAC5qV1E2UkqpDLVsv7TazMta?dl=0).
Before applying texture style transfer, the training images were cropped to the ground-truth face bounding box with a 25% margin. To crop training images, run the script `crop_training_set.py`.

* Our model expects the following directory structure for landmark detection datasets:
```
landmark_detection_datasets
├── training
├── test
├── challenging
├── common
├── full
├── crop_gt_margin_0.25 (cropped images of training set)
└── crop_gt_margin_0.25_ns (cropped images of training set + texture style transfer)
```
### Install

Create a virtual environment and install the following:
* opencv
* menpo
* menpofit
* tensorflow-gpu

For Python 2:
```
conda create -n foa_env python=2.7 anaconda
source activate foa_env
conda install -c menpo opencv
conda install -c menpo menpo
conda install -c menpo menpofit
pip install tensorflow-gpu
```

For Python 3:
```
conda create -n foa_env python=3.5 anaconda
source activate foa_env
conda install -c menpo opencv
conda install -c menpo menpo
conda install -c menpo menpofit
pip3 install tensorflow-gpu
```

Clone the repository:

```
git clone https://github.com/papulke/deep_face_heatmaps
```

## Instructions

### Training

To train the network, run `train_heatmaps_network.py`.

Example of training a model with texture augmentation (100% of images) and geometric augmentation (~70% of images):
```
python train_heatmaps_network.py --output_dir='test_artistic_aug' --augment_geom=True \
--augment_texture=True --p_texture=1. --p_geom=0.7
```

### Testing

To use the detection framework to predict landmarks, run the script `predict_landmarks.py`.

## Acknowledgments

* [ect](https://github.com/HongwenZhang/ECT-FaceAlignment)
* [menpo](https://github.com/menpo/menpo)
* [menpofit](https://github.com/menpo/menpofit)
* [mdm](https://github.com/trigeorgis/mdm)
* [style transfer implementation](https://github.com/woodrush/neural-art-tf)
* [painter-by-numbers dataset](https://www.kaggle.com/c/painter-by-numbers/data)
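A side note on the `--p_texture` and `--p_geom` flags in the training example above: they set the fraction of training images that receive texture and geometric augmentation. The snippet below is a purely illustrative sketch of per-image probability sampling, not the repository's actual implementation; the augmentation functions are stand-in names.

```python
import random

def texture_style_transfer(image):
    # Stand-in; the real pipeline applies neural texture style transfer here.
    return image

def geometric_deformation(image):
    # Stand-in; the real pipeline applies geometric warping here.
    return image

def augment(image, p_texture=1.0, p_geom=0.7):
    # Each image independently receives texture augmentation with probability
    # p_texture and geometric augmentation with probability p_geom.
    if random.random() < p_texture:
        image = texture_style_transfer(image)
    if random.random() < p_geom:
        image = geometric_deformation(image)
    return image
```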
marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__init__.py
ADDED
File without changes

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__init__.pyc
ADDED
Binary file (161 Bytes)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/__init__.cpython-36.pyc
ADDED
Binary file (157 Bytes)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/data_loading_functions.cpython-36.pyc
ADDED
Binary file (4.56 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/deep_heatmaps_model_fusion_net.cpython-36.pyc
ADDED
Binary file (21.6 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/deformation_functions.cpython-36.pyc
ADDED
Binary file (9 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/logging_functions.cpython-36.pyc
ADDED
Binary file (5.81 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/menpo_functions.cpython-36.pyc
ADDED
Binary file (9.22 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/ops.cpython-36.pyc
ADDED
Binary file (3.6 kB)

marlenezw/audio-driven-animations/MakeItTalk/face_of_art/__pycache__/pdm_clm_functions.cpython-36.pyc
ADDED
Binary file (6.34 kB)