AVSpeech

AVSpeech implemented in TensorFlow

Ephrat A, Mosseri I, Lang O, et al. Looking to listen at the cocktail party: a speaker-independent audio-visual model for speech separation[J]. ACM Transactions on Graphics, 2018, 37(4): 1-11.

Unlike the paper, we use ResNet-18 in place of the dilated CNNs to save GPU memory and allow a larger batch size. To compensate for the difference between the audio and video sampling rates, deconvolution layers are used. As a result, the output size is fixed rather than variable as in the paper's model.
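To illustrate how a deconvolution (transposed convolution) layer can bring the slower video stream up to the audio time resolution, here is a minimal pure-Python sketch. It is not the code in Googlemodel.py: the function name `conv1d_transpose`, the frame count of 75, and the stride and kernel size of 4 are all assumptions chosen for illustration.

```python
def conv1d_transpose(x, kernel, stride):
    """Full 1-D transposed convolution with no cropping (illustrative only).

    Each input sample is multiplied by the kernel and added into the output
    starting at position i * stride, so the output length is always
    (len(x) - 1) * stride + len(kernel).
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            out[i * stride + k] += xi * wk
    return out


# Tiny example: an all-ones kernel of width 2 with stride 2 repeats each sample.
upsampled = conv1d_transpose([1.0, 2.0, 3.0], [1.0, 1.0], stride=2)

# Hypothetical sizes: 75 video feature frames upsampled by a factor of 4.
video_frames = [0.0] * 75
audio_rate_frames = conv1d_transpose(video_frames, [1.0] * 4, stride=4)
```

Because the output length depends only on the input length, stride, and kernel size, a network built this way produces an output of one fixed size for a given input length.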

The ResNet model comes from https://github.com/ry/tensorflow-resnet

facenet is used to pre-process the video part (face images).
The facenet model comes from https://github.com/davidsandberg/facenet

There are two main scripts:
Googletrain_audioonly.py trains a model without the video part.
PIT (permutation invariant training) is used to train the audio-only model.

Googletrain.py trains a model with both the video and audio parts.
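For readers unfamiliar with the PIT used by the audio-only model, the idea can be sketched as follows: the loss is computed for every assignment of estimated sources to reference speakers, and the smallest one is kept. This is a minimal pure-Python sketch, not the code in Googletrain_audioonly.py; the names `mse` and `pit_loss` and the choice of mean squared error are assumptions made for illustration.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Return (lowest average loss, permutation that achieves it).

    perm[i] is the index of the target speaker assigned to estimate i.
    """
    n = len(targets)
    best_loss, best_perm = None, None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# The model outputs the two speakers in swapped order; PIT still scores it as perfect.
estimates = [[1.0, 1.0], [0.0, 0.0]]
targets = [[0.0, 0.0], [1.0, 1.0]]
loss, perm = pit_loss(estimates, targets)
```

Trying every permutation costs n! loss evaluations, which is cheap for two or three speakers but grows quickly beyond that.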

Repository files:
align
.gitignore
Googlemodel.py
Googletrain.py
Googletrain_audioonly.py
README.md
config.py
eachmat2all.py
eachmat2batches.py
eachmat2batcheswav.py
face_detect.py
facenet.py
haarcascade_frontalface_default.xml
loaddata.py
maketfrecord.py
maketfrecord_AVSpeech.py
model1.py
model1audioonly.py
pb2tensorboard.py
pic2feature.py
resnet.py
video2frame.py
wav2part.py

Repository: Vanka0051/AVSpeech
