Audio-Assisted Image Inpainting for Talking Faces
Alexandros Koumparoulis, Gerasimos Potamianos, et al.
ICASSP 2020
In this paper, we propose a new, fast, and flexible algorithm based on the maximum entropy (MAXENT) criterion to estimate stream weights in a state-synchronous multi-stream HMM. The technique is compared to the minimum classification error (MCE) criterion and to a brute-force grid-search optimization of the word error rate (WER) on both a small- and a large-vocabulary audio-visual continuous speech recognition task. When estimating global stream weights, the MAXENT approach gives results comparable to those of grid search and MCE. Estimation of state-dependent weights is also considered: we observe significant improvements in both the MAXENT and MCE criteria, which, however, do not result in significant WER gains.
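To make the grid-search baseline mentioned in the abstract concrete, the following is a minimal sketch of choosing a single global audio stream weight by exhaustive search. It is not the MAXENT or MCE method, and it makes several simplifying assumptions: the two stream weights sum to one, per-class log-likelihoods stand in for HMM state scores, and frame-level classification error replaces the WER as the objective. The function names `combine_streams` and `grid_search_weight` are hypothetical and chosen only for illustration.

```python
import numpy as np


def combine_streams(audio_loglik, visual_loglik, audio_weight):
    """Weighted sum of per-class stream log-likelihoods.

    Assumption: the visual weight is 1 - audio_weight, so the two
    stream weights sum to one.
    """
    return audio_weight * audio_loglik + (1.0 - audio_weight) * visual_loglik


def grid_search_weight(audio_loglik, visual_loglik, labels, num_points=11):
    """Return (weight, error) for the audio weight in [0, 1] with the
    lowest classification error on the given data.

    Assumption: classification error is used here as a simple stand-in
    for the WER; ties are resolved in favor of the smallest weight.
    """
    best_weight, best_error = 0.0, float("inf")
    for weight in np.linspace(0.0, 1.0, num_points):
        scores = combine_streams(audio_loglik, visual_loglik, weight)
        predictions = np.argmax(scores, axis=1)
        error = float(np.mean(predictions != labels))
        if error < best_error:
            best_weight, best_error = float(weight), error
    return best_weight, best_error
```

The cost of such a search grows with the number of grid points and, for state-dependent weights, with the number of states, which is one reason a direct estimation criterion can be attractive.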
Chalapathy Neti, Gerasimos Potamianos, et al.
MMSP 2001
Ambrish Tyagi, Mark Keck, et al.
CVPR 2007
Alexandros Koumparoulis, Gerasimos Potamianos, et al.
INTERSPEECH 2020