An audio-visual imposture scenario by talking face animation

Karam, Walid; Mokbel, Chafic; Greige, Hanna; Aversano, Guido; Pelachaud, Catherine; Chollet, Gérard

UOBScholar Hub

UOB Libraries created the UOBScholar Hub, an Institutional Repository (IR) for archiving and collecting all the research output of the UOB community. It aims to improve the visibility, usage and impact of research conducted at UOB. Materials included are: academic journal articles, conference papers and presentations, books and book chapters, ongoing research papers, reports and patents.

Please use this identifier to cite or link to this item: https://scholarhub.balamand.edu.lb/handle/uob/409

Title:	An audio-visual imposture scenario by talking face animation
Authors:	Karam, Walid; Mokbel, Chafic; Greige, Hanna; Aversano, Guido; Pelachaud, Catherine; Chollet, Gérard
Affiliations:	Department of Computer Engineering; Department of Electrical Engineering; Department of Mathematics
Keywords:	Equal Error Rate; Visual Speech; Speaker Verification; Automatic Speech Recognition System; Speaker Verification System
Issue Date:	2004
Part of:	Nonlinear Speech Modeling and Applications
Start page:	365
End page:	369
Conference:	International School on Neural Networks, Initiated by IIASS and EMFCSC (13-18 September 2004 : Salerno, Italy)
Abstract:	We describe a system that allows an impostor to lead an audio-visual telephone conversation and sign data electronically on behalf of an authorized client. During the conversation, the audio and video of the impostor are altered so as to mimic the client. The voice of the impostor is processed and used to reproduce the voice of the authorized client. Speech segments obtained from the client's recordings are used to synthesize new sentences that the client never pronounced. On the visual side, the impostor's talking face is detected, and facial features are extracted and used to animate a synthetic talking face. The texture of the impersonated face is mapped onto the talking head and coded for transmission over the phone, along with the synthesized voice. Audio-visual coding and synthesis are realized by indexing in a memory containing audio-visual sequences. Stochastic models (coupled HMM) of characteristic segments are used to drive the memory search.
URI:	https://scholarhub.balamand.edu.lb/handle/uob/409
Type:	Conference Paper
Appears in Collections:	Department of Computer Engineering
