Title: Channel adaptive speaker recognition
Authors: Greige, Nayla; Chammas, Edgar
Advisors: Mokbel, Chafic 
Subjects: Speech processing systems
Issue Date: 2012
Numerous applications today rely on speaker recognition techniques to authenticate or verify a user who is trying to access a service or system. State-of-the-art unsupervised, text-independent speaker recognition systems are still far from perfect, because channel variations and disturbances often alter the speech data that the system collects from speakers. Most of the time, the system does not have enough data to make a good statistical decision [1]. For this reason, critical applications that require a high level of security cannot rely on such systems alone to authenticate users. They can nevertheless improve security [2], for example when used as a second factor in a two-factor authentication mechanism (e.g., in addition to checking a user's password). The work in this project represents a baseline for an automatic speaker recognition system. It establishes the basic requirements for a state-of-the-art system: speech feature extraction, feature matching, and decision making. State-of-the-art text-independent speaker verification systems adopt one of two approaches, GMM (Gaussian Mixture Model) or SVM (Support Vector Machine), or sometimes a hybrid of both. The work in this document is based on GMM modeling, specifically the GMM-UBM verification system [3]. The system uses cepstral normalization to reduce the effects of channel noise. It uses the classical MAP (maximum a posteriori) adaptation technique, which derives each speaker's GMM parameters from the UBM (Universal Background Model) parameters. The UBM was trained on data from the NIST 2004 speaker recognition evaluation, while the test speakers were taken from the NIST 2008 speaker recognition evaluation. The system was tested on female speakers and achieved an equal error rate (EER) of 36%.
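For readers unfamiliar with the equal error rate reported above, the following is a minimal sketch of how an EER can be estimated from verification scores. It is not taken from the project itself: the function name `equal_error_rate`, the threshold sweep over observed scores, and the toy score lists are all illustrative assumptions, and a real evaluation would typically interpolate along the DET curve.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Hypothetical helper for illustration; a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = None, None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: a genuine-speaker trial scoring below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scoring at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (assumed values, not data from the project).
targets = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.5, 3.5]
print(equal_error_rate(targets, impostors))  # 0.25
```

At the EER operating point the false acceptance and false rejection rates are equal, so an EER of 36% means roughly 36% of impostor trials are accepted and 36% of genuine trials are rejected at that threshold.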
Includes bibliographical references (p. 59-65).

Supervised by Dr. Chafic Mokbel.
Rights: This object is protected by copyright, and is made available here for research and educational purposes. Permission to reuse, publish, or reproduce the object beyond the personal and educational use exceptions must be obtained from the copyright holder.
Type: Project
Appears in Collections: UOB Theses and Projects
