CPL - Chalmers Publication Library

On the Selection and Classification of Features for Speaker Recognition

Guillermo Garcia (Department of Signals and Systems, Communication Systems)
Göteborg: Chalmers University of Technology, 2010. ISBN: 978-91-7385-452-8. 160 pp.
[Doctoral thesis]

Speaker recognition is the process of automatically recognizing who is speaking based on information provided by speech signals. Speaker recognition still has many unsolved problems: although each person has a unique voice that distinguishes him or her from other people, recognizing who is talking based only on speech is not an easy task. In this thesis, we propose methods to improve the performance of the speaker recognizer and address issues related to performance measurement in speaker recognition systems. The thesis consists of three main research parts focusing on the problems of feature extraction, speaker modeling and performance evaluation of speaker recognition systems. In the feature extraction part, we focus on the extraction of phase information features and features inspired by the physiological functions of the brain in order to improve the performance of speaker recognition systems. Then, we present an information-theoretic method to compute the amount of information that a feature set can contain about the speaker. In the speaker modeling part, we develop a step descent algorithm that can be used as an alternative to the Expectation Maximization (EM) algorithm to tackle the convergence problems. Moreover, we discuss the estimation of the speaker model parameters using Bayes estimation as a solution to improve the performance and reduce the mismatch between training and evaluation. We also propose a modeling approach based on discriminative weights with complexity similar to that of the conventional modeling technique used for speaker identification systems. In the performance evaluation part, we propose a statistical method based on the log-likelihood from a set of speaker test samples to estimate the probability of error when the number of available tests is limited.

Keywords: Bayes procedures, estimation, speaker recognition, discriminative modeling, feature extraction, mutual information, phase information.



This record was created 2010-10-25. Last modified 2013-09-25.
CPL Pubid: 128071

 

Departments (Chalmers)

Department of Signals and Systems, Communication Systems

Subject areas

Signal processing

Chalmers infrastructure

Examination

Date: 2010-11-19
Time: 10:00
Room: EE-salen
Opponent: Professor Chin-Hui Lee

Part of series

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie 3133