Interspeech 2011 Technical Programme Sun-Ses2-O1: Firenze Fiera Congress & Exhibition Center

Skew Gaussian mixture models for speaker recognition

This paper proposes skew Gaussian mixture models (skew-GMM) for speaker recognition, together with an algorithm for training them from experimental data. In speaker identification experiments, speakers were modeled using both the familiar Gaussian mixture models (GMM) and the new skew-GMM. Each model type was evaluated with two sets of feature vectors: mel-frequency cepstral coefficients (MFCC), which are widely used in speaker recognition applications, and line spectral frequencies (LSF), which are used in many low-bit-rate speech coders but have been less successful in speech and speaker recognition. The results showed that the skew-GMM with LSF compares favorably with the GMM-MFCC pair under fair comparison conditions. They indicate that skew Gaussians are better suited to capturing the highly asymmetric shapes of the LSF distribution, so the skew-GMM with LSF offers a worthy alternative to the GMM-MFCC pair for speaker recognition.

Towards Goat Detection in Text-Dependent Speaker Verification

Orith Toledo-Ronen (IBM Research – Haifa)
David Nahamoo (IBM T.J. Watson Research Center)

We present a method that identifies speakers who are likely to have a high false-reject rate in a text-dependent speaker verification system ("goats"). The method normally uses only the enrollment data to perform this task. We begin by extracting an appropriate feature from each enrollment session. We then rank all the enrollment sessions in the system based on this feature. The lowest-ranking sessions are likely to have a high false-reject rate. We explore several features and show that the 1% lowest-ranking enrollments have a false-reject rate of up to 7.8%, compared to our system's overall rate of 2.0%. Furthermore, when a single additional verification score from the true speaker is used for ranking, the false-reject rate of the 1% lowest-ranking sessions rises to 33%.

Speaker modeling using local binary decisions

Jean-Francois Bonastre (University of Avignon)
Sierra (Advanced Technologies Application Center)
Pierre-Michel Bousquet (University of Avignon)

Accurate speaker modeling is a crucial step in any speaker-related algorithm. Many statistical speaker modeling techniques that deviate from the classical GMM/UBM approach have been proposed over the years, and they can accurately discriminate between speakers. However, many of them require the evaluation of high-dimensional feature vectors and represent a speaker with a single vector, and therefore use no temporal information. In addition, they place most emphasis on modeling the most recurrent acoustic events rather than the less frequent, speaker-discriminant information. In this paper we explain the main benefits of our recently proposed binary speaker modeling technique and demonstrate them in two applications: speaker recognition and speaker diarization. Both applications achieve near state-of-the-art results while benefiting from performing most processing in the binary space.

New Developments in Voice Biometrics for User Authentication

Voice biometrics for user authentication is a task in which the objective is to perform convenient, robust and secure authentication of speakers.
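
The ranking step described in the goat-detection abstract (extract one feature per enrollment session, rank the sessions, and treat the lowest-ranking 1% as likely goats) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features are used, so the scalar `session_scores` values, the function name `flag_likely_goats`, and the rounding rule for the cutoff are all assumptions made for illustration.

```python
import math


def flag_likely_goats(session_scores, fraction=0.01):
    """Return the ids of the lowest-ranking enrollment sessions.

    session_scores: dict mapping a session id to a scalar feature value.
    The feature itself is hypothetical; the abstract does not specify it.
    fraction: share of sessions to flag (0.01 mirrors the 1% in the abstract).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not session_scores:
        return []
    # Rank sessions in ascending order of the feature value.
    ranked = sorted(session_scores, key=session_scores.get)
    # Assumed rounding rule: always flag at least one session.
    n_flagged = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:n_flagged]
```

For instance, with 200 enrollment sessions and the default `fraction`, the function returns the two session ids with the smallest feature values. In a real system the flagged sessions would presumably be candidates for re-enrollment, but that follow-up step is not described in the abstract.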