Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis

  • Milton O. Sarria-Paja, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec
  • Tiago H. Falk, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec
Keywords: Whispered speech, gender detection, speaker verification, instantaneous frequency, vocal effort classification, modulation spectrum.

Abstract

Today, automated speech-enabled tools are increasingly used in everyday environments. This mobility has created new challenges for developers, who now face input speech of varying styles (e.g., whispered) that is corrupted by different noise sources. In this paper, special emphasis is placed on whispered speech, an underexplored yet burgeoning area due to the rapid proliferation of smartphones around the world. More specifically, this paper explores the performance boundaries achievable with whispered speech for a speaker verification task, in both matched and mismatched train/test conditions. Several strategies are investigated to improve performance in the mismatched scenario, as well as in situations involving ambient noise. Our results agree with previously reported studies in adjacent areas: significant gains could be obtained by training speaker models with both naturally voiced and whispered speech data. Moreover, additional gains could be achieved with speaking-style- and gender-dependent systems. Overall, speaker verification performance in line with that obtained with naturally voiced speech could be attained for whispered speech once specific strategies were put in place. In particular, feature fusion proved to be an important strategy for practical applications in both clean and noisy conditions.

Author Biographies

Milton O. Sarria-Paja, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

PhD student

Tiago H. Falk, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec
Assistant Professor, INRS-EMT
Director, MuSAE Lab
Published
2015-12-15
How to Cite
Sarria-Paja MO, Falk TH. Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis. Canadian Acoustics [Internet]. 2015 Dec. 15 [cited 2019 Sep. 17];43(4):31-5. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2670
Section
Article - Speech Sciences