Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis

Authors

  • Milton O. Sarria-Paja Institut National de la Recherche Scientifique, Centre EMT, University of Quebec
  • Tiago H. Falk Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Keywords:

Whispered speech, gender detection, speaker verification, instantaneous frequency, vocal effort classification, modulation spectrum.

Abstract

Today, automated speech-enabled tools are increasingly being used in everyday environments. This mobility has created new challenges for developers, who are now faced with input speech of varying styles (e.g., whispered) and corrupted by different noise sources. In this paper, special emphasis is placed on whispered speech, an underexplored yet burgeoning area due to the rapid proliferation of smartphones around the world. More specifically, this paper explores the performance boundaries achievable with whispered speech for a speaker verification task, in both matched and mismatched train/test conditions. Several strategies are investigated to improve performance in the mismatched scenario, as well as in situations involving ambient noise. Our results agree with previously reported studies in adjacent areas: significant gains could be obtained by training speaker models with both naturally voiced and whispered speech data. Moreover, additional gains could be achieved with speaking-style-dependent and gender-dependent systems. Overall, speaker verification performance in line with that obtained with naturally voiced speech could be attained for whispered speech once specific strategies were put in place. In particular, feature fusion proved to be an important strategy for practical applications in both clean and noisy conditions.

Author Biographies

Milton O. Sarria-Paja, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

PhD student

Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Tiago H. Falk, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Assistant Professor, INRS-EMT
Director, MuSAE Lab

Published

2015-12-15

How to Cite

1. Sarria-Paja MO, Falk TH. Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis. Canadian Acoustics [Internet]. 2015 Dec. 15 [cited 2024 Mar. 28];43(4):31-45. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2670

Section

Article - Speech Sciences