Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis

Milton O. Sarria-Paja; Tiago H. Falk

Authors

Milton O. Sarria-Paja Institut National de la Recherche Scientifique, Centre EMT, University of Quebec
Tiago H. Falk Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Keywords:

Whispered speech, gender detection, speaker verification, instantaneous frequency, vocal effort classification, modulation spectrum.

Abstract

Today, automated speech-enabled tools are increasingly being used in everyday environments. This mobility has created new challenges for developers, who are now faced with input speech of varying styles (e.g., whispered) that is corrupted by different noise sources. In this paper, special emphasis is placed on whispered speech, an underexplored yet burgeoning area due to the rapid proliferation of smartphones around the world. More specifically, this paper explores the performance boundaries achievable with whispered speech for a speaker verification task, in both matched and mismatched train/test conditions. Several strategies are investigated to improve performance in the mismatched scenario, as well as in situations involving ambient noise. Consistent with previously reported studies in adjacent areas, our results show that significant gains could be obtained by training speaker models with both naturally voiced and whispered speech data. Moreover, additional gains could be achieved with speaking-style-dependent and gender-dependent systems. Overall, speaker verification performance in line with that obtained with naturally voiced speech could be attained for whispered speech once specific strategies were put in place. In particular, feature fusion proved to be an important strategy for practical applications in both clean and noisy conditions.

Author Biographies

Milton O. Sarria-Paja, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

PhD student

Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Tiago H. Falk, Institut National de la Recherche Scientifique, Centre EMT, University of Quebec

Assistant Professor, INRS-EMT
Director, MuSAE Lab

Published

2015-12-15

How to Cite

Sarria-Paja MO, Falk TH. Strategies to Enhance Whispered Speech Speaker Verification: A Comparative Analysis. Canadian Acoustics [Internet]. 2015 Dec. 15 [cited 2026 May 21];43(4):31-45. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2670

Issue

Vol. 43 No. 4 (2015)

Section

Article - Speech Sciences

License

Author Licensing Addendum

This Licensing Addendum ("Addendum") is entered into between the undersigned Author(s) and Canadian Acoustics journal published by the Canadian Acoustical Association (hereinafter referred to as the "Publisher"). The Author(s) and the Publisher agree as follows:

Retained Rights: The Author(s) retain(s) the following rights:
- The right to reproduce, distribute, and publicly display the Work on the Author's personal website or the website of the Author's institution.
- The right to use the Work in the Author's teaching activities and presentations.
- The right to include the Work in a compilation for the Author's personal use, not for sale.
Grant of License: The Author(s) grant(s) to the Publisher a worldwide exclusive license to publish, reproduce, distribute, and display the Work in Canadian Acoustics and any other formats and media deemed appropriate by the Publisher.
Attribution: The Publisher agrees to include proper attribution to the Author(s) in all publications and reproductions of the Work.
No Conflict: This Addendum is intended to be in harmony with, and not in conflict with, the terms and conditions of the original agreement entered into between the Author(s) and the Publisher.
Copyright Clause: Copyright on articles is held by the Author(s). The corresponding Author has the right to grant on behalf of all Authors and does grant on behalf of all Authors, a worldwide exclusive license to the Publisher and its licensees in perpetuity, in all forms, formats, and media (whether known now or created in the future), including but not limited to the rights to publish, reproduce, distribute, display, store, translate, create adaptations, reprints, include within collections, and create summaries, extracts, and/or abstracts of the Contribution.
