Using Auditory Models for Speaker Normalization in Speech Recognition

Anthony Bladon

Abstract

Auditorily-transformed versions of the speech spectrum may well be a useful way of reducing the apparently nonuniform physical differences between speakers. A speaker normalization technique of this kind is however justified to different degrees by different kinds of speech event. Does this presuppose a need for higher-level (phonetic class) information at the acoustic level in speaker-independent ASR?

"It is obvious from our experiment that the unqualified assumption does not hold - auditory models used as speech recognition front ends will not consistently improve performance."

Blomberg et al.'s (1984) ominous words are ones which this symposium ought to take seriously to heart. They conflict with our initial theoretical expectations. This paper will not attempt to investigate what reasons lie behind the inconsistent results which some authors have found. Rather, we will focus on an aspect of the speech recognition task where the prognosis for auditory modelling promises to bear some fruit, namely, speaker differences (in speaker-independent speech recognition).

Published

2022-12-03

How to Cite

Bladon A. Using Auditory Models for Speaker Normalization in Speech Recognition. Canadian Acoustics [Internet]. 2022 Dec. 3 [cited 2026 Jul. 16];14(3 bis):16-7. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/3501

Issue

Vol. 14 No. 3 bis (1986): Montreal Symposium On Speech Recognition

Section

Proceedings of the Acoustics Week in Canada

License

Author Licensing Addendum

This Licensing Addendum ("Addendum") is entered into between the undersigned Author(s) and Canadian Acoustics journal published by the Canadian Acoustical Association (hereinafter referred to as the "Publisher"). The Author(s) and the Publisher agree as follows:

Retained Rights: The Author(s) retain(s) the following rights:
- The right to reproduce, distribute, and publicly display the Work on the Author's personal website or the website of the Author's institution.
- The right to use the Work in the Author's teaching activities and presentations.
- The right to include the Work in a compilation for the Author's personal use, not for sale.
Grant of License: The Author(s) grant(s) to the Publisher a worldwide exclusive license to publish, reproduce, distribute, and display the Work in Canadian Acoustics and any other formats and media deemed appropriate by the Publisher.
Attribution: The Publisher agrees to include proper attribution to the Author(s) in all publications and reproductions of the Work.
No Conflict: This Addendum is intended to be in harmony with, and not in conflict with, the terms and conditions of the original agreement entered into between the Author(s) and the Publisher.
Copyright Clause: Copyright on articles is held by the Author(s). The corresponding Author has the right to grant on behalf of all Authors and does grant on behalf of all Authors, a worldwide exclusive license to the Publisher and its licensees in perpetuity, in all forms, formats, and media (whether known now or created in the future), including but not limited to the rights to publish, reproduce, distribute, display, store, translate, create adaptations, reprints, include within collections, and create summaries, extracts, and/or abstracts of the Contribution.
