Using Auditory Models for Speaker Normalization in Speech Recognition


  • Anthony Bladon


Auditorily-transformed versions of the speech spectrum may well be a useful way of reducing the apparently nonuniform physical differences between speakers. A speaker normalization technique of this kind is, however, justified to different degrees by different kinds of speech event. Does this presuppose a need for higher-level (phonetic class) information at the acoustic level in speaker-independent ASR?

"It is obvious from our experiment that the unqualified assumption does not hold - auditory models used as speech recognition front ends will not consistently improve performance."

Blomberg et al.'s (1984) ominous words are ones which this symposium ought to take seriously to heart. They conflict with our initial theoretical expectations. This paper will not attempt to investigate what reasons lie behind the inconsistent results which some authors have found. Rather, we will focus on an aspect of the speech recognition task where the prognosis for auditory modelling promises to bear some fruit, namely, speaker differences (in speaker-independent speech recognition).




How to Cite

Bladon A. Using Auditory Models for Speaker Normalization in Speech Recognition. Canadian Acoustics [Internet]. 2022 Dec. 3 [cited 2023 Feb. 8];14(3 bis):16-7. Available from:



Proceedings of the Acoustics Week in Canada