Using Auditory Models for Speaker Normalization in Speech Recognition

Authors

  • Anthony Bladon

Abstract

Auditorily-transformed versions of the speech spectrum may well be a useful way of reducing the apparently nonuniform physical differences between speakers. A speaker normalization technique of this kind is however justified to different degrees by different kinds of speech event. Does this presuppose a need for higher-level (phonetic class) information at the acoustic level in speaker-independent ASR?

"It is obvious from our experiment that the unqualified assumption does not hold - auditory models used as speech recognition front ends will not consistently improve performance."

Blomberg et al.'s (1984) ominous words are ones which this symposium ought to take seriously to heart. They conflict with our initial theoretical expectations. This paper will not attempt to investigate what reasons lie behind the inconsistent results which some authors have found. Rather, we will focus on an aspect of the speech recognition task where the prognosis for auditory modelling promises to bear some fruit, namely, speaker differences (in speaker-independent speech recognition).

Published

2022-12-03

How to Cite

1.
Bladon A. Using Auditory Models for Speaker Normalization in Speech Recognition. Canadian Acoustics [Internet]. 2022 Dec 3 [cited 2024 Nov 21];14(3 bis):16-7. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/3501

Section

Proceedings of the Acoustics Week in Canada