Toward better automatic speech recognition

Authors

  • D. O'Shaughnessy INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada
  • W. Wang INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada
  • W. Zhu INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada
  • V. Barreaud INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada
  • T. Nagarajan INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada
  • R. Muralishankar INRS-EMT, U. of Quebec, 800 de la Gauchetiere west, Montreal, Que. H5A 1K6, Canada

Keywords:

Automation, Cosine transforms, Data acquisition, Speech analysis, Statistical methods, Non-linear approach, Speech environments, Vowel recognition, Warped discrete cosine transform cepstrum (WDCTC)

Abstract

Several modelling techniques were developed to adapt to different speech environments without modifying the basic automatic speech recognition system. Statistical data mapping assumes that speech observations are generated by subsets of mutually related random sources. It is a non-linear approach and can handle variations that are not time-invariant. The warped discrete cosine transform cepstrum (WDCTC) performs better in a 5-vowel recognition and speaker identification task.

Published

2005-09-01

How to Cite

1. O’Shaughnessy D, Wang W, Zhu W, Barreaud V, Nagarajan T, Muralishankar R. Toward better automatic speech recognition. Canadian Acoustics [Internet]. 2005 Sep. 1 [cited 2025 Feb. 21];33(3):48-9. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/1739

Issue

Section

Proceedings of the Acoustics Week in Canada