A Spectral-Temporal Suppression Model for Speech Recognition
Abstract
Speech recognition systems, however heterogeneous in their conceptions and schemes, share at least one basic feature: the inclusion of a vocoder-type front-end. While many of the early, and some of the contemporary, systems adopted a pragmatic design for their front-end filter bank, there were some efforts (e.g., Chistovich et al., 1975; Searle et al., 1979) toward providing the recognizer with an input stage that was modeled after the human ear. The motivation for such a design was the desire to optimize the recognition process from the very first stage on. However, work by auditory physiologists on auditory nerve responses to speech (Young and Sachs, 1979; Delgutte, 1980) signaled a welcome convergence of interests by two groups of scientists on the problem of speech processing in the auditory system. More recent work by several investigators, some of which is included in the present symposium, has been directed toward designing recognizer front-ends that resemble the ear more and more closely, and toward examining the effects of model parameter modifications on recognition performance.
Computational models of the auditory system fall into two major classes, depending on whether the calculations are performed in the time domain or in the spectral domain. The advantage of time-domain algorithms lies mainly in their speed, whereas spectrally based algorithms may more closely approximate the actual auditory processes because they are able to deal more directly with non-linear filtering operations. The present model is spectral in the sense that the filtering computations are executed in the frequency domain.
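As a rough illustration of what executing the filtering computations in the frequency domain can mean, the following Python sketch transforms a signal to its spectrum, weights each frequency bin by a band gain, and transforms back. It is not the model described in this paper: the rectangular band shape, the function names (`dft`, `idft`, `band_gain`, `filter_in_frequency_domain`), and the test signal are all assumptions chosen for illustration, and no non-linear or suppression stage is included.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def band_gain(k, n, lo, hi):
    """Hypothetical rectangular band: pass bins lo..hi and their mirror images."""
    mirrored = min(k, n - k)  # fold negative-frequency bins onto positive ones
    return 1.0 if lo <= mirrored <= hi else 0.0


def filter_in_frequency_domain(x, lo, hi):
    """Filter x by weighting its spectrum, then return the real time signal."""
    n = len(x)
    spectrum = dft(x)
    weighted = [spectrum[k] * band_gain(k, n, lo, hi) for k in range(n)]
    return [value.real for value in idft(weighted)]


# Assumed test signal: two sinusoids at bins 3 and 12 of a 64-sample frame.
n = 64
low = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
high = [math.sin(2 * math.pi * 12 * t / n) for t in range(n)]
signal = [a + b for a, b in zip(low, high)]

# A band covering bins 2..4 keeps the low component and rejects the high one.
band_out = filter_in_frequency_domain(signal, 2, 4)
```

In a sketch of this kind, a bank of channels would be obtained by calling `filter_in_frequency_domain` with a different band for each channel. Because the gain is applied bin by bin, it could in principle be made to depend on the spectrum itself, which is one way to read the remark that spectral algorithms deal more directly with non-linear filtering.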
Licence
Author Licensing Addendum
This Licensing Addendum ("Addendum") is entered into between the undersigned Author(s) and Canadian Acoustics journal published by the Canadian Acoustical Association (hereinafter referred to as the "Publisher"). The Author(s) and the Publisher agree as follows:
- Retained Rights: The Author(s) retain(s) the following rights:
  - The right to reproduce, distribute, and publicly display the Work on the Author's personal website or the website of the Author's institution.
  - The right to use the Work in the Author's teaching activities and presentations.
  - The right to include the Work in a compilation for the Author's personal use, not for sale.
- Grant of License: The Author(s) grant(s) to the Publisher a worldwide exclusive license to publish, reproduce, distribute, and display the Work in Canadian Acoustics and any other formats and media deemed appropriate by the Publisher.
- Attribution: The Publisher agrees to include proper attribution to the Author(s) in all publications and reproductions of the Work.
- No Conflict: This Addendum is intended to be in harmony with, and not in conflict with, the terms and conditions of the original agreement entered into between the Author(s) and the Publisher.
- Copyright Clause: Copyright on articles is held by the Author(s). The corresponding Author has the right to grant on behalf of all Authors and does grant on behalf of all Authors, a worldwide exclusive license to the Publisher and its licensees in perpetuity, in all forms, formats, and media (whether known now or created in the future), including but not limited to the rights to publish, reproduce, distribute, display, store, translate, create adaptations, reprints, include within collections, and create summaries, extracts, and/or abstracts of the Contribution.