Familiar talker advantages in formant-based and concatenative synthetic speech

Authors

  • Jacqueline Jones Dept. of Linguistics, University of Calgary, 2500 University Drive NW, AB T2N 1N4, Canada

Keywords:

American English, Bisyllabic words, Harvard, Synthetic speech, Text to speech, Training phase

Abstract

Access to synthetic speech technology has never been easier than it is today. Home computers come bundled with text-to-speech software, as do some eReaders and smartphones. The technology has come a long way since Stephen Hawking's recognizable DECtalk voice in the late 1980s. Four sets of stimuli were created for this research. The threshold stimuli consisted of 70 pre-recorded spondees (bisyllabic words with equal stress on both syllables) produced by a native speaker of American English. The Training, Testing, and Post-Test stimuli consisted of pre-recorded sets of Harvard Sentences produced by synthetic speech. The training phase consisted of 60 sentences produced by a synthetic speaker. Groups 1 and 2 trained with the concatenative voice Eric, and groups 3 and 4 trained with the formant voice Wheatley. Participants listened to each sentence a single time and were asked to transcribe what they heard.

Published

2012-09-01

How to Cite

1.
Jones J. Familiar talker advantages in formant-based and concatenative synthetic speech. Canadian Acoustics [Internet]. 2012 Sep. 1 [cited 2026 May 12];40(3):26-7. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2522

Section

Proceedings of the Acoustics Week in Canada