Familiar talker advantages in formant-based and concatenative synthetic speech

Authors

  • Jacqueline Jones, Dept. of Linguistics, University of Calgary, 2500 University Drive NW, AB T2N 1N4, Canada

Keywords:

American English, Bisyllabic words, Harvard, Synthetic speech, Text to speech, Training phase

Abstract

Access to synthetic speech technology has never been easier than it is today. Home computers come bundled with text-to-speech software, as do some eReaders and smartphones. The technology has come a long way since Stephen Hawking's recognizable DECtalk voice in the late 1980s. Four sets of stimuli were created for this research. The threshold stimuli consisted of 70 pre-recorded spondees (bisyllabic words with equal stress on both syllables) produced by a native speaker of American English. The Training, Testing, and Post-Test stimuli consisted of pre-recorded sets of Harvard Sentences produced by synthetic speech. The training phase consisted of 60 sentences produced by a synthetic speaker. Groups 1 and 2 trained with the concatenative voice Eric, and groups 3 and 4 trained with the formant voice Wheatley. The participants listened to each sentence a single time and were asked to transcribe what they heard.

Published

2012-09-01

How to Cite

1. Jones J. Familiar talker advantages in formant-based and concatenative synthetic speech. Canadian Acoustics [Internet]. 2012 Sep. 1 [cited 2024 Nov. 3];40(3):26-7. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2522

Section

Proceedings of the Acoustics Week in Canada