Familiar talker advantages in formant-based and concatenative synthetic speech
Keywords: American English, Bisyllabic words, Harvard, Synthetic speech, Text to speech, Training phase
Abstract: Access to synthetic speech technology has never been easier than it is today. Home computers come bundled with text-to-speech software, as do some eReaders and smartphones. The technology has come a long way since Stephen Hawking's recognizable DECtalk voice in the late 1980s. Four sets of stimuli were created for this research. The threshold stimuli consisted of 70 pre-recorded spondees (bisyllabic words with equal stress on both syllables) produced by a native speaker of American English. The Training, Testing, and Post-Test stimuli consisted of pre-recorded sets of Harvard Sentences produced by synthetic speech. The training phase consisted of 60 sentences produced by a synthetic speaker. Groups 1 and 2 trained with the concatenative voice Eric, and groups 3 and 4 trained with the formant voice Wheatley. Participants listened to each sentence a single time and were asked to transcribe what they heard.
Copyright on articles is held by the author(s). The corresponding author has the right to grant on behalf of all authors, and does grant on behalf of all authors, a worldwide exclusive licence (or non-exclusive licence for government employees) to the Publishers and their licensees in perpetuity, in all forms, formats and media (whether known now or created in the future):
i) to publish, reproduce, distribute, display and store the Contribution;
ii) to translate the Contribution into other languages, create adaptations, reprints, include within collections and create summaries, extracts and/or abstracts of the Contribution;
iii) to exploit all subsidiary rights in the Contribution;
iv) to provide the inclusion of electronic links from the Contribution to third party material wherever it may be located;
v) to license any third party to do any or all of the above.