Finding schwa: Comparing the results of an automatic aligner with human judgments when identifying schwa in a corpus of spoken French
Keywords: Linguistics, Empirical data, Human judgments, Labor intensive, Linguistic data, Natural languages, Test research
Abstract
This article compares the results of an automatic aligner with human judgments when identifying schwa in a corpus of spoken French. The value of working with natural language corpora lies in the ability to collect large volumes of empirical data with which to test research hypotheses. The challenge is to generate these data quickly and accurately, because accumulating the linguistic data required to test and evaluate hypotheses can be time-consuming and labor-intensive. All data were systematically coded for the presence or absence of schwa by trained researchers. The data were also time-aligned at both the word and phone levels by a forced aligner. The results of the two coding methods were compared statistically to determine their level of agreement. Results show a significant correlation between the two methods and a high likelihood of overall agreement. Possible effects of dialect or phonetic context were investigated using a two-way, between-subjects analysis of variance.
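As a rough illustration of what comparing two binary codings can involve, the sketch below computes raw percent agreement and the phi coefficient (the Pearson correlation for two binary variables) from paired presence/absence labels. The abstract does not say which statistics were used, so the choice of phi is an assumption, as are the function name `agreement_stats` and the toy labels, which are invented for the example and are not data from the study.

```python
from math import sqrt


def agreement_stats(human, aligner):
    """Return (percent agreement, phi coefficient) for two binary codings.

    Each list holds 1 where schwa was coded as present and 0 where absent.
    This is an illustrative sketch, not the procedure used in the article.
    """
    if len(human) != len(aligner) or not human:
        raise ValueError("codings must be non-empty and of equal length")

    # 2x2 contingency counts:
    # a = both code schwa present, d = both code schwa absent,
    # b = human only, c = aligner only.
    pairs = list(zip(human, aligner))
    a = sum(1 for h, m in pairs if h and m)
    b = sum(1 for h, m in pairs if h and not m)
    c = sum(1 for h, m in pairs if not h and m)
    d = sum(1 for h, m in pairs if not h and not m)

    percent = (a + d) / len(pairs)

    # Phi is undefined when either coder uses only one category.
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    phi = (a * d - b * c) / denom if denom else float("nan")
    return percent, phi


# Invented toy labels for eight tokens (hypothetical, not study data).
human = [1, 1, 0, 1, 0, 0, 1, 0]
aligner = [1, 1, 0, 0, 0, 0, 1, 1]
percent, phi = agreement_stats(human, aligner)
print(f"percent agreement = {percent:.2f}, phi = {phi:.2f}")
```

Percent agreement alone can look high when one category dominates, which is one reason a correlation-style measure is often reported alongside it.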
Copyright on articles is held by the author(s). The corresponding author has the right to grant on behalf of all authors, and does grant on behalf of all authors, a worldwide exclusive licence (or non-exclusive licence for government employees) to the Publishers and its licensees in perpetuity, in all forms, formats and media (whether known now or created in the future):
i) to publish, reproduce, distribute, display and store the Contribution;
ii) to translate the Contribution into other languages, create adaptations, reprints, include within collections and create summaries, extracts and/or abstracts of the Contribution;
iii) to exploit all subsidiary rights in the Contribution;
iv) to provide the inclusion of electronic links from the Contribution to third party material wherever it may be located;
v) to license any third party to do any or all of the above.