Computer Assisted Segmentation of Tongue Ultrasound and Lip Videos

Authors

  • Pertti Palo, Indiana University, Department of Speech, Language and Hearing Sciences

Abstract

Compared to segmenting acoustic speech data, time-domain analysis of tongue ultrasound videos and lip videos is challenging and lacks widely accepted tools. The most widely used method is to select time points for articulatory analysis on the basis of acoustic segmentation. In acoustic analysis, by contrast, the spectrogram provides an easy way of analysing the time- and frequency-domain characteristics of the speech signal at a glance.

In an effort to address this discrepancy, Author (2019, 2020) provided an analysis tool for direct phonetic analysis of tongue ultrasound data. The tool applies the Euclidean distance metric to the whole ultrasound image. It makes general change in the data easy to visualise (see Figure) and provides a good basis for segmentation.
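A minimal sketch of a whole-image Euclidean distance measure may help make the idea concrete. It is an illustration only, not the published tool: the function name `frame_distance`, the `step` parameter, and the choice to compare each frame with the frame `step` positions later are assumptions made here, since the abstract does not specify which frames are compared.

```python
import numpy as np


def frame_distance(frames, step=1):
    """Euclidean distance between whole frames that lie `step` frames apart.

    `frames` is assumed to be an array of shape (n_frames, height, width).
    Both the name and the frame-pairing scheme are hypothetical.
    """
    frames = np.asarray(frames, dtype=float)
    # Flatten every frame into one long vector of pixel values.
    flat = frames.reshape(frames.shape[0], -1)
    # Pixel-wise difference between frame i and frame i + step.
    diff = flat[step:] - flat[:-step]
    # L2 norm of each difference vector: one distance value per frame pair.
    return np.linalg.norm(diff, axis=1)
```

Plotted against time, the resulting one-dimensional curve is large where the image changes quickly and small where it is stable, which is the kind of overview the abstract describes as a basis for segmentation.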

This study extends the tool to the simultaneous analysis of synchronised tongue ultrasound and lip videos. A data set from a single speaker is analysed with the new method as a proof of concept.
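One way such a simultaneous analysis could be set up is sketched below. Everything in it is an assumption for illustration rather than the method of the study: the names `change_curve` and `align_curves`, the use of consecutive-frame distances, the midpoint timestamps, and the linear interpolation onto a shared time axis are all hypothetical. The sketch also assumes both videos start at the same instant, which is what synchronisation is taken to mean here.

```python
import numpy as np


def change_curve(frames, fps, step=1):
    """Return (times, distances) for a video of shape (n_frames, height, width)."""
    flat = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    # Whole-image Euclidean distance between frames `step` apart.
    dist = np.linalg.norm(flat[step:] - flat[:-step], axis=1)
    # Timestamp each distance at the midpoint between the two compared frames.
    times = (np.arange(len(dist)) + step / 2.0) / fps
    return times, dist


def align_curves(tongue, tongue_fps, lips, lips_fps, n_points=200):
    """Resample both change curves onto one time axis (hypothetical helper)."""
    t_times, t_dist = change_curve(tongue, tongue_fps)
    l_times, l_dist = change_curve(lips, lips_fps)
    # Use only the interval covered by both recordings.
    start = max(t_times[0], l_times[0])
    end = min(t_times[-1], l_times[-1])
    common = np.linspace(start, end, n_points)
    # Linear interpolation copes with the two videos having different frame rates.
    tongue_on_common = np.interp(common, t_times, t_dist)
    lips_on_common = np.interp(common, l_times, l_dist)
    return common, tongue_on_common, lips_on_common
```

With both curves on the same time axis, they can be plotted one above the other so that tongue and lip movement are inspected together.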

Author Biography

Pertti Palo, Indiana University, Department of Speech, Language and Hearing Sciences

Post-doc at Indiana University, Department of Speech, Language and Hearing Sciences. PhD in phonetics from Queen Margaret University, Edinburgh, Scotland.

Published

2021-08-30

How to Cite

1.
Palo P. Computer Assisted Segmentation of Tongue Ultrasound and Lip Videos. Canadian Acoustics [Internet]. 2021 Aug 30 [cited 2024 Aug 24];49(3):44-5. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/3912

Section

Proceedings of the Acoustics Week in Canada