Summary: | The most promising approaches for surface electromyography (EMG) based speech interfaces commonly focus on the tongue muscles. Despite interesting results in small-vocabulary tasks, it is still unclear which articulatory gestures these sensors are actually detecting. To address this question, we propose a novel method, based on synchronous acquisition of surface EMG and Ultrasound Imaging (US) of the tongue, to assess the applicability of EMG to tongue gesture detection. The US image sequences provide data on tongue movement over time, which serves as the basis for the EMG analysis. Using this multimodal setup, we recorded a corpus that covers several tongue transitions (e.g. back to front) in different contexts. Evaluated against the annotated tongue movement data, the EMG analysis shows that tongue transitions can be detected with the EMG sensors, although results vary with sensor positioning and across speakers, and false-positive rates can be high.
|