Hierarchical classification and system combination for automatically identifying physiological and neuromuscular laryngeal pathologies


Bibliographic Details
Main Author: Cordeiro, Hugo (author)
Other Authors: Fonseca, José (author), Guimarães, Isabel (author), Meneses, Carlos (author)
Format: article
Language: por
Published: 2022
Subjects:
Online Access: http://hdl.handle.net/10400.26/39986
Country: Portugal
Oai: oai:comum.rcaap.pt:10400.26/39986
Description
Summary: Objectives. Speech signal processing techniques have provided several contributions to pathologic voice identification, in which healthy and unhealthy voice samples are evaluated. A less common approach is to identify laryngeal pathologies, for which the use of a noninvasive method for pathologic voice identification is an important step forward for preliminary diagnosis. In this study, a hierarchical classifier and a combination of systems are used to improve the accuracy of a three-class identification system (healthy, physiological larynx pathologies, and neuromuscular larynx pathologies). Method. Three main subject classes were considered: subjects with physiological larynx pathologies (vocal fold nodules and edemas: 59 samples), subjects with neuromuscular larynx pathologies (unilateral vocal fold paralysis: 59 samples), and healthy subjects (36 samples). The variables used in this study were a speech task (sustained vowel /a/ or continuous reading speech), features with or without perceptual information, and features with or without direct information about formants evaluated using single classifiers. A hierarchical classification system was designed based on this information. Results. The resulting system combines an analysis of continuous speech by way of the commonly used sustained vowel /a/ to obtain spectral and perceptual speech features. It achieved an accuracy of 84.4%, which represents an improvement of approximately 9% compared with the stand-alone approach. For pathologic voice identification, the accuracy obtained was 98.7%, and the identification accuracy for the two pathology classes was 81.3%. Conclusions. Hierarchical classification and system combination create significant benefits and introduce a modular approach to the classification of larynx pathologies.
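Illustrative note: the hierarchical idea in the summary can be sketched as a two-stage decision, first healthy versus pathological, then physiological versus neuromuscular for samples flagged as pathological. The following Python sketch is only one plausible reading of the summary, not the authors' implementation. The nearest-centroid classifier, the class names `NearestCentroid` and `HierarchicalVoiceClassifier`, and the two-dimensional toy feature vectors are all assumptions made for illustration; the record does not specify the classifiers used, and the study draws on different speech tasks and feature sets that this sketch does not model.

```python
from statistics import fmean


class NearestCentroid:
    """Toy stand-in classifier (assumption): predicts the label of the closest class mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all training vectors with this label.
            self.centroids[label] = [fmean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda lab: dist2(self.centroids[lab]))


class HierarchicalVoiceClassifier:
    """Two-stage sketch: healthy vs. pathological, then pathology type."""

    def fit(self, X, y):
        # Stage 1 is trained on all samples with the two pathology classes merged.
        y_stage1 = ["healthy" if t == "healthy" else "pathological" for t in y]
        self.stage1 = NearestCentroid().fit(X, y_stage1)
        # Stage 2 is trained on pathological samples only.
        X_path = [x for x, t in zip(X, y) if t != "healthy"]
        y_path = [t for t in y if t != "healthy"]
        self.stage2 = NearestCentroid().fit(X_path, y_path)
        return self

    def predict_one(self, x):
        if self.stage1.predict_one(x) == "healthy":
            return "healthy"
        return self.stage2.predict_one(x)


# Hypothetical toy features; real systems would use spectral and perceptual features.
X = [[0.0, 0.1], [0.2, -0.1], [5.0, 0.0], [5.2, 0.2], [5.0, 5.0], [4.8, 5.2]]
y = ["healthy", "healthy", "physiological", "physiological",
     "neuromuscular", "neuromuscular"]

clf = HierarchicalVoiceClassifier().fit(X, y)
print(clf.predict_one([0.1, 0.0]))
print(clf.predict_one([5.1, 0.2]))
print(clf.predict_one([4.8, 5.1]))
```

Because each stage is a separate model, either stage could in principle be swapped for a different classifier or feature set without retraining the other, which is the kind of modularity the summary's conclusions refer to.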