Feature Extraction Methods and Automatic Classification of Vocal Disorders

Rafael Alberto dos Santos; Paulo Rogério Scalassara; Wagner Endo

doi:10.20906/CBA2022/3566

Rafael Alberto dos Santos, Departamento Acadêmico da Elétrica, Universidade Tecnológica Federal do Paraná, PR
Paulo Rogério Scalassara, Departamento Acadêmico da Elétrica, Universidade Tecnológica Federal do Paraná, PR
Wagner Endo, Departamento Acadêmico da Elétrica, Universidade Tecnológica Federal do Paraná, PR

Keywords: wavelet, mel spectrogram, cepstral coefficients, support vector machine, voice, pathology

Abstract

Vocal disorders may be present when a person’s voice fails to fulfill its basic role of communication. These disorders can be detected through variations in perceptual voice parameters such as quality, pitch, and loudness. Changes in these parameters can be measured and classified automatically through acoustic analysis. In this study, we compare feature extraction techniques for the automatic classification of voice disorders using support vector machines. The techniques are based on wavelet variance, the mel spectrogram, and mel-frequency cepstral coefficients. The voice signals are sustained utterances of the vowel “a” at neutral pitch, belonging to healthy and pathological classes; the pathologies considered are vocal fold nodules and Reinke’s edema, both of which affect the vocal folds and alter the acoustic parameters of the voice signal. With wavelet variance and mel spectrogram patterns, the average classification accuracy was 84.4%, higher than the 82.2% obtained with mel-frequency cepstral coefficients.
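
To make the idea of a wavelet-variance feature concrete, the following is a minimal sketch and not the authors' implementation. It assumes a decimated Haar transform and uses the mean squared detail coefficient at each level as a simple stand-in for the wavelet variance; the abstract does not state which wavelet, transform, or estimator was used. The function names `haar_step` and `wavelet_variance_features` and the `levels` parameter are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_variance_features(signal, levels=4):
    """Mean squared Haar detail coefficient per level (assumed estimator)."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 4:
            break  # too few samples left for another meaningful level
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    return features


if __name__ == "__main__":
    # Synthetic stand-in for a sustained vowel: a 1024-sample sinusoid.
    tone = [math.sin(2 * math.pi * 8 * n / 1024) for n in range(1024)]
    print(wavelet_variance_features(tone, levels=4))
```

The resulting list, one value per decomposition level, is the kind of fixed-length vector that could be passed to a support vector machine; the mel spectrogram and cepstral features compared in the study would fill the same role.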