Classification of Brazilian Accents Using Deep Neural Networks

Wagner A. Tostes; Francisco A. Boldt; Karin S. Komati; Filipe Mutz

doi:10.20906/sbai.v1i1.2768

Wagner A. Tostes, Programa de Pós-Graduação em Computação Aplicada (PPComp), Instituto Federal do Espírito Santo (IFES), Campus Serra / VixTeam Consultoria e Sistemas SA, ES
Francisco A. Boldt, Programa de Pós-Graduação em Computação Aplicada (PPComp), Instituto Federal do Espírito Santo (IFES), Campus Serra, ES
Karin S. Komati, Programa de Pós-Graduação em Computação Aplicada (PPComp), Instituto Federal do Espírito Santo (IFES), Campus Serra, ES
Filipe Mutz, Programa de Pós-Graduação em Computação Aplicada (PPComp), Instituto Federal do Espírito Santo (IFES), Campus Serra, ES

Keywords: Accent Recognition

Abstract

Automatic accent classification has several potential applications, including user identification and authentication, forensic investigation tools, and the selection of specialized models in text-to-speech and speech-to-text systems. In this work, six artificial neural network architectures were evaluated on the problem of classifying Brazilian accents, and their performance was compared with that of the GMM-UBM, GMM-SVM and iVector methods. Experimental results show that 5 of the 6 architectures achieve higher accuracy, precision and recall than these three methods. The best architecture reached 90% accuracy, with precision, recall and F1-score of 0.92, 0.84 and 0.87, respectively.
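
The four reported metrics can be illustrated with a minimal Python sketch. It assumes macro averaging over classes, which the abstract does not specify. The function name `macro_metrics`, the class labels "A" and "B", and the toy predictions are hypothetical and are not taken from the paper's data.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with two hypothetical accent classes (made-up labels).
y_true = ["A", "A", "A", "B", "B", "B", "B", "B"]
y_pred = ["A", "A", "B", "B", "B", "B", "B", "B"]
m = macro_metrics(y_true, y_pred)
print(m)
```

Under macro averaging, the F1-score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.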