Classification of Brazilian Accents Using Deep Neural Networks
Wagner A. Tostes
Programa de Pós-Graduação em Computação Aplicada (PPComp) Instituto Federal do Espírito Santo (IFES), Campus Serra / VixTeam Consultoria e Sistemas SA, ES
Francisco A. Boldt
Programa de Pós-Graduação em Computação Aplicada (PPComp) Instituto Federal do Espírito Santo (IFES), Campus Serra, ES
Karin S. Komati
Programa de Pós-Graduação em Computação Aplicada (PPComp) Instituto Federal do Espírito Santo (IFES), Campus Serra, ES
Filipe Mutz
Programa de Pós-Graduação em Computação Aplicada (PPComp) Instituto Federal do Espírito Santo (IFES), Campus Serra, ES
The automatic classification of accents has several potential applications, such as user identification and authentication, forensic investigation tools, and the selection of specialized models in text-to-speech and speech-to-text systems. In this work, six artificial neural network architectures were evaluated on the accent classification problem, and their performance was compared with that of the GMM-UBM, GMM-SVM, and iVector methods. Experimental results show that 5 of the 6 architectures achieve higher accuracy, precision, and recall than these earlier methods. The best architecture reached 90\% accuracy, with precision, recall, and F1-score of 0.92, 0.84, and 0.87, respectively.