Identifying Relevance in Help Desk System Texts Using Classical Machine Learning Techniques

Marciel Mario Degasperi; Daniel Cruz Cavalieri; Fidelis Zanetti de Castro

doi:10.20906/CBA2022/3583

Marciel Mario Degasperi, Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES
Daniel Cruz Cavalieri, Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES
Fidelis Zanetti de Castro, Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES

Keywords: Machine Learning, Natural Language Processing, Service Desk Systems, Classification

Abstract

Service desk systems accumulate a rich information base in the history of past calls, which can and should serve as a reference for handling subsequent calls. Standard search tools, such as keyword search, become impractical on large databases because queries take a long time and return results unrelated to the problem. This work investigates how well several classical classification algorithms can detect the characteristic defined here as “relevance”: the property of a text that contains knowledge that can be reused. The motivation is that non-relevant texts can be removed from the dataset early, so that more complex algorithms can then be applied to a smaller amount of information. The tests used the Naive Bayes, Adaptive Boosting, Random Forest, Stochastic Gradient Descent, Logistic Regression, Support Vector Machine, and Light Gradient Boosting Machine classifiers. The classifiers achieved accuracy below 0.8, indicating that, in this scenario, other, more efficient approaches should be used.