Machine Learning Applied to the Early Diagnosis of Leukemia Using Biomarkers

  • Fernanda T. Ferry Computer Science - UniBH | Centro Universitário de Belo Horizonte, Belo Horizonte - MG
  • Giulia Zanon Castro Graduate Program in Electrical Engineering - Universidade Federal de Minas Gerais, Belo Horizonte, MG
  • Ramon Gonçalves Pereira Computer Science - UniBH | Centro Universitário de Belo Horizonte, Belo Horizonte - MG
Keywords: Machine Learning, Leukemia, Earlier Diagnoses, Explainability, Decision Making, Complete Blood Count

Abstract

Leukemia is a rare and lethal blood cancer. Early diagnosis is one of the factors that improve a patient's chances of a good treatment outcome. Leukemia is usually best detected through image analysis exams, but these are costly and the diagnosis sometimes comes too late. This paper therefore uses the attributes of a complete blood count as input to Machine Learning algorithms to obtain earlier and cheaper diagnoses of Leukemia. We collected real exam results and built a synthetic dataset of 1000 examples, based on the distribution and limits of each attribute, to classify a patient as positive or negative for Leukemia. We tested four classifiers (Logistic Regression, Random Forest, XGBoost and SVM) to predict the sample classes. We show that it is possible to predict, with an accuracy of 96%, whether a patient is likely to have Leukemia based on their blood count.
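The synthetic-data step described in the abstract (1000 examples drawn from the distribution and limits of each blood count attribute) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' procedure: the attribute names, the means, standard deviations and limits, the choice of a bounded normal distribution, and the 50/50 class balance are all hypothetical placeholders and carry no clinical meaning.

```python
import random

# Hypothetical per-class parameters: (mean, std, lower limit, upper limit).
# Placeholder numbers for illustration only. They are NOT clinical reference
# values and NOT the values used in the paper.
ATTRIBUTES = {
    0: {  # negative class
        "wbc": (7.0, 2.0, 3.0, 12.0),
        "hemoglobin": (14.0, 1.5, 10.0, 18.0),
        "platelets": (250.0, 60.0, 120.0, 450.0),
    },
    1: {  # positive class
        "wbc": (40.0, 25.0, 1.0, 150.0),
        "hemoglobin": (9.0, 2.0, 4.0, 14.0),
        "platelets": (80.0, 50.0, 5.0, 300.0),
    },
}


def sample_bounded(rng, mean, std, low, high):
    """Draw from a normal distribution, redrawing until the value is in [low, high]."""
    while True:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            return value


def make_dataset(n_examples=1000, positive_fraction=0.5, seed=0):
    """Build a list of dict rows, one per synthetic patient, with a 0/1 label."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    n_positive = round(n_examples * positive_fraction)
    labels = [1] * n_positive + [0] * (n_examples - n_positive)
    rng.shuffle(labels)

    rows = []
    for label in labels:
        row = {
            name: sample_bounded(rng, *params)
            for name, params in ATTRIBUTES[label].items()
        }
        row["label"] = label
        rows.append(row)
    return rows


dataset = make_dataset()
```

Each row of `dataset` could then be passed to any of the four classifiers named in the abstract. Redrawing out-of-range values is one simple way to respect attribute limits; clipping to the limits is an alternative, but it piles samples up at the boundaries.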
Published
2021-10-20
Section
Articles