Fast Markov Blanket Discovery Without Causal Sufficiency

  • Pedro V. B. Jeronymo University of São Paulo
  • Carlos D. Maciel University of São Paulo
Keywords: Big data, Causal discovery, Feature selection, Markov blanket


Faster feature selection algorithms become a necessity as Big Data dictates the zeitgeist. An important class of feature selectors is Markov Blanket (MB) learning algorithms: Causal Discovery algorithms that learn the local causal structure of a target variable. A common assumption in their theoretical basis, yet one often violated in practice, is causal sufficiency: the requirement that all common causes of the measured variables in the dataset are also in the dataset. Recently, Yu et al. (2018) proposed the M3B algorithm, the first to directly learn the MB without demanding causal sufficiency. The main drawback of M3B is its time inefficiency, which makes it intractable for high-dimensional inputs. In this paper, we derive the Fast Markov Blanket Discovery Algorithm (FMMB). Empirical results comparing FMMB to M3B on the structural learning task show that FMMB outperforms M3B in time efficiency while preserving structural accuracy. Five real-world datasets were used to contrast both algorithms as feature selectors. With Naive Bayes (NB) and Support Vector Machine (SVM) classifiers, FMMB achieved competitive results. This method mitigates the curse of dimensionality and inspires the development of local-to-global algorithms.