Markov blanket
{{Short description|Subset of variables that contains all the useful information}}
[[Image:Diagram of a Markov blanket.svg|thumb|In a Bayesian network, the Markov boundary of node A includes its parents, children and the other parents of all of its children.]]
In statistics and machine learning, when one wants to infer a random variable from a set of variables, a subset is usually enough, and the other variables are useless. Such a subset that contains all the useful information is called a Markov blanket. If a Markov blanket is minimal, meaning that no variable can be dropped from it without losing information, it is called a Markov boundary. Identifying a Markov blanket or a Markov boundary helps to extract useful features. The terms Markov blanket and Markov boundary were coined by Judea Pearl in 1988.{{cite book |last=Pearl |first=Judea |authorlink=Judea Pearl |title=Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference |publisher=Morgan Kaufmann |location=San Mateo CA |year=1988 |isbn=0-934613-73-7 |series=Representation and Reasoning Series |url-access=registration |url=https://archive.org/details/probabilisticrea00pear }} A Markov blanket can be constituted by a set of Markov chains.
Markov blanket
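A Markov blanket of a random variable <math>Y</math> in a random variable set <math>\mathcal{S}=\{X_1,\ldots,X_n\}</math> is any subset <math>\mathcal{S}_1</math> of <math>\mathcal{S}</math>, conditioned on which the other variables are independent of <math>Y</math>:
:<math>Y \perp\!\!\!\perp \mathcal{S}\setminus\mathcal{S}_1 \mid \mathcal{S}_1.</math>
It means that <math>\mathcal{S}_1</math> contains at least all the information one needs to infer <math>Y</math>, so the variables in <math>\mathcal{S}\setminus\mathcal{S}_1</math> are redundant.
In general, a given Markov blanket is not unique. Any set in <math>\mathcal{S}</math> that contains a Markov blanket is also a Markov blanket itself. In particular, <math>\mathcal{S}</math> is a Markov blanket of <math>Y</math> in <math>\mathcal{S}</math>.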
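The definition can be checked numerically on a small discrete distribution. The following is a minimal sketch, not a standard library routine: the function names <code>marginal</code> and <code>is_markov_blanket</code>, the dictionary representation of the joint distribution, and the example chain X0 → X1 → Y with its probabilities are all assumptions made for illustration. It tests the conditional independence by verifying that P(y, r, b) P(b) = P(y, b) P(r, b) for every assignment, where b ranges over the candidate blanket and r over the remaining variables.

```python
from itertools import product

def marginal(joint, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def is_markov_blanket(joint, target, blanket, tol=1e-12):
    """Check whether the target is independent of the remaining variables given the blanket.

    Assumes joint maps EVERY full assignment (a tuple) to its probability,
    including assignments with probability zero.
    """
    n = len(next(iter(joint)))
    blanket = sorted(blanket)
    rest = [i for i in range(n) if i != target and i not in blanket]
    p_b = marginal(joint, blanket)
    p_yb = marginal(joint, [target] + blanket)
    p_rb = marginal(joint, rest + blanket)
    for assignment, p in joint.items():
        y = (assignment[target],)
        r = tuple(assignment[i] for i in rest)
        b = tuple(assignment[i] for i in blanket)
        # Conditional independence: P(y, r, b) * P(b) == P(y, b) * P(r, b)
        if abs(p * p_b[b] - p_yb[y + b] * p_rb[r + b]) > tol:
            return False
    return True

# Hypothetical chain X0 -> X1 -> Y over binary variables (indices 0, 1, 2).
joint = {}
for x0, x1, y in product((0, 1), repeat=3):
    p_x1 = 0.9 if x1 == x0 else 0.1   # X1 copies X0 with probability 0.9
    p_y = 0.8 if y == x1 else 0.2     # Y copies X1 with probability 0.8
    joint[(x0, x1, y)] = 0.5 * p_x1 * p_y

blanket_x1 = is_markov_blanket(joint, target=2, blanket={1})
blanket_x0 = is_markov_blanket(joint, target=2, blanket={0})
blanket_all = is_markov_blanket(joint, target=2, blanket={0, 1})
```

In this example X1 alone screens Y off from X0, so <code>blanket_x1</code> holds while <code>blanket_x0</code> does not. The full set is trivially a blanket, which matches the remark that any superset of a Markov blanket is again a Markov blanket.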
Markov boundary
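A Markov boundary of a random variable <math>Y</math> in a random variable set <math>\mathcal{S}</math> is a subset <math>\mathcal{S}_2</math> of <math>\mathcal{S}</math> such that <math>\mathcal{S}_2</math> itself is a Markov blanket of <math>Y</math>, but no proper subset of <math>\mathcal{S}_2</math> is a Markov blanket of <math>Y</math>. In other words, a Markov boundary is a minimal Markov blanket.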
The Markov boundary of a node <math>A</math> in a Bayesian network is the set of nodes composed of <math>A</math>'s parents, <math>A</math>'s children, and <math>A</math>'s children's other parents. In a Markov random field, the Markov boundary for a node is the set of its neighboring nodes. In a dependency network, the Markov boundary for a node is the set of its parents.
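The graphical rule for a Bayesian network can be sketched in a few lines. This is an illustrative sketch only: the function name <code>markov_boundary</code>, the representation of the network as a dictionary mapping each node to its list of parents, and the example graph are assumptions, not taken from any particular library or from the figure.

```python
def markov_boundary(parents, node):
    """Parents, children, and children's other parents of node in a DAG.

    parents maps every node to the list of its parents.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}

# Hypothetical network: B -> A <- C, A -> D <- E, D -> F
parents = {
    "B": [],
    "C": [],
    "E": [],
    "A": ["B", "C"],
    "D": ["A", "E"],
    "F": ["D"],
}

boundary_a = markov_boundary(parents, "A")
boundary_e = markov_boundary(parents, "E")
```

Here <code>boundary_a</code> contains B and C (parents), D (child) and E (the other parent of D), but not the grandchild F. For a Markov random field the analogous computation reduces to looking up the neighbors of the node.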
= Uniqueness of Markov boundary =
A Markov boundary always exists. Under some mild conditions, the Markov boundary is unique. However, for most practical and theoretical scenarios multiple Markov boundaries may provide alternative solutions.{{cite journal |last1=Statnikov |first1=Alexander |last2=Lytkin |first2=Nikita I. |last3=Lemeire |first3=Jan |last4=Aliferis |first4=Constantin F. |title=Algorithms for discovery of multiple Markov boundaries |journal=Journal of Machine Learning Research |date=2013 |volume=14 |pages=499-566 |url=http://www.jmlr.org/papers/volume14/statnikov13a/statnikov13a.pdf}} When there are multiple Markov boundaries, quantities measuring causal effect could fail.{{cite journal |last1=Wang |first1=Yue |last2=Wang |first2=Linbo |title=Causal inference in degenerate systems: An impossibility result |journal=Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics |date=2020 |pages=3383-3392 |url=http://proceedings.mlr.press/v108/wang20i.html}}
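Non-uniqueness is easy to reproduce in a degenerate distribution where one variable is an exact copy of another. The sketch below is a brute-force illustration under stated assumptions, not one of the algorithms from the cited papers: the function names, the dictionary representation of the joint distribution, and the example (X1 is a deterministic copy of X0, and Y is a noisy copy of X0) are all hypothetical. It enumerates candidate subsets by increasing size and keeps every Markov blanket that contains no smaller one.

```python
from itertools import combinations, product

def marginal(joint, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def is_markov_blanket(joint, target, blanket, tol=1e-12):
    """Assumes joint lists every full assignment, including zero-probability ones."""
    n = len(next(iter(joint)))
    blanket = sorted(blanket)
    rest = [i for i in range(n) if i != target and i not in blanket]
    p_b = marginal(joint, blanket)
    p_yb = marginal(joint, [target] + blanket)
    p_rb = marginal(joint, rest + blanket)
    for assignment, p in joint.items():
        y = (assignment[target],)
        r = tuple(assignment[i] for i in rest)
        b = tuple(assignment[i] for i in blanket)
        # Conditional independence: P(y, r, b) * P(b) == P(y, b) * P(r, b)
        if abs(p * p_b[b] - p_yb[y + b] * p_rb[r + b]) > tol:
            return False
    return True

def markov_boundaries(joint, target):
    """All minimal Markov blankets of target, found by exhaustive search."""
    n = len(next(iter(joint)))
    others = [i for i in range(n) if i != target]
    found = []
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            if any(set(b) <= set(subset) for b in found):
                continue  # contains a smaller blanket, so it is not minimal
            if is_markov_blanket(joint, target, subset):
                found.append(subset)
    return found

# Hypothetical degenerate system over binary (X0, X1, Y), indices 0, 1, 2.
joint = {}
for x0, x1, y in product((0, 1), repeat=3):
    p_x1 = 1.0 if x1 == x0 else 0.0   # X1 is an exact copy of X0
    p_y = 0.8 if y == x0 else 0.2     # Y copies X0 with probability 0.8
    joint[(x0, x1, y)] = 0.5 * p_x1 * p_y

boundaries = markov_boundaries(joint, target=2)
```

In this example both {X0} and {X1} are Markov boundaries of Y, since either variable carries all the information about Y that the other does. The search is exponential in the number of variables and is meant only to make the definition concrete.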