Markov blanket

{{Short description|Subset of variables that contains all the useful information}}

[[Image:Diagram of a Markov blanket.svg|In a Bayesian network, the Markov boundary of node A includes its parents, children and the other parents of all of its children.]]

In statistics and machine learning, when one wants to infer a random variable from a set of other variables, a subset is usually sufficient, and the remaining variables carry no additional information. Such a subset, which contains all the useful information, is called a Markov blanket. If a Markov blanket is minimal, meaning that no variable can be dropped from it without losing information, it is called a Markov boundary. Identifying a Markov blanket or a Markov boundary helps to extract useful features. The terms Markov blanket and Markov boundary were coined by Judea Pearl in 1988.{{cite book |last=Pearl |first=Judea |authorlink=Judea Pearl |title=Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference |publisher=Morgan Kaufmann |location=San Mateo CA |year=1988 |isbn=0-934613-73-7 |series=Representation and Reasoning Series |url-access=registration |url=https://archive.org/details/probabilisticrea00pear }} A Markov blanket can be constituted by a set of Markov chains.

Markov blanket

A Markov blanket of a random variable Y in a random variable set \mathcal{S}=\{X_1,\ldots,X_n\} is any subset \mathcal{S}_1 of \mathcal{S} conditioned on which the other variables are independent of Y:

Y\perp \!\!\! \perp\mathcal{S}\backslash\mathcal{S}_1 \mid \mathcal{S}_1.

This means that \mathcal{S}_1 contains at least all the information one needs to infer Y, and the variables in \mathcal{S}\backslash\mathcal{S}_1 are redundant.

In general, a Markov blanket is not unique. Any subset of \mathcal{S} that contains a Markov blanket is itself a Markov blanket. In particular, \mathcal{S} is a Markov blanket of Y in \mathcal{S}.
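The definition can be checked numerically on a small discrete example. The following is a minimal sketch, not a general-purpose implementation: the function name is_markov_blanket, the toy probabilities, and the choice of binary variables X1, X2, X3 and Y are all assumptions made for illustration. It tests the conditional independence through the equivalent identity p(y, r, b) p(b) = p(y, b) p(r, b), where b ranges over values of the candidate blanket and r over values of the remaining variables.

```python
import numpy as np

def is_markov_blanket(joint, y_axis, blanket_axes, tol=1e-12):
    """Return True if the variables on `blanket_axes` form a Markov blanket
    of the variable on `y_axis`.

    `joint` is the full joint probability table, with one axis per variable.
    """
    all_axes = set(range(joint.ndim))
    b = set(blanket_axes)
    rest = all_axes - b - {y_axis}

    def marginal(keep):
        # Sum out every axis not in `keep`; keepdims allows broadcasting.
        drop = tuple(all_axes - set(keep))
        return joint.sum(axis=drop, keepdims=True) if drop else joint

    # Y is independent of the rest given the blanket exactly when
    # p(y, r, b) * p(b) == p(y, b) * p(r, b) for all values.
    lhs = joint * marginal(b)
    rhs = marginal(b | {y_axis}) * marginal(b | rest)
    return bool(np.allclose(lhs, rhs, atol=tol))

# Toy model (illustrative assumption). Axes are ordered (X1, X2, X3, Y).
p_x1 = np.array([0.5, 0.5])
p_x2 = np.array([0.7, 0.3])
p_x3_given_x1 = np.array([[0.9, 0.1],    # rows: x1, columns: x3
                          [0.2, 0.8]])   # X3 is a noisy copy of X1
p_y_given_x1x2 = np.zeros((2, 2, 2))     # indices: x1, x2, y
p_y_given_x1x2[..., 1] = [[0.1, 0.6], [0.5, 0.9]]
p_y_given_x1x2[..., 0] = 1.0 - p_y_given_x1x2[..., 1]

joint = np.einsum("a,b,ac,abd->abcd",
                  p_x1, p_x2, p_x3_given_x1, p_y_given_x1x2)

print(is_markov_blanket(joint, 3, [0, 1]))     # True: {X1, X2}
print(is_markov_blanket(joint, 3, [0]))        # False: X2 still matters
print(is_markov_blanket(joint, 3, [0, 1, 2]))  # True: the whole set
```

In this toy model Y is generated from X1 and X2 only, so {X1, X2} is a Markov blanket, and so is the full set {X1, X2, X3} that contains it.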

Markov boundary

A Markov boundary of Y in \mathcal{S} is a subset \mathcal{S}_2 of \mathcal{S}, such that \mathcal{S}_2 itself is a Markov blanket of Y, but no proper subset of \mathcal{S}_2 is a Markov blanket of Y. In other words, a Markov boundary is a minimal Markov blanket.

The Markov boundary of a node A in a Bayesian network is the set of nodes composed of A's parents, A's children, and A's children's other parents. In a Markov random field, the Markov boundary for a node is the set of its neighboring nodes. In a dependency network, the Markov boundary for a node is the set of its parents.
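The graphical rule for a Bayesian network can be sketched in a few lines. The example below assumes the directed acyclic graph is given as a dictionary mapping each node to the set of its parents; the function name markov_boundary and the example graph are hypothetical and serve only to illustrate the rule.

```python
def markov_boundary(parents, node):
    """Collect the parents, the children, and the children's other parents
    of `node` in a DAG.

    `parents` maps every node to the set of its parents.
    """
    children = {v for v, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    # `co_parents` contains `node` itself, so remove it at the end.
    return (set(parents[node]) | children | co_parents) - {node}

# Hypothetical DAG with edges:
#   B -> A, C -> A, A -> D, E -> D, D -> F, G -> E
parents = {
    "A": {"B", "C"},
    "B": set(),
    "C": set(),
    "D": {"A", "E"},
    "E": {"G"},
    "F": {"D"},
    "G": set(),
}

print(sorted(markov_boundary(parents, "A")))  # ['B', 'C', 'D', 'E']
```

Here B and C are the parents of A, D is its only child, and E is the other parent of D. Nodes F and G are excluded because they are neither parents, children, nor co-parents of A.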

= Uniqueness of Markov boundary =

A Markov boundary always exists. Under some mild conditions, the Markov boundary is unique. However, in most practical and theoretical scenarios, multiple Markov boundaries may provide alternative solutions.{{cite journal |last1=Statnikov |first1=Alexander |last2=Lytkin |first2=Nikita I. |last3=Lemeire |first3=Jan |last4=Aliferis |first4=Constantin F. |title=Algorithms for discovery of multiple Markov boundaries |journal=Journal of Machine Learning Research |date=2013 |volume=14 |pages=499-566 |url=http://www.jmlr.org/papers/volume14/statnikov13a/statnikov13a.pdf}} When there are multiple Markov boundaries, quantities measuring causal effect could fail.{{cite journal |last1=Wang |first1=Yue |last2=Wang |first2=Linbo |title=Causal inference in degenerate systems: An impossibility result |journal=Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics |date=2020 |pages=3383-3392 |url=http://proceedings.mlr.press/v108/wang20i.html}}
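A simple way to see how several Markov boundaries can arise is a degenerate toy model in which one variable is an exact copy of another. The sketch below is an illustrative assumption, not one of the algorithms from the cited works: it enumerates all subsets by brute force, which is only feasible for a handful of variables, and the function names and probabilities are hypothetical.

```python
from itertools import combinations
import numpy as np

def is_blanket(joint, y_axis, blanket):
    """Check that Y is independent of the remaining variables given `blanket`."""
    axes = set(range(joint.ndim))
    b = set(blanket)
    rest = axes - b - {y_axis}

    def marg(keep):
        # Sum out every axis not in `keep`; keepdims allows broadcasting.
        drop = tuple(axes - set(keep))
        return joint.sum(axis=drop, keepdims=True) if drop else joint

    # Independence identity: p(y, r, b) * p(b) == p(y, b) * p(r, b).
    return bool(np.allclose(joint * marg(b),
                            marg(b | {y_axis}) * marg(b | rest)))

def markov_boundaries(joint, y_axis):
    """Return all minimal Markov blankets of the variable on `y_axis`."""
    others = [a for a in range(joint.ndim) if a != y_axis]
    found = []
    for size in range(len(others) + 1):       # smallest subsets first
        for subset in combinations(others, size):
            # A superset of an already found boundary cannot be minimal.
            if any(set(f) <= set(subset) for f in found):
                continue
            if is_blanket(joint, y_axis, subset):
                found.append(subset)
    return found

# Toy model (illustrative assumption). Axes are ordered (X1, X2, X3, Y).
p_x1 = np.array([0.5, 0.5])
copy = np.eye(2)                           # X2 is an exact copy of X1
p_x3 = np.array([0.6, 0.4])
p_y_given_x1x3 = np.zeros((2, 2, 2))       # indices: x1, x3, y
p_y_given_x1x3[..., 1] = [[0.1, 0.7], [0.4, 0.9]]
p_y_given_x1x3[..., 0] = 1.0 - p_y_given_x1x3[..., 1]

joint = np.einsum("a,ab,c,acd->abcd", p_x1, copy, p_x3, p_y_given_x1x3)

print(markov_boundaries(joint, y_axis=3))  # [(0, 2), (1, 2)]
```

Because X1 and X2 carry identical information, both {X1, X3} and {X2, X3} are minimal Markov blankets of Y, so the Markov boundary is not unique in this example.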
