Predictive mean matching

{{Machine learning bar}}

Predictive mean matching (PMM){{cite web|url=http://stefvanbuuren.name/fimd/sec-pmm.html|website=stefvanbuuren.name|title=3.4 Predictive mean matching|accessdate=30 June 2019}} is a widely used{{cite web|url=https://apps.webofknowledge.com/Search.do?product=UA&SID=C4t3fjbEtlzE2IBRqfF&search_mode=GeneralSearch&prID=ab90c05a-ae5f-4d37-9f59-45f422547688|title=Web of Science [v.5.32] – All Databases Results|website=apps.webofknowledge.com|accessdate=30 June 2019}} statistical imputation method for missing values, first proposed by Donald B. Rubin in 1986{{cite journal|title=Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations|first=Donald B.|last=Rubin|date=30 June 1986|journal=Journal of Business & Economic Statistics|volume=4|issue=1|pages=87–94|doi=10.2307/1391390|jstor=1391390}} and R. J. A. Little in 1988.{{cite journal|title=Missing-Data Adjustments in Large Surveys|first=Roderick J. A.|last=Little|date=30 June 1988|journal=Journal of Business & Economic Statistics|volume=6|issue=3|pages=287–296|doi=10.2307/1391878|jstor=1391878}}

It aims to reduce the bias introduced in a dataset through imputation by drawing real, observed values from the data.{{cite web|url=https://statisticalhorizons.com/predictive-mean-matching|title=Imputation by Predictive Mean Matching: Promise & Peril – Statistical Horizons|website=statisticalhorizons.com|accessdate=30 June 2019}} For each case with a missing value, a model predicts the variable from the other variables, and a small set of candidate donors is built from the complete observations whose predicted values are closest to the predicted value of that case. The observed value of one donor, drawn at random from this set, is then used as the imputation.
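The matching step can be illustrated with a minimal sketch in Python using NumPy. It assumes a single linear regression fitted by ordinary least squares and fully observed predictors; the function name <code>pmm_impute</code> and the donor-pool size <code>k</code> are illustrative and not taken from any particular package. Implementations used for multiple imputation typically also draw the regression coefficients at random to reflect parameter uncertainty, a step omitted here.

```python
import numpy as np


def pmm_impute(X, y, k=5, seed=None):
    """Impute NaN entries of y by a simplified predictive mean matching.

    X : (n,) or (n, p) array of fully observed predictors (assumption)
    y : (n,) array in which NaN marks a missing value
    k : number of candidate donors considered per missing value
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = ~missing

    # Fit ordinary least squares (with intercept) on the complete cases.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)

    # Predicted means for every row, whether y is observed or not.
    predicted = design @ beta
    donor_pred = predicted[observed]
    donor_vals = y[observed]
    k = min(k, donor_vals.size)

    imputed = y.copy()
    for i in np.flatnonzero(missing):
        # Candidate donors: the k complete cases with the closest prediction.
        distance = np.abs(donor_pred - predicted[i])
        candidates = np.argsort(distance)[:k]
        # Borrow the observed value of one randomly chosen donor.
        imputed[i] = donor_vals[rng.choice(candidates)]
    return imputed


# Illustrative synthetic data: a strictly positive, skewed "income" variable.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
income = np.exp(0.3 * x + rng.normal(0, 0.5, size=200))
income_missing = income.copy()
income_missing[rng.choice(200, size=40, replace=False)] = np.nan

completed = pmm_impute(x, income_missing, k=5, seed=1)
```

Because every imputed value is copied from a donor, the completed variable contains only values that actually occur in the observed data, so in this synthetic example no imputed income can be negative.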

Compared to other imputation methods, it usually imputes fewer implausible values (e.g. negative incomes) and handles heteroscedastic data more appropriately.{{Cite web|title=Predictive Mean Matching Imputation (Example in R)|url=https://statisticsglobe.com/predictive-mean-matching-imputation-method/|access-date=2020-09-18|website=Statistics Globe|language=en-US}}

References

{{Reflist}}

{{DEFAULTSORT:Predictive mean matching}}

Category:Missing data

Category:Predictive analytics