Mixed model
{{Short description|Statistical model containing both fixed effects and random effects}}
{{distinguish|mixture model}}
{{Regression bar}}
A mixed model, mixed-effects model or mixed error-component model is a statistical model containing both fixed effects and random effects.{{cite book |first=Badi H. |last=Baltagi |title=Econometric Analysis of Panel Data |location=New York |publisher=Wiley |edition=Fourth |year=2008 |isbn=978-0-470-51886-1 |pages=54–55 }}{{cite journal |last1=Gomes |first1=Dylan G.E. |title=Should I use fixed effects or random effects when I have fewer than five levels of a grouping factor in a mixed-effects model? |journal=PeerJ |date=20 January 2022 |volume=10 |pages=e12794 |doi=10.7717/peerj.12794|pmid=35116198 |pmc=8784019 |doi-access=free }} These models are useful in a wide variety of disciplines in the physical, biological and social sciences.
They are particularly useful in settings where repeated measurements are made on the same statistical units (see also longitudinal study), or where measurements are made on clusters of related statistical units. Mixed models are often preferred over traditional analysis of variance regression models because they do not rely on the assumption of independent observations. They are also flexible in dealing with missing values and unevenly spaced repeated measurements.{{cite journal|last1=Yang| first1= Jian|last2=Zaitlen|first2=NA|last3=Goddard|first3=ME|last4=Visscher|first4=PM|last5=Prince|first5=AL|title= Advantages and pitfalls in the application of mixed-model association methods|journal=Nat Genet|volume=46|date=29 January 2014| issue= 2| pages= 100–106|doi=10.1038/ng.2876 |pmid=24473328|pmc=3989144}} Mixed model analysis allows measurements to be explicitly modeled with a wider variety of correlation and variance-covariance structures, which helps avoid biased estimates.
This page will discuss mainly linear mixed-effects models rather than generalized linear mixed models or nonlinear mixed-effects models.{{cite book |last=Seltman|first=Howard| title=Experimental Design and Analysis|year=2016|volume=1|pages=357–378|url=https://www.stat.cmu.edu/~hseltman/309/Book/}}
Qualitative Description
Linear mixed models (LMMs) are statistical models that incorporate fixed and random effects to accurately represent non-independent data structures. LMMs are an alternative to analysis of variance. ANOVA often assumes the independence of observations within each group; however, this assumption may not hold in non-independent data, such as multilevel/hierarchical, longitudinal, or correlated datasets.
Non-independent data sets are ones in which the variability between outcomes is due to correlations within groups or between groups. Mixed models properly account for nested (hierarchical) data structures, where observations are influenced by their nested associations. For example, when studying education methods involving multiple schools, there are multiple levels of variables to consider. The individual (lower) level comprises individual students or teachers within the school. The observations obtained from a student or teacher are nested within their school. For example, Student A is a unit within School A. The next higher level is the school. At the higher level, the school contains multiple individual students and teachers. The school level influences the observations obtained from the students and teachers. For example, School A and School B are the higher-level units, each with its own set of students (Student A and Student B, respectively). This represents a hierarchical data scheme. A solution to modeling hierarchical data is using linear mixed models.
File:Heiarchial Data Strucutre Education.jpg
LMMs allow us to understand the important effects between and within levels while incorporating the corrections for standard errors for non-independence embedded in the data structure.{{cite web|publisher= UCLA Statistical Consulting Group |work=Advanced Research Computing Statistical Methods and Data Analytics |date=2021 |title= Introduction to Linear Mixed Models |url=https://stats.oarc.ucla.edu/other/mult-pkg/introduction-to-linear-mixed-models/}} In experimental fields such as social psychology, psycholinguistics, cognitive psychology (and neuroscience), where studies often involve multiple grouping variables, failing to account for random effects can lead to inflated Type I error rates and unreliable conclusions.{{cite journal |last1=Judd |first1=Charles M. |last2=Westfall |first2=Jacob |last3=Kenny |first3=David A. |title=Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. |journal=Journal of Personality and Social Psychology |date=2012 |volume=103 |issue=1 |pages=54–69 |doi=10.1037/a0028347}}{{Cite journal |last=Boisgontier |first=Matthieu P. |last2=Cheval |first2=Boris |date=September 2016 |title=The anova to mixed model transition |url=https://linkinghub.elsevier.com/retrieve/pii/S0149763416301634 |journal=Neuroscience & Biobehavioral Reviews |language=en |volume=68 |pages=1004–1005 |doi=10.1016/j.neubiorev.2016.05.034}} For instance, when analyzing data from experiments that involve both samples of participants and samples of stimuli (e.g., images, scenarios, etc.), ignoring variation in either of these grouping variables (e.g., by averaging over stimuli) can result in misleading conclusions. In such cases, researchers can instead treat both participant and stimulus as random effects with LMMs, and in doing so, can correctly account for the variation in their data across multiple grouping variables. 
Similarly, when analyzing data from comparative longitudinal surveys, failing to include random effects at all relevant levels—such as country and country-year—can significantly distort the results.{{cite journal |last1=Schmidt-Catran |first1=Alexander W. |last2=Fairbrother |first2=Malcolm |title=The Random Effects in Multilevel Models: Getting Them Wrong and Getting Them Right |journal=European Sociological Review |date=February 2016 |volume=32 |issue=1 |pages=23–38 |doi=10.1093/esr/jcv090|hdl=1983/de3e0a3c-9b41-4963-880c-452809860a7e |hdl-access=free }}
= The Fixed Effect =
Fixed effects encapsulate the tendencies/trends that are consistent at the levels of primary interest. These effects are considered fixed because they are non-random and assumed to be constant for the population being studied. For example, when studying education, a fixed effect could represent overall school-level effects that are consistent across all schools.
While the hierarchy of the data set is typically obvious, the specific fixed effects that affect the average responses for all subjects must be specified. Some fixed-effect coefficients are sufficient without corresponding random effects, whereas other fixed coefficients only represent an average around which the individual units vary randomly. These may be determined by incorporating random intercepts and slopes.{{cite book|last=Kreft & de Leeuw|first=J.|title= Introducing multilevel modeling|publisher=London:Sage}}{{cite book|last=Raudenbush, Bryk|first=S.W, A.S|title= Hierarchical Linear Models: Applications and Data Analysis Methods| year=2002|publisher=Thousand Oaks, CA: Sage}}{{cite book| last=Snijders, Bosker|first=T.A.B, R.J|title=Multilevel analysis: An introduction to basic and advanced multilevel modeling |volume=2nd edition|year=2012|publisher=London:Sage}}
In most situations, several related models are considered and the model that best represents a universal model is adopted.
= The Random Effect, ε =
A key component of the mixed model is the incorporation of random effects alongside the fixed effects. Fixed effects are often fitted to represent the underlying model. In linear mixed models, the true regression of the population is assumed to be linear, with coefficients β. The fixed effects are fitted at the highest level. Random effects introduce statistical variability at different levels of the data hierarchy. These account for the unmeasured sources of variance that affect certain groups in the data, for example the differences between student 1 and student 2 in the same class, or the differences between class 1 and class 2 in the same school.
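As a minimal illustration of how a random effect enters alongside fixed effects, a two-level random-intercept model for the education example can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the cited sources: the outcome of student ''i'' in school ''j'' is written ''y<sub>ij</sub>'', ''x<sub>ij</sub>'' is a student-level covariate, ''u<sub>j</sub>'' is the random school effect and ''ε<sub>ij</sub>'' is the residual error, with ''u<sub>j</sub>'' and ''ε<sub>ij</sub>'' assumed independent.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}, \qquad
u_j \sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under these assumptions, the coefficients β<sub>0</sub> and β<sub>1</sub> are shared by all schools, while two students in the same school share the same ''u<sub>j</sub>'' and therefore have covariance σ<sub>u</sub><sup>2</sup> and correlation σ<sub>u</sub><sup>2</sup> / (σ<sub>u</sub><sup>2</sup> + σ<sup>2</sup>). Students in different schools are uncorrelated.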
History and current status
File:Bias described using LMM.jpg
Ronald Fisher introduced random effects models to study the correlations of trait values between relatives.{{cite journal | last=Fisher | first=RA | title=The correlation between relatives on the supposition of Mendelian inheritance | journal=Transactions of the Royal Society of Edinburgh | year=1918 | volume=52 | pages=399–433 | doi=10.1017/S0080456800012163 | issue=2| s2cid=181213898 | url=https://zenodo.org/record/1428666 }} In the 1950s, Charles Roy Henderson
provided best linear unbiased estimates of fixed effects and best linear unbiased predictions of random effects.{{cite journal | last=Robinson | first=G.K. | title=That BLUP is a Good Thing: The Estimation of Random Effects | journal=Statistical Science | volume=6 | issue=1 | year=1991 | pages=15–32 | jstor=2245695 | doi=10.1214/ss/1177011926| doi-access=free }}{{cite journal | title = The Estimation of Environmental and Genetic Trends from Records Subject to Culling | journal = Biometrics | volume = 15 | year = 1959 |pages = 192–218 | jstor=2527669 |author1=C. R. Henderson |author2=Oscar Kempthorne |author3=S. R. Searle |author4=C. M. von Krosigk | doi = 10.2307/2527669 | issue = 2 | publisher = International Biometric Society}}{{cite web | url = http://books.nap.edu/html/biomems/chenderson.pdf | title = Charles Roy Henderson, April 1, 1911 – March 14, 1989 | author = L. Dale Van Vleck | publisher = United States National Academy of Sciences}}{{cite journal | last=McLean | first=Robert A. |author2=Sanders, William L. |author3=Stroup, Walter W. | title= A Unified Approach to Mixed Linear Models | journal=The American Statistician | year=1991 | volume=45 | pages=54–64 | jstor=2685241 | doi=10.2307/2685241 | issue=1 | publisher=American Statistical Association}} Subsequently, mixed modeling has become a major area of statistical research, including work on computation of maximum likelihood estimates, non-linear mixed effects models, missing data in mixed effects models, and Bayesian estimation of mixed effects models. Mixed models are applied in many disciplines where multiple correlated measurements are made on each unit of interest. 
They are prominently used in research involving human and animal subjects in fields ranging from genetics to marketing, and have also been used in baseball {{cite web |last=Anderson|first=R.J|title= "MLB analytics guru who could be the next Nate Silver has a revolutionary new stat"|year=2016|url= http://www.cbssports.com/mlb/news/mlb-analytics-guru-who-could-be-the-next-nate-silver-has-a-revolutionary-new-stat/}} and industrial statistics.{{cite book |last=Obenchain, Lilly|first=Bob, Eli|title="Data Analysis and Information Visualization"|year=1993|publisher=MWSUG|url= https://www.lexjansen.com/mwsug/1993/MWSUG93035.pdf}}
Mixed linear model association methods have improved the prevention of false positive associations. Populations are deeply interconnected and the relatedness structure of population dynamics is extremely difficult to model without the use of mixed models. Linear mixed models may not, however, be the only solution. LMMs have a constant-residual variance assumption that is sometimes violated when accounting for deeply associated continuous and binary traits.{{cite journal|last1= Chen|last2=Wang|last3=Conomos|last4=Stilp|last5=Li|last6=Sofer|last7=Szpiro|last8=Chen|last9=Brehm|last10=Celedon|last11=Redline|last12=Papanicolaou|last13=Thorton|last14=Thorton|last15=Laurie|last16=Rice|last17=Lin|first1=H|first2=C|first3=MP|first4=AM|first5=Z|first6=T|first7=AA|first8=W|first9=JM|first10=JC|first11=S|first12=S|first13=GJ|first14=TA|first15=CC|first16=K|first17=X|title=Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models|journal= Am J Hum Genet|date= 7 April 2016|volume=98|issue=4 |pages=653–666 |doi= 10.1016/j.ajhg.2016.02.012|pmid= 27018471|pmc=4833218}}
Definition
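In matrix notation a linear mixed model can be represented as
:<math>\boldsymbol{y} = X \boldsymbol{\beta} + Z \boldsymbol{u} + \boldsymbol{\epsilon}</math>
where
- <math>\boldsymbol{y}</math> is a known vector of observations, with mean <math>E(\boldsymbol{y}) = X \boldsymbol{\beta}</math>;
- <math>\boldsymbol{\beta}</math> is an unknown vector of fixed effects;
- <math>\boldsymbol{u}</math> is an unknown vector of random effects, with mean <math>E(\boldsymbol{u}) = \boldsymbol{0}</math> and variance–covariance matrix <math>\operatorname{var}(\boldsymbol{u}) = G</math>;
- <math>\boldsymbol{\epsilon}</math> is an unknown vector of random errors, with mean <math>E(\boldsymbol{\epsilon}) = \boldsymbol{0}</math> and variance <math>\operatorname{var}(\boldsymbol{\epsilon}) = R</math>;
- <math>X</math> is the known design matrix for the fixed effects, relating the observations <math>\boldsymbol{y}</math> to <math>\boldsymbol{\beta}</math>;
- <math>Z</math> is the known design matrix for the random effects, relating the observations <math>\boldsymbol{y}</math> to <math>\boldsymbol{u}</math>.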
For example, if each observation can belong to any zero or more of {{mvar|k}} categories then {{mvar|Z}}, which has one row per observation, can be chosen to have {{mvar|k}} columns, where a value of {{math|1}} for a matrix element of {{mvar|Z}} indicates that an observation is known to belong to a category and a value of {{math|0}} indicates that an observation is known to not belong to a category. The inferred value of {{mvar|u}} for a category is then a category-specific intercept. If {{mvar|Z}} has additional columns, where the non-zero values are instead the value of an independent variable for an observation, then the corresponding inferred value of {{mvar|u}} is a category-specific slope for that independent variable. The prior distribution for the category intercepts and slopes is described by the covariance matrix {{mvar|G}}.
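The construction of {{mvar|Z}} described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under a simplifying assumption: each observation belongs to exactly one category (the text allows zero or more). The function name <code>build_random_effects_design</code> and the toy data are hypothetical.

```python
import numpy as np


def build_random_effects_design(groups, x=None):
    """Return (levels, Z) with random-intercept columns and, if x is given,
    random-slope columns. Assumes one category per observation."""
    levels, codes = np.unique(np.asarray(groups), return_inverse=True)
    codes = codes.ravel()
    n, k = codes.size, levels.size

    # Intercept block: Z[i, j] = 1 if observation i belongs to category j.
    Z_intercept = np.zeros((n, k))
    Z_intercept[np.arange(n), codes] = 1.0
    if x is None:
        return levels, Z_intercept

    # Slope block: non-zero entries hold the covariate value instead of 1.
    Z_slope = Z_intercept * np.asarray(x, dtype=float)[:, None]
    return levels, np.hstack([Z_intercept, Z_slope])


# Toy data (hypothetical): five observations in three categories.
groups = ["a", "a", "b", "c", "c"]
x = [0.5, 1.5, 2.0, 3.0, 4.0]
levels, Z = build_random_effects_design(groups, x)
print(levels)
print(Z)
```

With {{mvar|k}} categories, the first {{mvar|k}} columns of this {{mvar|Z}} pair with category-specific intercepts in {{mvar|u}} and the last {{mvar|k}} columns pair with category-specific slopes.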
Estimation
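The joint density of <math>\boldsymbol{y}</math> and <math>\boldsymbol{u}</math> can be written as: <math>f(\boldsymbol{y},\boldsymbol{u}) = f(\boldsymbol{y} \mid \boldsymbol{u}) \, f(\boldsymbol{u})</math>.
Assuming normality, <math>\boldsymbol{u} \sim \mathcal{N}(\boldsymbol{0},G)</math>, <math>\boldsymbol{\epsilon} \sim \mathcal{N}(\boldsymbol{0},R)</math> and <math>\operatorname{Cov}(\boldsymbol{u},\boldsymbol{\epsilon}) = \boldsymbol{0}</math>, and maximizing the joint density over <math>\boldsymbol{\beta}</math> and <math>\boldsymbol{u}</math>, gives Henderson's "mixed model equations" (MME) for linear mixed models:{{cite journal |last=Henderson |first=C R |date=1973 |title=Sire evaluation and genetic trends |url=http://www.journalofanimalscience.org/content/1973/Symposium/10.full.pdf |journal=Journal of Animal Science |publisher=American Society of Animal Science |volume=1973 |pages=10–41 |access-date=17 August 2014|doi=10.1093/ansci/1973.Symposium.10 }}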
:<math>
\begin{pmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + G^{-1}
\end{pmatrix}
\begin{pmatrix}
\hat{\boldsymbol{\beta}} \\
\hat{\boldsymbol{u}}
\end{pmatrix}
= \begin{pmatrix}
X'R^{-1}\boldsymbol{y} \\
Z'R^{-1}\boldsymbol{y}
\end{pmatrix}
</math>
where for example {{mvar|X′}} is the matrix transpose of {{mvar|X}} and {{math|R{{sup|−1}}}} is the matrix inverse of {{mvar|R}}.
The solutions to the MME, <math>\hat{\boldsymbol{\beta}}</math> and <math>\hat{\boldsymbol{u}}</math>, are best linear unbiased estimates and predictors for <math>\boldsymbol{\beta}</math> and <math>\boldsymbol{u}</math>, respectively. This is a consequence of the Gauss–Markov theorem when the conditional variance of the outcome is not scalable to the identity matrix. When the conditional variance is known, the inverse-variance weighted least squares estimate is the best linear unbiased estimate. However, the conditional variance is rarely, if ever, known, so it is desirable to jointly estimate the variance and the weighted parameter estimates when solving the MME.
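For the idealized case in which {{mvar|G}} and {{mvar|R}} are known, the mixed model equations can be assembled and solved directly. The following NumPy sketch is illustrative only: the function name <code>solve_mme</code>, the simulated random-intercept data and the chosen variances are assumptions, and dense matrix inverses are used for clarity rather than efficiency. As a cross-check, it also computes the generalized least squares estimate and the corresponding predictor from the marginal covariance {{math|ZGZ′ + R}}, which should agree with the MME solution.

```python
import numpy as np


def solve_mme(y, X, Z, G, R):
    """Solve Henderson's mixed model equations for known G and R."""
    Rinv = np.linalg.inv(R)
    Ginv = np.linalg.inv(G)
    p = X.shape[1]

    # Coefficient matrix and right-hand side of the MME.
    lhs = np.block([
        [X.T @ Rinv @ X, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv],
    ])
    rhs = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])

    solution = np.linalg.solve(lhs, rhs)
    return solution[:p], solution[p:]  # (beta_hat, u_hat)


# Simulated random-intercept data (all values here are assumptions).
rng = np.random.default_rng(0)
n_groups, n_per_group = 4, 5
codes = np.repeat(np.arange(n_groups), n_per_group)
n = codes.size

X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = np.zeros((n, n_groups))
Z[np.arange(n), codes] = 1.0
G = 2.0 * np.eye(n_groups)  # var(u), treated as known
R = 0.5 * np.eye(n)         # var(epsilon), treated as known

u_true = rng.normal(scale=np.sqrt(2.0), size=n_groups)
noise = rng.normal(scale=np.sqrt(0.5), size=n)
y = X @ np.array([1.0, -0.5]) + Z @ u_true + noise

beta_hat, u_hat = solve_mme(y, X, Z, G, R)

# Cross-check via the marginal covariance V = Z G Z' + R.
V = Z @ G @ Z.T + R
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
u_blup = G @ Z.T @ Vinv @ (y - X @ beta_gls)

print(np.allclose(beta_hat, beta_gls), np.allclose(u_hat, u_blup))
```

One practical appeal of the MME form is that it avoids inverting the full {{math|''n'' × ''n''}} marginal covariance matrix; only {{mvar|R}} and {{mvar|G}} need to be inverted, and these are often diagonal or otherwise structured.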
=Choice of random effects structure=
One choice that analysts face with mixed models is which random effects (i.e., grouping variables, random intercepts, and random slopes) to include. One prominent recommendation in the context of confirmatory hypothesis testing{{cite journal |last1=Barr |first1=Dale J. |last2=Levy |first2=Roger |last3=Scheepers |first3=Christoph |last4=Tily |first4=Harry J. |title=Random effects structure for confirmatory hypothesis testing: Keep it maximal |journal=Journal of Memory and Language |date=April 2013 |volume=68 |issue=3 |pages=255–278 |doi=10.1016/j.jml.2012.11.001|pmc=3881361 }} is to adopt a "maximal" random effects structure, including all possible random effects justified by the experimental design, as a means to control Type I error rates.
=Software=
One method used to fit such mixed models is that of the expectation–maximization algorithm (EM) where the variance components are treated as unobserved nuisance parameters in the joint likelihood.{{cite journal | title=Newton–Raphson and EM algorithms for linear mixed-effects models for repeated-measures data | last=Lindstrom | first=ML |author2=Bates, DM | journal= Journal of the American Statistical Association| volume=83 | year=1988 | pages=1014–1021 | issue=404 | doi=10.1080/01621459.1988.10478693}} Currently, this is the method implemented in statistical software such as Python (statsmodels package) and SAS (proc mixed), and as initial step only in R's nlme package lme(). The solution to the mixed model equations is a maximum likelihood estimate when the distribution of the errors is normal.{{cite journal | title=Random-Effects Models for Longitudinal Data | last=Laird | first=Nan M. |author2=Ware, James H. | journal=Biometrics | volume=38 | year=1982 | pages=963–974 | jstor=2529876 | doi=10.2307/2529876 | issue=4 | publisher=International Biometric Society | pmid=7168798}}{{cite book |first1=Garrett M. |last1=Fitzmaurice |first2=Nan M. |last2=Laird |first3=James H. |last3=Ware |year=2004 |title=Applied Longitudinal Analysis |publisher=John Wiley & Sons |pages=326–328 }}
There are several other methods to fit mixed models, including using the EM algorithm initially and then Newton–Raphson (used by R package nlme{{cite book |last1=Pinheiro|first1=J |last2=Bates |first2=DM |year=2006 |title=Mixed-effects models in S and S-PLUS |series=Statistics and Computing |location=New York |publisher=Springer Science & Business Media |doi=10.1007/b98882 |isbn=0-387-98957-9 }}'s lme()); penalized least squares to obtain a profiled log-likelihood depending only on the (low-dimensional) variance-covariance parameters of <math>\boldsymbol{u}</math>, i.e., its covariance matrix <math>G</math>, followed by modern direct optimization of that reduced objective function (used by R's lme4{{cite journal | title=Fitting Linear Mixed-Effects Models Using lme4 | last=Bates | first=D. | author2=Maechler, M. | author3=Bolker, B. | author4=Walker, S. | journal=Journal of Statistical Software | volume=67 | year=2015 | issue=1 | doi=10.18637/jss.v067.i01 | doi-access=free | hdl=2027.42/146808 | hdl-access=free }} package lmer() and the Julia package MixedModels.jl); and direct optimization of the likelihood (used by e.g. R's glmmTMB). Notably, while the canonical form proposed by Henderson is useful for theory, many popular software packages use a different formulation for numerical computation in order to take advantage of sparse matrix methods (e.g. lme4 and MixedModels.jl).
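The idea of a profiled log-likelihood can be illustrated with a deliberately simple sketch for a random-intercept model, assuming {{math|''G'' {{=}} σ<sub>u</sub><sup>2</sup>''I''}} and {{math|''R'' {{=}} σ<sup>2</sup>''I''}}. For a fixed variance ratio θ = σ<sub>u</sub><sup>2</sup>/σ<sup>2</sup>, the fixed effects and σ<sup>2</sup> have closed-form maximizers, so the log-likelihood reduces to a function of θ alone. This is not the algorithm used by lme4 or MixedModels.jl, which rely on sparse penalized least squares: the sketch uses dense matrices and a plain grid search, and the function names and simulated data are assumptions.

```python
import numpy as np


def profiled_loglik(theta, y, X, Z):
    """ML log-likelihood with beta and sigma2 profiled out, for
    var(y) = sigma2 * (I + theta * Z Z')."""
    n = y.size
    H = np.eye(n) + theta * (Z @ Z.T)
    Hinv = np.linalg.inv(H)
    beta = np.linalg.solve(X.T @ Hinv @ X, X.T @ Hinv @ y)  # GLS given theta
    resid = y - X @ beta
    sigma2 = resid @ Hinv @ resid / n                       # ML given theta
    _, logdet_H = np.linalg.slogdet(H)
    loglik = -0.5 * (n * np.log(2 * np.pi * sigma2) + logdet_H + n)
    return loglik, beta, sigma2


def full_loglik(beta, sigma2, theta, y, X, Z):
    """Unprofiled Gaussian log-likelihood, used only as a cross-check."""
    n = y.size
    V = sigma2 * (np.eye(n) + theta * (Z @ Z.T))
    resid = y - X @ beta
    _, logdet_V = np.linalg.slogdet(V)
    quad = resid @ np.linalg.solve(V, resid)
    return -0.5 * (n * np.log(2 * np.pi) + logdet_V + quad)


# Simulated random-intercept data (all values here are assumptions).
rng = np.random.default_rng(1)
n_groups, n_per_group = 8, 6
codes = np.repeat(np.arange(n_groups), n_per_group)
n = codes.size
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = np.zeros((n, n_groups))
Z[np.arange(n), codes] = 1.0
y = X @ np.array([2.0, 1.0]) + Z @ rng.normal(size=n_groups) + rng.normal(size=n)

# One-dimensional search over theta; a grid stands in for a real optimizer.
thetas = np.linspace(0.0, 10.0, 201)
fits = [profiled_loglik(t, y, X, Z) for t in thetas]
best = int(np.argmax([fit[0] for fit in fits]))
theta_best = thetas[best]
ll_best, beta_best, sigma2_best = fits[best]
print(theta_best, beta_best, sigma2_best)
```

The reduction matters because the optimizer only searches over the variance parameters (here a single scalar), regardless of how many fixed-effect coefficients the model contains.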
In the context of Bayesian methods, the brms package provides a user-friendly interface for fitting mixed models in R using Stan, allowing for the incorporation of prior distributions and the estimation of posterior distributions.{{cite journal |last1=Bürkner |first1=Paul-Christian |title=brms : An R Package for Bayesian Multilevel Models Using Stan |journal=Journal of Statistical Software |date=2017 |volume=80 |issue=1 |doi=10.18637/jss.v080.i01|doi-access=free }}{{cite journal |last1=Bürkner |first1=Paul-Christian |title=Advanced Bayesian Multilevel Modeling with the R Package brms |journal=The R Journal |date=2018 |volume=10 |issue=1 |pages=395 |doi=10.32614/RJ-2018-017|arxiv=1705.11123 }} In Python, Bambi provides a similarly streamlined approach for fitting mixed effects models using PyMC.{{Citation |last=Capretto |first=Tomás |title=Bambi: A simple interface for fitting Bayesian linear models in Python |date=2022-01-11 |url=https://arxiv.org/abs/2012.10754 |access-date=2025-01-11 |doi=10.48550/arXiv.2012.10754 |last2=Piho |first2=Camen |last3=Kumar |first3=Ravin |last4=Westfall |first4=Jacob |last5=Yarkoni |first5=Tal |last6=Martin |first6=Osvaldo A.}}
See also
References
{{Reflist}}
Further reading
- {{cite book |first1=Andrzej |last1=Gałecki |first2=Tomasz |last2=Burzykowski |title=Linear Mixed-Effects Models Using R: A Step-by-Step Approach |location=New York |publisher=Springer |year=2013 |isbn=978-1-4614-3900-4 }}
- {{cite book |last1=Milliken |first1=G. A. |last2=Johnson |first2=D. E. |year=1992 |title=Analysis of Messy Data: Vol. I. Designed Experiments |location=New York |publisher=Chapman & Hall }}
- {{cite book |last1=West |first1=B. T. |last2=Welch |first2=K. B. |last3=Galecki |first3=A. T. |year=2007 |title=Linear Mixed Models: A Practical Guide Using Statistical Software |location=New York |publisher=Chapman & Hall/CRC }}
{{DEFAULTSORT:Mixed Model}}