Meta-analysis
{{Short description|Statistical method that summarizes and/or integrates data from multiple sources}}
{{For|the process in historical linguistics known as metanalysis|Rebracketing}}
{{Use dmy dates|date=August 2020}}
{{Research}}
Meta-analysis is a method of synthesis of quantitative data from multiple independent studies addressing a common research question. An important part of this method involves computing a combined effect size across all of the studies. As such, this statistical approach involves extracting effect sizes and variance measures from the various studies. Combining these effect sizes improves statistical power and can resolve uncertainties or discrepancies found in individual studies. Meta-analyses are integral in supporting research grant proposals, shaping treatment guidelines, and influencing health policies. They are also pivotal in summarizing existing research to guide future studies, thereby cementing their role as a fundamental methodology in metascience. Meta-analyses are often, but not always, important components of a systematic review.
== History ==
The term "meta-analysis" was coined in 1976 by the statistician Gene Glass,{{cite journal |vauthors=Shadish WR, Lecy JD |date=September 2015 |title=The meta-analytic big bang |journal=Research Synthesis Methods |volume=6 |issue=3 |pages=246–264 |doi=10.1002/jrsm.1132 |pmid=26212600 |s2cid=5416879}}{{cite journal |vauthors=Glass GV |date=September 2015 |title=Meta-analysis at middle age: a personal history |journal=Research Synthesis Methods |volume=6 |issue=3 |pages=221–231 |doi=10.1002/jrsm.1133 |pmid=26355796 |s2cid=30083129}} who stated "Meta-analysis refers to the analysis of analyses".{{cite journal |vauthors=Glass GV |year=1976 |title=Primary, secondary, and meta-analysis of research |journal=Educational Researcher |volume=5 |issue=10 |pages=3–8 |doi=10.3102/0013189X005010003 |s2cid=3185455}} Glass's work aimed at describing aggregated measures of relationships and effects.{{Cite book |last=Hunt |first=Morton |title=How science takes stock: the story of meta-analysis |publisher=Russell Sage Foundation |year=1997 |edition=1st |location=New York, New York, United States of America |language=en-US}} While Glass is credited with authoring the first modern meta-analysis, a paper published in 1904 by the statistician Karl Pearson in the British Medical Journal{{cite journal |vauthors= |date=November 1904 |title=Report on Certain Enteric Fever Inoculation Statistics |journal=British Medical Journal |volume=2 |issue=2288 |pages=1243–1246 |doi=10.1136/bmj.2.2288.1243 |pmc=2355479 |pmid=20761760}} collated data from several studies of typhoid inoculation and is seen as the first time a meta-analytic approach was used to aggregate the outcomes of multiple clinical studies.{{cite journal |vauthors=Nordmann AJ, Kasenda B, Briel M |date=9 March 2012 |title=Meta-analyses: what they can and cannot do |journal=Swiss Medical Weekly |volume=142 |pages=w13518 |doi=10.4414/smw.2012.13518 |pmid=22407741 |doi-access=free}}{{cite journal |vauthors=O'Rourke K |date=December 2007 
|title=An historical perspective on meta-analysis: dealing quantitatively with varying study results |journal=Journal of the Royal Society of Medicine |volume=100 |issue=12 |pages=579–582 |doi=10.1177/0141076807100012020 |pmc=2121629 |pmid=18065712}} Numerous other examples of early meta-analyses can be found, including in occupational aptitude testing,Ghiselli, E. E. (1955). The measurement of occupational aptitude. University of California Publications in Psychology, 8, 101–216.{{Cite journal |last=Ghiselli |first=Edwin E. |date=1973 |title=The Validity of Aptitude Tests in Personnel Selection |url=https://onlinelibrary.wiley.com/doi/10.1111/j.1744-6570.1973.tb01150.x |journal=Personnel Psychology |language=en |volume=26 |issue=4 |pages=461–477 |doi=10.1111/j.1744-6570.1973.tb01150.x |issn=0031-5826}} and agriculture.{{Cite journal |last1=Yates |first1=F. |last2=Cochran |first2=W. G. |date=1938 |title=The analysis of groups of experiments |url=https://www.cambridge.org/core/product/identifier/S0021859600050978/type/journal_article |journal=The Journal of Agricultural Science |language=en |volume=28 |issue=4 |pages=556–580 |doi=10.1017/S0021859600050978 |s2cid=86619593 |issn=0021-8596}}
The first modern meta-analysis, on the effectiveness of psychotherapy, was published in 1977 by Mary Lee Smith and Gene Glass.{{Cite journal |last1=Smith |first1=Mary L. |last2=Glass |first2=Gene V. |date=1977 |title=Meta-analysis of psychotherapy outcome studies. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/0003-066X.32.9.752 |journal=American Psychologist |language=en |volume=32 |issue=9 |pages=752–760 |doi=10.1037/0003-066X.32.9.752 |pmid=921048 |s2cid=43326263 |issn=1935-990X}} After publication of their article there was pushback on the usefulness and validity of meta-analysis as a tool for evidence synthesis. The first example of this came from Hans Eysenck, who in a 1978 article responding to Smith and Glass's work called meta-analysis an "exercise in mega-silliness".{{Cite journal |last=Eysenck |first=H. J. |date=1978 |title=An exercise in mega-silliness. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/0003-066X.33.5.517.a |journal=American Psychologist |language=en |volume=33 |issue=5 |pages=517 |doi=10.1037/0003-066X.33.5.517.a |issn=1935-990X}}{{Cite journal |last1=Sharpe |first1=Donald |last2=Poets |first2=Sarena |date=2020 |title=Meta-analysis as a response to the replication crisis. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/cap0000215 |journal=Canadian Psychology / Psychologie Canadienne |language=en |volume=61 |issue=4 |pages=377–387 |doi=10.1037/cap0000215 |s2cid=225384392 |issn=1878-7304}} Later, Eysenck would refer to meta-analysis as "statistical alchemy".{{Cite journal |last=Eysenck |first=H.J. |date=1995 |title=Meta-analysis or best-evidence synthesis? |url=https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2753.1995.tb00005.x |journal=Journal of Evaluation in Clinical Practice |language=en |volume=1 |issue=1 |pages=29–36 |doi=10.1111/j.1365-2753.1995.tb00005.x |pmid=9238555 |issn=1356-1294}} Despite these criticisms, the use of meta-analysis has only grown since its modern introduction.
By 1991 there were 334 published meta-analyses; this number grew to 9,135 by 2014.{{Cite journal |last=Ioannidis |first=John P.A. |date=2016 |title=The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-analyses |journal=The Milbank Quarterly |language=en |volume=94 |issue=3 |pages=485–514 |doi=10.1111/1468-0009.12210 |issn=0887-378X |pmc=5020151 |pmid=27620683}}
The field of meta-analysis has expanded greatly since the 1970s and now touches multiple disciplines, including psychology, medicine, and ecology. Further, the more recent creation of evidence synthesis communities has increased the cross-pollination of ideas and methods, and the creation of software tools, across disciplines.{{cite journal |vauthors=Vandvik PO, Brandt L |date=July 2020 |title=Future of Evidence Ecosystem Series: Evidence ecosystems and learning health systems: why bother? |journal=Journal of Clinical Epidemiology |volume=123 |pages=166–170 |doi=10.1016/j.jclinepi.2020.02.008 |pmid=32145365 |s2cid=212629387}}{{cite journal |vauthors=Cartabellotta A, Tilson JK |date=June 2019 |title=The ecosystem of evidence cannot thrive without efficiency of knowledge generation, synthesis, and translation |journal=Journal of Clinical Epidemiology |volume=110 |pages=90–95 |doi=10.1016/j.jclinepi.2019.01.008 |pmid=30708174 |s2cid=73415319}}{{cite journal |display-authors=6 |vauthors=Haddaway NR, Bannach-Brown A, Grainger MJ, Hamilton WK, Hennessy EA, Keenan C, Pritchard CC, Stojanova J |date=June 2022 |title=The evidence synthesis and meta-analysis in R conference (ESMARConf): levelling the playing field of conference accessibility and equitability |journal=Systematic Reviews |volume=11 |issue=1 |pages=113 |doi=10.1186/s13643-022-01985-6 |pmc=9164457 |pmid=35659294 |doi-access=free}}
== Literature search ==
One of the most important steps of a meta-analysis is data collection. For an efficient database search, appropriate keywords and search limits need to be identified.{{Cite journal |last1=Grames |first1=Eliza M. |last2=Stillman |first2=Andrew N. |last3=Tingley |first3=Morgan W. |last4=Elphick |first4=Chris S. |date=2019 |editor-last=Freckleton |editor-first=Robert |title=An automated approach to identifying search terms for systematic reviews using keyword co-occurrence networks |journal=Methods in Ecology and Evolution |language=en |volume=10 |issue=10 |pages=1645–1654 |doi=10.1111/2041-210X.13268 |bibcode=2019MEcEv..10.1645G |issn=2041-210X|doi-access=free }} The use of Boolean operators and search limits can assist the literature search.{{Cite journal |last1=Sood |first1=Amit |last2=Erwin |first2=Patricia J. |last3=Ebbert |first3=Jon O. |date=2004 |title=Using Advanced Search Tools on PubMed for Citation Retrieval |journal=Mayo Clinic Proceedings |language=en |volume=79 |issue=10 |pages=1295–1300 |doi=10.4065/79.10.1295|pmid=15473412 |doi-access=free }}{{Cite journal |last1=Vincent |first1=Beatriz |last2=Vincent |first2=Maurice |last3=Ferreira |first3=Carlos Gil |date=2006-03-01 |title=Making PubMed Searching Simple: Learning to Retrieve Medical Literature Through Interactive Problem Solving |journal=The Oncologist |language=en |volume=11 |issue=3 |pages=243–251 |doi=10.1634/theoncologist.11-3-243 |pmid=16549808 |issn=1083-7159|doi-access=free }} A number of databases are available (e.g., PubMed, Embase, PsycINFO); however, it is up to the researcher to choose the most appropriate sources for their research area.{{Cite journal |last=Quintana |first=Daniel S.
|date=2015-10-08 |title=From pre-registration to publication: a non-technical primer for conducting a meta-analysis to synthesize correlational data |journal=Frontiers in Psychology |volume=6 |page=1549 |doi=10.3389/fpsyg.2015.01549 |issn=1664-1078 |pmc=4597034 |pmid=26500598 |doi-access=free}} Indeed, many scientists use duplicate search terms within two or more databases to cover multiple sources.{{Cite journal |last1=Bramer |first1=Wichor M. |last2=Giustini |first2=Dean |last3=de Jonge |first3=Gerdien B. |last4=Holland |first4=Leslie |last5=Bekhuis |first5=Tanja |date=2016 |title=De-duplication of database search results for systematic reviews in EndNote |journal=Journal of the Medical Library Association |volume=104 |issue=3 |pages=240–243 |doi=10.3163/1536-5050.104.3.014 |issn=1536-5050 |pmc=4915647 |pmid=27366130}} The reference lists of eligible studies can also be searched for additional relevant studies (i.e., snowballing).{{Cite journal |last1=Horsley |first1=Tanya |last2=Dingwall |first2=Orvie |last3=Sampson |first3=Margaret |date=2011-08-10 |editor-last=Cochrane Methodology Review Group |title=Checking reference lists to find additional studies for systematic reviews |journal=Cochrane Database of Systematic Reviews |language=en |volume=2011 |issue=8 |pages=MR000026 |doi=10.1002/14651858.MR000026.pub2 |pmc=7388740 |pmid=21833989}} The initial search may return a large volume of studies. Quite often, the abstract or the title of the manuscript reveals that the study is not eligible for inclusion, based on the pre-specified criteria. These studies can be discarded. However, if it appears that the study may be eligible (or even if there is some doubt), the full paper can be retained for closer inspection. The reference lists of eligible articles can also be searched for any relevant articles.{{Cite journal |last1=Bramer |first1=Wichor M. |last2=De Jonge |first2=Gerdien B. |last3=Rethlefsen |first3=Melissa L.
|last4=Mast |first4=Frans |last5=Kleijnen |first5=Jos |date=2018-10-04 |title=A systematic approach to searching: an efficient and complete method to develop literature searches |url=http://jmla.pitt.edu/ojs/jmla/article/view/283 |journal=Journal of the Medical Library Association |volume=106 |issue=4 |pages=531–541 |doi=10.5195/jmla.2018.283 |issn=1558-9439 |pmc=6148622 |pmid=30271302}} These search results need to be documented in a PRISMA flow diagram,{{Cite journal |last1=Moher |first1=David |last2=Tetzlaff |first2=Jennifer |last3=Tricco |first3=Andrea C |last4=Sampson |first4=Margaret |last5=Altman |first5=Douglas G |date=2007-03-27 |editor-last=Clarke |editor-first=Mike |title=Epidemiology and Reporting Characteristics of Systematic Reviews |journal=PLOS Medicine |language=en |volume=4 |issue=3 |pages=e78 |doi=10.1371/journal.pmed.0040078 |doi-access=free |issn=1549-1676 |pmc=1831728 |pmid=17388659}} which details the flow of information through all stages of the review. Thus, it is important to note how many studies were returned after using the specified search terms and how many of these studies were discarded, and for what reason. The search terms and strategy should be specific enough for a reader to reproduce the search.{{Cite journal |last1=Lakens |first1=Daniël |last2=Hilgard |first2=Joe |last3=Staaks |first3=Janneke |date=2016 |title=On the reproducibility of meta-analyses: six practical recommendations |journal=BMC Psychology |language=en |volume=4 |issue=1 |page=24 |doi=10.1186/s40359-016-0126-3 |doi-access=free |issn=2050-7283 |pmc=4886411 |pmid=27241618}} The date range of studies, along with the date (or date period) the search was conducted, should also be provided.{{Cite journal |last1=Nguyen |first1=Phi-Yen |last2=McKenzie |first2=Joanne E. |last3=Hamilton |first3=Daniel G. |last4=Moher |first4=David |last5=Tugwell |first5=Peter |last6=Fidler |first6=Fiona M. |last7=Haddaway |first7=Neal R. |last8=Higgins |first8=Julian P. T.
|last9=Kanukula |first9=Raju |last10=Karunananthan |first10=Sathya |last11=Maxwell |first11=Lara J. |last12=McDonald |first12=Steve |last13=Nakagawa |first13=Shinichi |last14=Nunan |first14=David |last15=Welch |first15=Vivian A. |date=2023 |title=Systematic reviewers' perspectives on sharing review data, analytic code, and other materials: A survey |journal=Cochrane Evidence Synthesis and Methods |language=en |volume=1 |issue=2 |doi=10.1002/cesm.12008 |issn=2832-9023|doi-access=free }}
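The Boolean search strategy described above can be sketched programmatically; in this minimal example the concept blocks, search terms, and the PubMed-style [Title/Abstract] field tag are purely illustrative, not a recommended strategy.

```python
# Sketch of a Boolean search strategy: synonyms within one concept are
# joined with OR, and the concept blocks are intersected with AND.
# The terms and the [Title/Abstract] field tag are illustrative only.
concept_blocks = [
    ['"meta-analysis"', '"systematic review"', '"evidence synthesis"'],
    ['"psychotherapy"', '"counseling"'],
]
query = " AND ".join(
    "(" + " OR ".join(f"{term}[Title/Abstract]" for term in block) + ")"
    for block in concept_blocks
)
print(query)
```

Structuring the query this way makes it easy to report the exact search string, which supports the reproducibility requirement noted above.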
A data collection form provides a standardized means of collecting data from eligible studies.{{Cite journal |last1=Page |first1=Matthew J. |last2=Altman |first2=Douglas G. |last3=Shamseer |first3=Larissa |last4=McKenzie |first4=Joanne E. |last5=Ahmadzai |first5=Nadera |last6=Wolfe |first6=Dianna |last7=Yazdi |first7=Fatemeh |last8=Catalá-López |first8=Ferrán |last9=Tricco |first9=Andrea C. |last10=Moher |first10=David |date=2018 |title=Reproducible research practices are underused in systematic reviews of biomedical interventions |url=https://linkinghub.elsevier.com/retrieve/pii/S0895435617305358 |journal=Journal of Clinical Epidemiology |language=en |volume=94 |pages=8–18 |doi=10.1016/j.jclinepi.2017.10.017|pmid=29113936 }} For a meta-analysis of correlational data, effect size information is usually collected as Pearson's r statistic.{{Cite journal |last=Aloe |first=Ariel M. |date=2014 |title=An Empirical Investigation of Partial Effect Sizes in Meta-Analysis of Correlational Data |url=http://www.tandfonline.com/doi/abs/10.1080/00221309.2013.853021 |journal=The Journal of General Psychology |language=en |volume=141 |issue=1 |pages=47–64 |doi=10.1080/00221309.2013.853021 |pmid=24838020 |issn=0022-1309}}{{Cite journal |last1=Tracz |first1=Susan M. |last2=Elmore |first2=Patricia B. |last3=Pohlmann |first3=John T. 
|date=1992 |title=Correlational Meta-Analysis: Independent and Nonindependent Cases |url=https://journals.sagepub.com/doi/10.1177/0013164492052004007 |journal=Educational and Psychological Measurement |language=en |volume=52 |issue=4 |pages=879–888 |doi=10.1177/0013164492052004007 |issn=0013-1644}} Partial correlations are often reported in research; however, these may inflate relationships in comparison to zero-order correlations.{{Cite journal |last=Cramer |first=Duncan |date=2003 |title=A Cautionary Tale of Two Statistics: Partial Correlation and Standardized Partial Regression |url=http://www.tandfonline.com/doi/abs/10.1080/00223980309600632 |journal=The Journal of Psychology |language=en |volume=137 |issue=5 |pages=507–511 |doi=10.1080/00223980309600632 |pmid=14629080 |s2cid=37557674 |issn=0022-3980}} Moreover, the partialed-out variables will likely vary from study to study. As a consequence, many meta-analyses exclude partial correlations from their analysis. As a final resort, plot digitizers can be used to scrape data points from scatterplots (if available) for the calculation of Pearson's r.{{Cite journal |last1=Gross |first1=Arnd |last2=Schirm |first2=Sibylle |last3=Scholz |first3=Markus |date=2014 |title=Ycasd– a tool for capturing and scaling data from graphical representations |journal=BMC Bioinformatics |language=en |volume=15 |issue=1 |page=219 |doi=10.1186/1471-2105-15-219 |doi-access=free |pmid=24965054 |pmc=4085079 |issn=1471-2105}}{{Citation |last1=Cliche |first1=Mathieu |title=Scatteract: Automated Extraction of Data from Scatter Plots |date=2017 |url=https://link.springer.com/10.1007/978-3-319-71249-9_9 |work=Machine Learning and Knowledge Discovery in Databases |volume=10534 |pages=135–150 |editor-last=Ceci |editor-first=Michelangelo |access-date=2023-12-26 |place=Cham |publisher=Springer International Publishing |language=en |doi=10.1007/978-3-319-71249-9_9 |isbn=978-3-319-71248-2 |last2=Rosenberg |first2=David |last3=Madeka |first3=Dhruv
|last4=Yee |first4=Connie |arxiv=1704.06687 |s2cid=9543956 |editor2-last=Hollmén |editor2-first=Jaakko |editor3-last=Todorovski |editor3-first=Ljupčo |editor4-last=Vens |editor4-first=Celine}} Data on important study characteristics that may moderate effects, such as the mean age of participants, should also be collected.{{Cite journal |last1=Moreau |first1=David |last2=Gamble |first2=Beau |date=2022 |title=Conducting a meta-analysis in the age of open science: Tools, tips, and practical recommendations. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/met0000351 |journal=Psychological Methods |language=en |volume=27 |issue=3 |pages=426–432 |doi=10.1037/met0000351 |pmid=32914999 |s2cid=221619510 |issn=1939-1463}} A measure of study quality can also be included in these forms to assess the quality of evidence from each study.{{Cite journal |last1=McGuinness |first1=Luke A. |last2=Higgins |first2=Julian P. T. |date=2021 |title=Risk-of-bias VISualization (robvis): An R package and Shiny web app for visualizing risk-of-bias assessments |journal=Research Synthesis Methods |language=en |volume=12 |issue=1 |pages=55–61 |doi=10.1002/jrsm.1411 |pmid=32336025 |issn=1759-2879|doi-access=free |hdl=1983/e59b578e-1534-43d9-a438-8bc27b363a9a |hdl-access=free }} There are more than 80 tools available to assess the quality and risk of bias in observational studies, reflecting the diversity of research approaches between fields.{{Cite journal |last1=Sanderson |first1=S. |last2=Tatt |first2=I. D |last3=Higgins |first3=J. P. |date=2007-06-01 |title=Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography |journal=International Journal of Epidemiology |language=en |volume=36 |issue=3 |pages=666–676 |doi=10.1093/ije/dym018 |issn=0300-5771|doi-access=free |pmid=17470488 }}{{Cite journal |last1=Haddaway |first1=Neal R.
|last2=Macura |first2=Biljana |last3=Whaley |first3=Paul |last4=Pullin |first4=Andrew S. |date=2018 |title=ROSES RepOrting standards for Systematic Evidence Syntheses: pro forma, flow-diagram and descriptive summary of the plan and conduct of environmental systematic reviews and systematic maps |journal=Environmental Evidence |language=en |volume=7 |issue=1 |doi=10.1186/s13750-018-0121-7 |doi-access=free |bibcode=2018EnvEv...7....7H |issn=2047-2382}} These tools usually include an assessment of how dependent variables were measured, appropriate selection of participants, and appropriate control for confounding factors. Other quality measures that may be more relevant for correlational studies include sample size, psychometric properties, and reporting of methods.
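As a sketch of how collected Pearson's r values are commonly prepared for pooling (the correlations and sample sizes below are hypothetical), each r can be converted to Fisher's z, whose sampling variance depends only on the study's sample size:

```python
from math import log, tanh

def fisher_z(r, n):
    """Convert a Pearson correlation to Fisher's z; its sampling
    variance is approximately 1 / (n - 3)."""
    return 0.5 * log((1 + r) / (1 - r)), 1.0 / (n - 3)

# Hypothetical study correlations and sample sizes
studies = [(0.30, 50), (0.45, 120), (0.20, 80)]
zs = [fisher_z(r, n) for r, n in studies]

# Inverse-variance weighted mean on the z scale, then back-transform
w = [1 / v for _, v in zs]
z_pooled = sum(wi * zi for (zi, _), wi in zip(zs, w)) / sum(w)
r_pooled = tanh(z_pooled)  # inverse of the Fisher transformation
```

Analyses are typically run on the z scale (where the sampling distribution is approximately normal) and only the final estimate is back-transformed to the correlation scale for reporting.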
A final consideration is whether to include studies from the gray literature,{{Cite journal |last=Paez |first=Arsenio |date=2017 |title=Gray literature: An important resource in systematic reviews |url=https://onlinelibrary.wiley.com/doi/10.1111/jebm.12266 |journal=Journal of Evidence-Based Medicine |language=en |volume=10 |issue=3 |pages=233–240 |doi=10.1111/jebm.12266 |pmid=28857505 |issn=1756-5383}} which is defined as research that has not been formally published.{{Cite journal |last1=Conn |first1=Vicki S. |last2=Valentine |first2=Jeffrey C. |last3=Cooper |first3=Harris M. |last4=Rantz |first4=Marilyn J. |date=2003 |title=Grey Literature in Meta-Analyses |url=http://journals.lww.com/00006199-200307000-00008 |journal=Nursing Research |language=en |volume=52 |issue=4 |pages=256–261 |doi=10.1097/00006199-200307000-00008 |pmid=12867783 |s2cid=27109643 |issn=0029-6562}} This type of literature includes conference abstracts,{{Cite journal |last1=Scherer |first1=Roberta W. |last2=Saldanha |first2=Ian J. |date=2019 |title=How should systematic reviewers handle conference abstracts? A view from the trenches |journal=Systematic Reviews |language=en |volume=8 |issue=1 |page=264 |doi=10.1186/s13643-019-1188-0 |doi-access=free |issn=2046-4053 |pmc=6836535 |pmid=31699124}} dissertations,{{Cite journal |last1=Hartling |first1=Lisa |last2=Featherstone |first2=Robin |last3=Nuspl |first3=Megan |last4=Shave |first4=Kassi |last5=Dryden |first5=Donna M. |last6=Vandermeer |first6=Ben |date=2017 |title=Grey literature in systematic reviews: a cross-sectional study of the contribution of non-English reports, unpublished studies and dissertations to the results of meta-analyses in child-relevant reviews |journal=BMC Medical Research Methodology |language=en |volume=17 |issue=1 |page=64 |doi=10.1186/s12874-017-0347-z |doi-access=free |issn=1471-2288 |pmc=5395863 |pmid=28420349}} and pre-prints.{{Cite journal |last1=Haddaway |first1=N.R. |last2=Woodcock |first2=P. 
|last3=Macura |first3=B. |last4=Collins |first4=A. |date=2015 |title=Making literature reviews more reliable through application of lessons from systematic reviews |url=https://www.jstor.org/stable/24761072 |journal=Conservation Biology |volume=29 |issue=6 |pages=1596–1605 |doi=10.1111/cobi.12541 |jstor=24761072 |pmid=26032263 |bibcode=2015ConBi..29.1596H |s2cid=20624428 |issn=0888-8892}} While the inclusion of gray literature reduces the risk of publication bias, the methodological quality of the work is often (but not always) lower than formally published work.{{Cite journal |last1=Egger |first1=M |last2=Jüni |first2=P |last3=Bartlett |first3=C |last4=Holenstein |first4=F |last5=Sterne |first5=J |date=2003 |title=How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study |journal=Health Technology Assessment |volume=7 |issue=1 |pages=1–82 |doi=10.3310/hta7010 |issn=1366-5278|doi-access=free |pmid=12583822 }}{{Citation |last1=Lefebvre |first1=Carol |title=Searching for and selecting studies |date=2019-09-23 |url=https://onlinelibrary.wiley.com/doi/10.1002/9781119536604.ch4 |work=Cochrane Handbook for Systematic Reviews of Interventions |pages=67–107 |editor-last=Higgins |editor-first=Julian P.T. 
|access-date=2023-12-26 |edition=1 |publisher=Wiley |language=en |doi=10.1002/9781119536604.ch4 |isbn=978-1-119-53662-8 |last2=Glanville |first2=Julie |last3=Briscoe |first3=Simon |last4=Littlewood |first4=Anne |last5=Marshall |first5=Chris |last6=Metzendorf |first6=Maria-Inti |last7=Noel-Storr |first7=Anna |last8=Rader |first8=Tamara |last9=Shokraneh |first9=Farhad |s2cid=204603849 |editor2-last=Thomas |editor2-first=James |editor3-last=Chandler |editor3-first=Jacqueline |editor4-last=Cumpston |editor4-first=Miranda}} Reports from conference proceedings, which are the most common source of gray literature,{{Cite journal |last1=McAuley |first1=Laura |last2=Pham |first2=Ba' |last3=Tugwell |first3=Peter |last4=Moher |first4=David |date=2000 |title=Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses? |url=https://linkinghub.elsevier.com/retrieve/pii/S0140673600027860 |journal=The Lancet |language=en |volume=356 |issue=9237 |pages=1228–1231 |doi=10.1016/S0140-6736(00)02786-0|pmid=11072941 |s2cid=33777183 }} are poorly reported{{Cite journal |last1=Hopewell |first1=Sally |last2=Clarke |first2=Mike |date=2005 |title=Abstracts presented at the American Society of Clinical Oncology conference: how completely are trials reported? |url=http://dx.doi.org/10.1191/1740774505cn091oa |journal=Clinical Trials |volume=2 |issue=3 |pages=265–268 |doi=10.1191/1740774505cn091oa |pmid=16279150 |s2cid=3601317 |issn=1740-7745}} and data in the subsequent publication is often inconsistent, with differences observed in almost 20% of published studies.{{Cite journal |last1=Bhandari |first1=Mohit |last2=Devereaux |first2=P. J. |last3=Guyatt |first3=Gordon H. |last4=Cook |first4=Deborah J. |last5=Swiontkowski |first5=Marc F. |last6=Sprague |first6=Sheila |last7=Schemitsch |first7=Emil H. 
|date=2002 |title=An Observational Study of Orthopaedic Abstracts and Subsequent Full-Text Publications |url=http://dx.doi.org/10.2106/00004623-200204000-00017 |journal=The Journal of Bone and Joint Surgery-American Volume |volume=84 |issue=4 |pages=615–621 |doi=10.2106/00004623-200204000-00017 |pmid=11940624 |s2cid=8807106 |issn=0021-9355}}
== Methods and assumptions ==
=== Approaches ===
In general, two types of evidence can be distinguished when performing a meta-analysis: individual participant data (IPD), and aggregate data (AD).{{Cite journal |last1=Tierney |first1=Jayne F. |last2=Fisher |first2=David J. |last3=Burdett |first3=Sarah |last4=Stewart |first4=Lesley A. |last5=Parmar |first5=Mahesh K. B. |date=2020-01-31 |editor-last=Shapiro |editor-first=Steven D. |title=Comparison of aggregate and individual participant data approaches to meta-analysis of randomised trials: An observational study |journal=PLOS Medicine |language=en |volume=17 |issue=1 |pages=e1003019 |doi=10.1371/journal.pmed.1003019 |doi-access=free |issn=1549-1676 |pmc=6993967 |pmid=32004320}} The aggregate data can be direct or indirect.
AD is more commonly available (e.g. from the literature) and typically represents summary estimates such as odds ratios{{Cite journal |last1=Chang |first1=Bei-Hung |last2=Hoaglin |first2=David C. |date=2017 |title=Meta-Analysis of Odds Ratios: Current Good Practices |journal=Medical Care |language=en |volume=55 |issue=4 |pages=328–335 |doi=10.1097/MLR.0000000000000696 |issn=0025-7079 |pmc=5352535 |pmid=28169977}} or relative risks.{{Cite journal |last1=Bakbergenuly |first1=Ilyas |last2=Hoaglin |first2=David C. |last3=Kulinskaya |first3=Elena |date=2019 |title=Pitfalls of using the risk ratio in meta-analysis |journal=Research Synthesis Methods |language=en |volume=10 |issue=3 |pages=398–419 |doi=10.1002/jrsm.1347 |issn=1759-2879 |pmc=6767076 |pmid=30854785}} This can be directly synthesized across conceptually similar studies using several approaches. On the other hand, indirect aggregate data measures the effect of two treatments that were each compared against a similar control group in a meta-analysis. For example, if treatment A and treatment B were directly compared vs placebo in separate meta-analyses, we can use these two pooled results to get an estimate of the effects of A vs B in an indirect comparison as effect A vs Placebo minus effect B vs Placebo.
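The indirect comparison just described can be sketched as follows; the function name and the pooled log odds ratios and standard errors are hypothetical. The A-vs-B estimate is the difference of the two placebo-controlled effects, and the standard errors combine in quadrature because the two pooled estimates are independent:

```python
from math import sqrt

def indirect_comparison(effect_a, se_a, effect_b, se_b):
    """Indirect comparison of A vs B via a common placebo comparator:
    subtract the pooled effects and add their variances."""
    effect_ab = effect_a - effect_b       # effect A vs P minus effect B vs P
    se_ab = sqrt(se_a**2 + se_b**2)       # independent estimates: variances add
    return effect_ab, se_ab

# Hypothetical pooled log odds ratios (and standard errors) vs placebo
est, se = indirect_comparison(-0.50, 0.10, -0.30, 0.12)
```

Note that an indirect estimate is generally less precise than either direct comparison, since the uncertainty of both inputs carries through to the difference.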
IPD evidence represents raw data as collected by the study centers. This distinction has raised the need for different meta-analytic methods when evidence synthesis is desired, and has led to the development of one-stage and two-stage methods.{{cite journal | vauthors = Debray TP, Moons KG, van Valkenhoef G, Efthimiou O, Hummel N, Groenwold RH, Reitsma JB | title = Get real in individual participant data (IPD) meta-analysis: a review of the methodology | journal = Research Synthesis Methods | volume = 6 | issue = 4 | pages = 293–309 | date = December 2015 | pmid = 26287812 | pmc = 5042043 | doi = 10.1002/jrsm.1160 }} In one-stage methods the IPD from all studies are modeled simultaneously whilst accounting for the clustering of participants within studies. Two-stage methods first compute summary statistics for AD from each study and then calculate overall statistics as a weighted average of the study statistics. By reducing IPD to AD, two-stage methods can also be applied when IPD is available; this makes them an appealing choice when performing a meta-analysis. Although it is conventionally believed that one-stage and two-stage methods yield similar results, recent studies have shown that they may occasionally lead to different conclusions.{{cite journal | vauthors = Debray TP, Moons KG, Abo-Zaid GM, Koffijberg H, Riley RD | title = Individual participant data meta-analysis for a binary outcome: one-stage or two-stage? | journal = PLOS ONE | volume = 8 | issue = 4 | pages = e60650 | year = 2013 | pmid = 23585842 | pmc = 3621872 | doi = 10.1371/journal.pone.0060650 | doi-access = free | bibcode = 2013PLoSO...860650D }}{{cite journal | vauthors = Burke DL, Ensor J, Riley RD | title = Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ | journal = Statistics in Medicine | volume = 36 | issue = 5 | pages = 855–875 | date = February 2017 | pmid = 27747915 | pmc = 5297998 | doi = 10.1002/sim.7141 }}
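A minimal sketch of the two-stage approach, using hypothetical 2×2 trial tables: stage one reduces each study's data to a summary statistic (here a log odds ratio and its large-sample variance), and stage two pools these summaries as a weighted average:

```python
from math import log

def study_log_or(a, b, c, d):
    """Stage 1: reduce one study's 2x2 table (a, b = events/non-events
    on treatment; c, d = events/non-events on control) to a log odds
    ratio and its large-sample variance."""
    return log((a * d) / (b * c)), 1/a + 1/b + 1/c + 1/d

def pool_fixed(effects_and_vars):
    """Stage 2: inverse-variance weighted average of the study summaries."""
    weights = [1 / v for _, v in effects_and_vars]
    pooled = sum(w * e for (e, _), w in zip(effects_and_vars, weights)) / sum(weights)
    return pooled, 1 / sum(weights)

# Hypothetical 2x2 tables from three trials
tables = [(12, 88, 20, 80), (8, 92, 15, 85), (30, 170, 44, 156)]
stage1 = [study_log_or(*t) for t in tables]
log_or, var = pool_fixed(stage1)
```

Because stage two only needs the per-study summaries, the same pooling code applies whether those summaries were computed from IPD (as here) or taken directly from published aggregate data.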
=== Statistical models for aggregate data ===
==== Fixed effect model ====
The fixed effect model provides a weighted average of a series of study estimates.{{Cite journal |last1=Nikolakopoulou |first1=Adriani |last2=Mavridis |first2=Dimitris |last3=Salanti |first3=Georgia |date=2014 |title=How to interpret meta-analysis models: fixed effect and random effects meta-analyses |url=https://ebmh.bmj.com/lookup/doi/10.1136/eb-2014-101794 |journal=Evidence Based Mental Health |language=en |volume=17 |issue=2 |pages=64 |doi=10.1136/eb-2014-101794 |pmid=24778439 |issn=1362-0347}} The inverse of the estimates' variance is commonly used as study weight, so that larger studies tend to contribute more than smaller studies to the weighted average.{{Cite journal |last=Dekkers |first=Olaf M. |date=2018 |title=Meta-analysis: Key features, potentials and misunderstandings |journal=Research and Practice in Thrombosis and Haemostasis |language=en |volume=2 |issue=4 |pages=658–663 |doi=10.1002/rth2.12153 |pmc=6178740 |pmid=30349883}} Consequently, when studies within a meta-analysis are dominated by a very large study, the findings from smaller studies are practically ignored.{{cite journal | vauthors = Helfenstein U | title = Data and models determine treatment proposals--an illustration from meta-analysis | journal = Postgraduate Medical Journal | volume = 78 | issue = 917 | pages = 131–134 | date = March 2002 | pmid = 11884693 | pmc = 1742301 | doi = 10.1136/pmj.78.917.131 }} Most importantly, the fixed effects model assumes that all included studies investigate the same population, use the same variable and outcome definitions, etc.{{Cite journal |last1=Dettori |first1=Joseph R. |last2=Norvell |first2=Daniel C. |last3=Chapman |first3=Jens R. 
|date=2022 |title=Fixed-Effect vs Random-Effects Models for Meta-Analysis: 3 Points to Consider |journal=Global Spine Journal |language=en |volume=12 |issue=7 |pages=1624–1626 |doi=10.1177/21925682221110527 |issn=2192-5682 |pmc=9393987 |pmid=35723546}} This assumption is typically unrealistic as research is often prone to several sources of heterogeneity.{{Cite journal |last1=Hedges |first1=Larry V. |last2=Vevea |first2=Jack L. |date=1998 |title=Fixed- and random-effects models in meta-analysis. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/1082-989X.3.4.486 |journal=Psychological Methods |language=en |volume=3 |issue=4 |pages=486–504 |doi=10.1037/1082-989X.3.4.486 |s2cid=119814256 |issn=1939-1463}}{{Cite journal |last1=Rice |first1=Kenneth |last2=Higgins |first2=Julian P. T. |last3=Lumley |first3=Thomas |date=2018 |title=A re-evaluation of fixed effect(s) meta-analysis |url=https://www.jstor.org/stable/44682165 |journal=Journal of the Royal Statistical Society. Series A (Statistics in Society) |volume=181 |issue=1 |pages=205–227 |doi=10.1111/rssa.12275 |jstor=44682165 |issn=0964-1998}}
If we start with a collection of independent effect size estimates <math>y_i</math>, each estimating a corresponding effect size <math>\theta_i</math>, we can assume that <math>y_i = \theta_i + e_i</math>, where <math>y_i</math> denotes the observed effect in the <math>i</math>-th study, <math>\theta_i</math> the corresponding (unknown) true effect, <math>e_i</math> is the sampling error, and <math>e_i \sim N(0, v_i)</math>. Therefore, the <math>y_i</math>'s are assumed to be unbiased and normally distributed estimates of their corresponding true effects. The sampling variances (i.e., <math>v_i</math> values) are assumed to be known.
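The inverse-variance weighted average described above can be sketched in a few lines. This is a minimal illustration only, not production meta-analysis code; dedicated packages such as metafor implement the same calculation with full diagnostics:

```python
def fixed_effect_meta(effects, variances):
    """Inverse-variance weighted (fixed effect) pooled estimate.

    effects:   list of observed effect sizes y_i
    variances: list of known sampling variances v_i
    Returns (pooled estimate, variance of the pooled estimate).
    """
    weights = [1.0 / v for v in variances]  # w_i = 1 / v_i
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    pooled_var = 1.0 / sum(weights)  # variance of the weighted average
    return pooled, pooled_var

# One large study (small variance) dominates two smaller ones:
est, var = fixed_effect_meta([0.10, 0.50, 0.60], [0.01, 0.25, 0.25])
```

Note how the pooled estimate lands close to the large study's effect of 0.10, illustrating the dominance of large studies under this model.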
== Random effects model ==
Most meta-analyses are based on sets of studies that are not exactly identical in their methods and/or the characteristics of the included samples. Differences in the methods and sample characteristics may introduce variability (“heterogeneity”) among the true effects.{{Cite journal |last1=Holzmeister |first1=Felix |last2=Johannesson |first2=Magnus |last3=Böhm |first3=Robert |last4=Dreber |first4=Anna |last5=Huber |first5=Jürgen |last6=Kirchler |first6=Michael |date=2024-08-06 |title=Heterogeneity in effect size estimates |journal=Proceedings of the National Academy of Sciences |language=en |volume=121 |issue=32 |pages=e2403490121 |doi=10.1073/pnas.2403490121 |issn=0027-8424 |pmc=11317577 |pmid=39078672|bibcode=2024PNAS..12103490H }} One way to model the heterogeneity is to treat it as purely random. The weights applied in this process of weighted averaging with a random effects meta-analysis are derived in two steps:{{cite journal | vauthors = Senn S | title = Trying to be precise about vagueness | journal = Statistics in Medicine | volume = 26 | issue = 7 | pages = 1417–1430 | date = March 2007 | pmid = 16906552 | doi = 10.1002/sim.2639 | s2cid = 17764847 | doi-access = free }}
* Step 1: Inverse variance weighting
* Step 2: Un-weighting of this inverse variance weighting by applying a random effects variance component (REVC) that is simply derived from the extent of variability of the effect sizes of the underlying studies.
This means that the greater this variability in effect sizes (otherwise known as heterogeneity), the greater the un-weighting, and this can reach a point where the random effects meta-analysis result becomes simply the un-weighted average effect size across the studies. At the other extreme, when all effect sizes are similar (or variability does not exceed sampling error), no REVC is applied and the random effects meta-analysis defaults to simply a fixed effect meta-analysis (only inverse variance weighting).
The extent of this reversal is solely dependent on two factors:{{cite journal | vauthors = Al Khalaf MM, Thalib L, Doi SA | title = Combining heterogenous studies using the random-effects model is a mistake and leads to inconclusive meta-analyses | journal = Journal of Clinical Epidemiology | volume = 64 | issue = 2 | pages = 119–123 | date = February 2011 | pmid = 20409685 | doi = 10.1016/j.jclinepi.2010.01.009 }}
* Heterogeneity of precision
* Heterogeneity of effect size
Since neither of these factors automatically indicates a faulty larger study or more reliable smaller studies, the re-distribution of weights under this model will not bear a relationship to what these studies actually might offer. Indeed, it has been demonstrated that redistribution of weights is simply in one direction from larger to smaller studies as heterogeneity increases until eventually all studies have equal weight and no more redistribution is possible.
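The two-step weighting described above can be sketched with the widely used DerSimonian–Laird estimator of the between-study variance (one of several estimators; a minimal illustration, not production code):

```python
def dersimonian_laird(effects, variances):
    """Random effects pooling: inverse variance weights (step 1), then
    un-weighting by adding the between-study variance tau^2 (the REVC,
    step 2) to each study's sampling variance."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fe = sum(wi * y for wi, y in zip(w, effects)) / sum(w)  # fixed effect
    q = sum(wi * (y - fe) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # DL estimate, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # un-weighted weights
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2

# Homogeneous studies: tau^2 = 0, so the model defaults to fixed effect
pooled_h, tau2_h = dersimonian_laird([0.3, 0.3, 0.3], [0.1, 0.2, 0.3])
# Heterogeneous studies: tau^2 > 0 pulls the result away from the
# fixed effect estimate toward the un-weighted average of the effects
pooled_het, tau2_het = dersimonian_laird([0.0, 1.0], [0.01, 0.25])
```

In the heterogeneous example the large study's dominance is weakened: the random effects estimate sits between the fixed effect estimate and the plain mean of the two effects.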
Another issue with the random effects model is that the most commonly used confidence intervals generally do not retain their coverage probability above the specified nominal level and thus substantially underestimate the statistical error and are potentially
overconfident in their conclusions.{{cite journal | vauthors = Brockwell SE, Gordon IR | title = A comparison of statistical methods for meta-analysis | journal = Statistics in Medicine | volume = 20 | issue = 6 | pages = 825–840 | date = March 2001 | pmid = 11252006 | doi = 10.1002/sim.650 | s2cid = 16932514 }}{{cite journal | vauthors = Noma H | title = Confidence intervals for a random-effects meta-analysis based on Bartlett-type corrections | journal = Statistics in Medicine | volume = 30 | issue = 28 | pages = 3304–3312 | date = December 2011 | pmid = 21964669 | doi = 10.1002/sim.4350 | hdl-access = free | hdl = 2433/152046 | s2cid = 6556986 }} Several fixes have been suggested{{cite journal | vauthors = Brockwell SE, Gordon IR | title = A simple method for inference on an overall effect in meta-analysis | journal = Statistics in Medicine | volume = 26 | issue = 25 | pages = 4531–4543 | date = November 2007 | pmid = 17397112 | doi = 10.1002/sim.2883 | s2cid = 887098 }}{{cite journal | vauthors = Sidik K, Jonkman JN | title = A simple confidence interval for meta-analysis | journal = Statistics in Medicine | volume = 21 | issue = 21 | pages = 3153–3159 | date = November 2002 | pmid = 12375296 | doi = 10.1002/sim.1262 | s2cid = 21384942 }} but the debate continues on.{{cite journal | vauthors = Jackson D, Bowden J | title = A re-evaluation of the 'quantile approximation method' for random effects meta-analysis | journal = Statistics in Medicine | volume = 28 | issue = 2 | pages = 338–348 | date = January 2009 | pmid = 19016302 | pmc = 2991773 | doi = 10.1002/sim.3487 }} A further concern is that the average treatment effect can sometimes be even less conservative compared to the fixed effect model{{cite journal | vauthors = Poole C, Greenland S | title = Random-effects meta-analyses are not always conservative | journal = American Journal of Epidemiology | volume = 150 | issue = 5 | pages = 469–475 | date = September 1999 | pmid = 10472946 | doi = 
10.1093/oxfordjournals.aje.a010035 | doi-access = free }} and therefore misleading in practice. One interpretational fix that has been suggested is to create a prediction interval around the random effects estimate to portray the range of possible effects in practice.{{cite journal | vauthors = Riley RD, Higgins JP, Deeks JJ | title = Interpretation of random effects meta-analyses | journal = BMJ | volume = 342 | pages = d549 | date = February 2011 | pmid = 21310794 | doi = 10.1136/bmj.d549 | s2cid = 32994689 }} However, an assumption behind the calculation of such a prediction interval is that trials are considered more or less homogeneous entities and that included patient populations and comparator treatments should be considered exchangeable{{cite journal | vauthors = Kriston L | title = Dealing with clinical heterogeneity in meta-analysis. Assumptions, methods, interpretation | journal = International Journal of Methods in Psychiatric Research | volume = 22 | issue = 1 | pages = 1–15 | date = March 2013 | pmid = 23494781 | pmc = 6878481 | doi = 10.1002/mpr.1377 }} and this is usually unattainable in practice.
There are many methods used to estimate the between-study variance, with the restricted maximum likelihood estimator being the least prone to bias and one of the most commonly used.{{Cite journal |last1=Langan |first1=Dean |last2=Higgins |first2=Julian P.T. |last3=Jackson |first3=Dan |last4=Bowden |first4=Jack |last5=Veroniki |first5=Areti Angeliki |last6=Kontopantelis |first6=Evangelos |last7=Viechtbauer |first7=Wolfgang |last8=Simmonds |first8=Mark |date=2019 |title=A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses |journal=Research Synthesis Methods |language=en |volume=10 |issue=1 |pages=83–98 |doi=10.1002/jrsm.1316 |pmid=30067315 |s2cid=51890354 |issn=1759-2879|doi-access=free |hdl=1983/c911791c-c687-4f12-bc0b-ffdbe42ca874 |hdl-access=free }} Several advanced iterative techniques for computing the between-study variance exist, including both maximum likelihood and restricted maximum likelihood methods, and random effects models using these methods can be run in multiple software platforms including Excel,{{cite web |title=MetaXL User Guide |url=http://www.epigear.com/index_files/MetaXL%20User%20Guide.pdf |access-date=2018-09-18}} Stata,{{cite journal|url=https://www.researchgate.net/publication/227629391|title=Metaan: Random-effects meta-analysis| vauthors = Kontopantelis E, Reeves D |date=1 August 2010|journal=Stata Journal|volume=10|issue=3|pages=395–407|via=ResearchGate |doi= 10.1177/1536867X1001000307 |doi-access=free}} SPSS,{{Cite journal |last1=Field |first1=Andy P. 
|last2=Gillett |first2=Raphael |date=2010 |title=How to do a meta-analysis |url=http://doi.wiley.com/10.1348/000711010X502733 |journal=British Journal of Mathematical and Statistical Psychology |language=en |volume=63 |issue=3 |pages=665–694 |doi=10.1348/000711010X502733|pmid=20497626 |s2cid=22688261 }} and R.{{Cite journal |last=Viechtbauer |first=Wolfgang |date=2010 |title=Conducting Meta-Analyses in R with the metafor Package |url=http://www.jstatsoft.org/v36/i03/ |journal=Journal of Statistical Software |language=en |volume=36 |issue=3 |doi=10.18637/jss.v036.i03 |s2cid=15798713 |issn=1548-7660|doi-access=free }}
Most meta-analyses include between 2 and 4 studies and such a sample is more often than not inadequate to accurately estimate heterogeneity. Thus it appears that in small meta-analyses, an incorrect zero between-study variance estimate is obtained, leading to a false homogeneity assumption. Overall, it appears that heterogeneity is being consistently underestimated in meta-analyses and sensitivity analyses in which high heterogeneity levels are assumed could be informative.{{cite journal | vauthors = Kontopantelis E, Springate DA, Reeves D | title = A re-analysis of the Cochrane Library data: the dangers of unobserved heterogeneity in meta-analyses | journal = PLOS ONE | volume = 8 | issue = 7 | pages = e69930 | year = 2013 | pmid = 23922860 | pmc = 3724681 | doi = 10.1371/journal.pone.0069930 | veditors = Friede T | doi-access = free | bibcode = 2013PLoSO...869930K }} These random effects models and software packages mentioned above relate to study-aggregate meta-analyses and researchers wishing to conduct individual patient data (IPD) meta-analyses need to consider mixed-effects modelling approaches.{{cite journal |url= https://www.researchgate.net/publication/257316967 |title=A short guide and a forest plot command (ipdforest) for one-stage meta-analysis| vauthors = Kontopantelis E, Reeves D |date=27 September 2013|journal=Stata Journal|volume=13|issue=3|pages=574–587 |via= ResearchGate |doi=10.1177/1536867X1301300308 |doi-access=free}}
== Quality effects model ==
Doi and Thalib originally introduced the quality effects model.{{cite journal | vauthors = Doi SA, Thalib L | title = A quality-effects model for meta-analysis | journal = Epidemiology | volume = 19 | issue = 1 | pages = 94–100 | date = January 2008 | pmid = 18090860 | doi = 10.1097/EDE.0b013e31815c24e7 | s2cid = 29723291 | doi-access = free }} They{{cite journal | vauthors = Doi SA, Barendregt JJ, Mozurkewich EL | title = Meta-analysis of heterogeneous clinical trials: an empirical example | journal = Contemporary Clinical Trials | volume = 32 | issue = 2 | pages = 288–298 | date = March 2011 | pmid = 21147265 | doi = 10.1016/j.cct.2010.12.006 }} introduced a new approach to adjustment for inter-study variability by incorporating the contribution of variance due to a relevant component (quality) in addition to the contribution of variance due to random error that is used in any fixed effects meta-analysis model to generate weights for each study. The strength of the quality effects meta-analysis is that it allows available methodological evidence to be used over subjective random effects, and thereby helps to close the damaging gap which has opened up between methodology and statistics in clinical research. To do this a synthetic bias variance is computed based on quality information to adjust inverse variance weights and the quality adjusted weight of the ith study is introduced. These adjusted weights are then used in meta-analysis. In other words, if study i is of good quality and other studies are of poor quality, a proportion of their quality adjusted weights is mathematically redistributed to study i giving it more weight towards the overall effect size. As studies become increasingly similar in terms of quality, re-distribution becomes progressively less and ceases when all studies are of equal quality (in the case of equal quality, the quality effects model defaults to the IVhet model – see previous section). 
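The weight-redistribution principle can be illustrated with a toy sketch. This is a simplified illustration of the general idea only, not the exact algebra of the published quality effects model: each study's inverse-variance weight is scaled by a hypothetical quality score in [0, 1], and the released weight is shared among the other studies.

```python
def quality_effects_sketch(effects, variances, quality):
    """Toy illustration of quality-based weight redistribution.
    quality[i] in [0, 1]; a fraction (1 - quality[i]) of study i's
    inverse-variance weight is released and shared evenly among the
    other studies, so total weight is conserved. NOT the exact
    estimator published by Doi and Thalib."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    released = [(1.0 - q) * wi for q, wi in zip(quality, w)]
    adjusted = []
    for i in range(k):
        # study i receives an equal share of the weight released by others
        share = sum(released[j] for j in range(k) if j != i) / (k - 1)
        adjusted.append(quality[i] * w[i] + share)
    pooled = sum(a * y for a, y in zip(adjusted, effects)) / sum(adjusted)
    return pooled, adjusted

# Equal quality everywhere: no redistribution, identical to fixed effect
p_eq, adj_eq = quality_effects_sketch([0.1, 0.5, 0.6], [0.01, 0.25, 0.25],
                                      [1.0, 1.0, 1.0])
# Down-rating the large study shifts weight to the two smaller ones
p_q, adj_q = quality_effects_sketch([0.1, 0.5, 0.6], [0.01, 0.25, 0.25],
                                    [0.5, 1.0, 1.0])
```

As in the model described above, when all studies are rated equally the sketch collapses to plain inverse-variance weighting.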
A recent evaluation of the quality effects model (with some updates) demonstrates that despite the subjectivity of quality assessment, the performance (MSE and true variance under simulation) is superior to that achievable with the random effects model.{{cite journal | vauthors = Doi SA, Barendregt JJ, Khan S, Thalib L, Williams GM | title = Simulation Comparison of the Quality Effects and Random Effects Methods of Meta-analysis | journal = Epidemiology | volume = 26 | issue = 4 | pages = e42–e44 | date = July 2015 | pmid = 25872162 | doi = 10.1097/EDE.0000000000000289 | doi-access = free }}{{cite journal | vauthors = Doi SA, Barendregt JJ, Khan S, Thalib L, Williams GM | title = Advances in the meta-analysis of heterogeneous clinical trials II: The quality effects model | journal = Contemporary Clinical Trials | volume = 45 | issue = Pt A | pages = 123–129 | date = November 2015 | pmid = 26003432 | doi = 10.1016/j.cct.2015.05.010 }} This model thus replaces the untenable interpretations that abound in the literature and software is available to explore this method further.{{cite web|url=http://www.epigear.com/ |title=MetaXL software page |publisher=Epigear.com |date=2017-06-03 |access-date=2018-09-18}}
== Network meta-analysis methods ==
Indirect comparison meta-analysis methods (also called network meta-analyses, in particular when multiple treatments are assessed simultaneously) generally use two main methodologies.{{Cite journal |last1=Rouse |first1=Benjamin |last2=Chaimani |first2=Anna |last3=Li |first3=Tianjing |date=2017 |title=Network meta-analysis: an introduction for clinicians |journal=Internal and Emergency Medicine |language=en |volume=12 |issue=1 |pages=103–111 |doi=10.1007/s11739-016-1583-7 |issn=1828-0447 |pmc=5247317 |pmid=27913917}}{{Cite journal |last1=Phillips |first1=Mark R. |last2=Steel |first2=David H. |last3=Wykoff |first3=Charles C. |last4=Busse |first4=Jason W. |last5=Bannuru |first5=Raveendhara R. |last6=Thabane |first6=Lehana |last7=Bhandari |first7=Mohit |last8=Chaudhary |first8=Varun |last9=for the Retina Evidence Trials InterNational Alliance (R.E.T.I.N.A.) Study Group |last10=Sivaprasad |first10=Sobha |last11=Kaiser |first11=Peter |last12=Sarraf |first12=David |last13=Bakri |first13=Sophie J. |last14=Garg |first14=Sunir J. |last15=Singh |first15=Rishi P. |date=2022 |title=A clinician's guide to network meta-analysis |journal=Eye |language=en |volume=36 |issue=8 |pages=1523–1526 |doi=10.1038/s41433-022-01943-5 |issn=0950-222X |pmc=9307840 |pmid=35145277}} The first is the Bucher method,{{cite journal | vauthors = Bucher HC, Guyatt GH, Griffith LE, Walter SD | title = The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials | journal = Journal of Clinical Epidemiology | volume = 50 | issue = 6 | pages = 683–691 | date = June 1997 | pmid = 9250266 | doi = 10.1016/s0895-4356(97)00049-8 }} which is a single or repeated comparison of a closed loop of three treatments such that one of them is common to the two studies and forms the node where the loop begins and ends. Therefore, multiple two-by-two comparisons (3-treatment loops) are needed to compare multiple treatments. 
This methodology requires that only two arms be selected from trials with more than two arms, since independent pair-wise comparisons are required. The alternative methodology uses complex statistical modelling to include the multiple arm trials and comparisons simultaneously between all competing treatments. These have been executed using Bayesian methods, mixed linear models and meta-regression approaches.{{citation needed|date=June 2018}}
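The basic Bucher calculation can be sketched as follows: treatments A and C are compared indirectly through the common comparator B by differencing the two direct effect estimates and summing their variances (a minimal illustration; validity rests on the similarity/transitivity assumption across the two sets of trials):

```python
import math

def bucher_indirect(d_ab, var_ab, d_cb, var_cb):
    """Adjusted indirect comparison of A vs C via common comparator B:
    difference the direct estimates A-vs-B and C-vs-B and add their
    variances (the two direct comparisons come from independent trials)."""
    d_ac = d_ab - d_cb
    se_ac = math.sqrt(var_ab + var_cb)
    ci = (d_ac - 1.96 * se_ac, d_ac + 1.96 * se_ac)  # approx. 95% CI
    return d_ac, se_ac, ci

# Illustrative log odds ratios: A vs B = -0.5 (var 0.04),
# C vs B = -0.2 (var 0.05)
d, se, ci = bucher_indirect(-0.5, 0.04, -0.2, 0.05)
```

Note that the indirect estimate is always less precise than either direct comparison, since the variances add.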
===Bayesian framework===
Specifying a Bayesian network meta-analysis model involves writing a directed acyclic graph (DAG) model for general-purpose Markov chain Monte Carlo (MCMC) software such as WinBUGS.{{cite journal | vauthors = van Valkenhoef G, Lu G, de Brock B, Hillege H, Ades AE, Welton NJ | title = Automating network meta-analysis | journal = Research Synthesis Methods | volume = 3 | issue = 4 | pages = 285–299 | date = December 2012 | pmid = 26053422 | doi = 10.1002/jrsm.1054 | s2cid = 33613631 }} In addition, prior distributions have to be specified for a number of the parameters, and the data have to be supplied in a specific format. Together, the DAG, priors, and data form a Bayesian hierarchical model. To complicate matters further, because of the nature of MCMC estimation, overdispersed starting values have to be chosen for a number of independent chains so that convergence can be assessed.{{cite journal |vauthors=Brooks SP, Gelman A | year = 1998 | title = General methods for monitoring convergence of iterative simulations | url = http://www.stat.columbia.edu/~gelman/research/published/brooksgelman2.pdf| journal = Journal of Computational and Graphical Statistics | volume = 7 | issue = 4| pages = 434–455 | doi=10.1080/10618600.1998.10474787| s2cid = 7300890 }} Recently, multiple R software packages were developed to simplify the model fitting (e.g., metaBMA{{Cite web | vauthors = Heck DW, Gronau QF, Wagenmakers EJ, Patil I |title=metaBMA: Bayesian model averaging for random and fixed effects meta-analysis |url=https://CRAN.R-project.org/package=metaBMA |access-date=9 May 2022 |website=CRAN|date=17 March 2021 }} and RoBMA{{Cite web | vauthors = Bartoš F, Maier M, Wagenmakers EJ, Goosen J, Denwood M, Plummer M |title=RoBMA: An R Package for Robust Bayesian Meta-Analyses |date=20 April 2022 |url=https://CRAN.R-project.org/package=RoBMA |access-date=9 May 2022}}) and even implemented in statistical software with graphical user interface (GUI): JASP. 
Although the complexity of the Bayesian approach limits usage of this methodology, recent tutorial papers are trying to increase accessibility of the methods.{{Cite journal | vauthors = Gronau QF, Heck DW, Berkhout SW, Haaf JM, Wagenmakers EJ |date=July 2021 |title=A Primer on Bayesian Model-Averaged Meta-Analysis |journal=Advances in Methods and Practices in Psychological Science |language=en |volume=4 |issue=3 |pages= |doi=10.1177/25152459211031256 |s2cid=237699937 |issn=2515-2459|doi-access=free |hdl=11245.1/ec2c07d1-5ff0-431b-b53a-10f9c5d9541d |hdl-access=free }}{{Cite journal | vauthors = Bartoš F, Maier M, Quintana D, Wagenmakers EJ |date=2020-10-16 |title=Adjusting for Publication Bias in JASP & R - Selection Models, PET-PEESE, and Robust Bayesian Meta-Analysis | journal = Advances in Methods and Practices in Psychological Science |url=https://osf.io/75bqn |doi=10.31234/osf.io/75bqn |s2cid=236826939 |doi-access=free |hdl=11245.1/5540e87c-0883-45e6-87de-48d2bf4c1e1d |hdl-access=free }} Methodology for automation of this method has been suggested but requires that arm-level outcome data are available, and this is usually unavailable. Great claims are sometimes made for the inherent ability of the Bayesian framework to handle network meta-analysis and its greater flexibility. However, this choice of implementation of framework for inference, Bayesian or frequentist, may be less important than other choices regarding the modeling of effects{{cite journal | vauthors = Senn S, Gavini F, Magrez D, Scheen A | title = Issues in performing a network meta-analysis | journal = Statistical Methods in Medical Research | volume = 22 | issue = 2 | pages = 169–189 | date = April 2013 | pmid = 22218368 | doi = 10.1177/0962280211432220 | s2cid = 10860031 }} (see discussion on models above).
===Frequentist multivariate framework===
On the other hand, the frequentist multivariate methods involve approximations and assumptions that are not stated explicitly or verified when the methods are applied (see discussion on meta-analysis models above). For example, the mvmeta package for Stata enables network meta-analysis in a frequentist framework.{{cite journal | vauthors = White IR | year = 2011 | title = Multivariate random-effects meta-regression: updates to mvmeta | journal = The Stata Journal | volume = 11 | issue = 2| pages = 255–270| doi = 10.1177/1536867X1101100206 | doi-access = free }} However, if there is no common comparator in the network, then this has to be handled by augmenting the dataset with fictional arms with high variance, which is not very objective and requires a decision as to what constitutes a sufficiently high variance. The other issue is use of the random effects model in both this frequentist framework and the Bayesian framework. Senn advises analysts to be cautious about interpreting the 'random effects' analysis since only one random effect is allowed for but one could envisage many. Senn goes on to say that it is rather naïve, even in the case where only two treatments are being compared, to assume that random-effects analysis accounts for all uncertainty about the way effects can vary from trial to trial. Newer models of meta-analysis such as those discussed above would certainly help alleviate this situation and have been implemented in the next framework.
===Generalized pairwise modelling framework===
An approach that has been tried since the late 1990s is the implementation of the multiple three-treatment closed-loop analysis. This has not been popular because the process rapidly becomes overwhelming as network complexity increases. Development in this area was then abandoned in favor of the Bayesian and multivariate frequentist methods which emerged as alternatives. Very recently, automation of the three-treatment closed loop method has been developed for complex networks by some researchers as a way to make this methodology available to the mainstream research community. This proposal does restrict each trial to two interventions, but also introduces a workaround for multiple arm trials: a different fixed control node can be selected in different runs. It also utilizes robust meta-analysis methods so that many of the problems highlighted above are avoided. Further research around this framework is required to determine if this is indeed superior to the Bayesian or multivariate frequentist frameworks. Researchers willing to try this out have access to this framework through free software.
==Diagnostic test accuracy meta-analysis==
Diagnostic test accuracy (DTA) meta-analyses differ methodologically from those assessing intervention effects, as they aim to jointly synthesize pairs of sensitivity and specificity values. These parameters are typically analyzed using hierarchical models that account for the correlation between them and between-study heterogeneity. Two commonly used models are the bivariate random-effects model and the hierarchical summary receiver operating characteristic (HSROC) model. These approaches are recommended by the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy and are widely used in reviews of screening tests, imaging tools, and laboratory diagnostics.Reitsma JB, Glas AS, Rutjes AWS, Scholten RJPM, Bossuyt PMM, Zwinderman AH (2005). Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol. 58(10):982–990. doi:[https://doi.org/10.1016/j.jclinepi.2005.02.022 10.1016/j.jclinepi.2005.02.022]Rutter CM, Gatsonis CA (2001). A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med. 20(19):2865–84. doi:[https://doi.org/10.1002/sim.942 10.1002/sim.942]McInnes MDF, Moher D, Thombs BD, et al. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement. JAMA. 2018;319(4):388–396. doi:[https://doi.org/10.1001/jama.2017.19163 10.1001/jama.2017.19163]
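A much-simplified univariate sketch conveys the core transformation step: per-study proportions such as sensitivities are pooled on the logit scale with inverse-variance weights. The bivariate and HSROC models described above extend this by modelling sensitivity and specificity jointly with correlated random effects, which this illustration deliberately ignores:

```python
import math

def pool_on_logit_scale(successes, totals):
    """Simplified pooling of per-study proportions (e.g. sensitivities)
    on the logit scale with inverse-variance weights. Ignores the
    sensitivity-specificity correlation and between-study heterogeneity
    that the bivariate/HSROC models account for."""
    logits = [math.log(x / (n - x)) for x, n in zip(successes, totals)]
    # approximate variance of a logit-transformed proportion
    variances = [1.0 / x + 1.0 / (n - x) for x, n in zip(successes, totals)]
    w = [1.0 / v for v in variances]
    pooled_logit = sum(wi * li for wi, li in zip(w, logits)) / sum(w)
    return 1.0 / (1.0 + math.exp(-pooled_logit))  # back-transform

# pooled sensitivity from two studies: 80/100 and 90/100 true positives
pooled_sens = pool_on_logit_scale([80, 90], [100, 100])
```

In a full DTA meta-analysis the same pooling would be done for specificity as well, with the two series modelled jointly.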
Beyond the standard hierarchical models, other approaches have been developed to address various complexities in diagnostic accuracy synthesis. These include methods that incorporate differences in threshold effects, account for covariates through meta-regression, or improve applicability by considering test setting and clinical variation. Some frameworks aim to adapt the synthesis to reflect intended use conditions more directly. These extensions are part of an evolving body of methodology that reflects growing experience in the field and increasing demands from clinical and policy decision-makers.Deeks JJ, Takwoingi Y. Two decades of progress in test accuracy systematic reviews: Managing meta-analytical complexity. J Clin Epidemiol. 2020;122:92–102. doi:[https://doi.org/10.1016/j.jclinepi.2020.03.003 10.1016/j.jclinepi.2020.03.003]
== Aggregating IPD and AD ==
Meta-analysis can also be applied to combine IPD and AD. This is convenient when the researchers who conduct the analysis have their own raw data while collecting aggregate or summary data from the literature. The generalized integration model (GIM) is a generalization of meta-analysis. It allows the model fitted to the individual participant data (IPD) to differ from those used to compute the aggregate data (AD). GIM can be viewed as a model calibration method for integrating information with more flexibility.
== Validation of meta-analysis results ==
The meta-analysis estimate represents a weighted average across studies and when there is heterogeneity this may result in the summary estimate not being representative of individual studies. Qualitative appraisal of the primary studies using established tools can uncover potential biases,{{cite journal | vauthors = Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savovic J, Schulz KF, Weeks L, Sterne JA | display-authors = 6 | title = The Cochrane Collaboration's tool for assessing risk of bias in randomised trials | journal = BMJ | volume = 343 | pages = d5928 | date = October 2011 | pmid = 22008217 | pmc = 3196245 | doi = 10.1136/bmj.d5928 }}{{cite journal | vauthors = Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM | display-authors = 6 | title = QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies | journal = Annals of Internal Medicine | volume = 155 | issue = 8 | pages = 529–536 | date = October 2011 | pmid = 22007046 | doi = 10.7326/0003-4819-155-8-201110180-00009 | doi-access = free }} but does not quantify the aggregate effect of these biases on the summary estimate. Although the meta-analysis result could be compared with an independent prospective primary study, such external validation is often impractical. 
This has led to the development of methods that exploit a form of leave-one-out cross validation, sometimes referred to as internal-external cross validation (IOCV).{{cite journal | vauthors = Royston P, Parmar MK, Sylvester R | title = Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer | journal = Statistics in Medicine | volume = 23 | issue = 6 | pages = 907–926 | date = March 2004 | pmid = 15027080 | doi = 10.1002/sim.1691 | s2cid = 23397142 }} Here each of the k included studies in turn is omitted and compared with the summary estimate derived from aggregating the remaining k − 1 studies. A general validation statistic, Vn, based on IOCV has been developed to measure the statistical validity of meta-analysis results.{{cite journal | vauthors = Willis BH, Riley RD | title = Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice | journal = Statistics in Medicine | volume = 36 | issue = 21 | pages = 3283–3301 | date = September 2017 | pmid = 28620945 | pmc = 5575530 | doi = 10.1002/sim.7372 }} For test accuracy and prediction, particularly when there are multivariate effects, other approaches which seek to estimate the prediction error have also been proposed.{{cite journal | vauthors = Riley RD, Ahmed I, Debray TP, Willis BH, Noordzij JP, Higgins JP, Deeks JJ | title = Summarising and validating test accuracy results across multiple studies for use in clinical practice | journal = Statistics in Medicine | volume = 34 | issue = 13 | pages = 2081–2103 | date = June 2015 | pmid = 25800943 | pmc = 4973708 | doi = 10.1002/sim.6471 }}
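The leave-one-out loop described above can be sketched as follows (a minimal illustration using a fixed effect summary for the remaining studies; a full IOCV analysis would also quantify the discrepancies, e.g. via a validation statistic such as Vn):

```python
def iocv_pairs(effects, variances):
    """Internal-external cross validation sketch: omit each study in
    turn and pair its observed effect with the inverse-variance
    (fixed effect) summary of the remaining k - 1 studies."""
    pairs = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        w = [1.0 / v for v in rest_v]
        summary = sum(wi * y for wi, y in zip(w, rest_y)) / sum(w)
        pairs.append((effects[i], summary))
    return pairs

# Three equally precise studies: each is compared against the mean
# of the other two
pairs = iocv_pairs([0.1, 0.2, 0.3], [1.0, 1.0, 1.0])
```

Large, systematic gaps between the left-out effects and the corresponding summaries would flag a summary estimate that does not represent the individual studies well.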
== Challenges ==
A meta-analysis of several small studies does not always predict the results of a single large study.{{cite journal | vauthors = LeLorier J, Grégoire G, Benhaddad A, Lapierre J, Derderian F | title = Discrepancies between meta-analyses and subsequent large randomized, controlled trials | journal = The New England Journal of Medicine | volume = 337 | issue = 8 | pages = 536–542 | date = August 1997 | pmid = 9262498 | doi = 10.1056/NEJM199708213370806 | doi-access = free }} Some have argued that a weakness of the method is that sources of bias are not controlled by the method: a good meta-analysis cannot correct for poor design or bias in the original studies.{{cite journal | vauthors = Slavin RE | title = Best-Evidence Synthesis: An Alternative to Meta-Analytic and Traditional Reviews | journal = Educational Researcher | volume = 15 | issue = 9 | pages = 5–9 | year = 1986 | doi = 10.3102/0013189X015009005| s2cid = 146457142 }} This would mean that only methodologically sound studies should be included in a meta-analysis, a practice called 'best evidence synthesis'. Other meta-analysts would include weaker studies, and add a study-level predictor variable that reflects the methodological quality of the studies to examine the effect of study quality on the effect size.{{cite book | vauthors = Hunter JE, Schmidt FL, Jackson GB | collaboration = American Psychological Association. 
Division of Industrial-Organizational Psychology |title=Meta-analysis: cumulating research findings across studies |date=1982 |publisher=Sage |location=Beverly Hills, California |isbn=978-0-8039-1864-1}} However, others have argued that a better approach is to preserve information about the variance in the study sample, casting as wide a net as possible, and that methodological selection criteria introduce unwanted subjectivity, defeating the purpose of the approach.{{cite book | vauthors = Glass GV, McGaw B, Smith ML|title=Meta-analysis in social research |date=1981 |publisher=Sage Publications |location=Beverly Hills, California |isbn=978-0-8039-1633-3}} More recently, under the influence of a push for open practices in science, tools have been developed to create "crowd-sourced" living meta-analyses that are updated by communities of scientists{{Cite journal |last1=Wolf |first1=Vinzent |last2=Kühnel |first2=Anne |last3=Teckentrup |first3=Vanessa |last4=Koenig |first4=Julian |last5=Kroemer |first5=Nils B. |date=2021 |title=Does transcutaneous auricular vagus nerve stimulation affect vagally mediated heart rate variability? A living and interactive Bayesian meta-analysis |journal=Psychophysiology |language=en |volume=58 |issue=11 |pages=e13933 |doi=10.1111/psyp.13933 |pmid=34473846 |issn=0048-5772|doi-access=free }}{{Cite journal |last1=Allbritton |first1=David |last2=Gómez |first2=Pablo |last3=Angele |first3=Bernhard |last4=Vasilev |first4=Martin |last5=Perea |first5=Manuel |date=2024-07-22 |title=Breathing Life Into Meta-Analytic Methods |journal=Journal of Cognition |language=en |volume=7 |issue=1 |page=61 |doi=10.5334/joc.389 |issn=2514-4820 |pmc=11276543 |pmid=39072210 |doi-access=free}} in hopes of making all the subjective choices more explicit.
== Publication bias: the file drawer problem ==
[[File:Example of a symmetrical funnel plot created with MetaXL Sept 2015.jpg|thumb|Example of a symmetrical funnel plot]]
[[File:Funnel plot depicting asymmetry Sept 2015.jpg|thumb|Funnel plot depicting asymmetry]]
Another potential pitfall is the reliance on the available body of published studies, which may create exaggerated outcomes due to publication bias,{{Cite journal |last=Wagner |first=John A |date=2022-09-03 |title=The influence of unpublished studies on results of recent meta-analyses: publication bias, the file drawer problem, and implications for the replication crisis |url=https://www.tandfonline.com/doi/full/10.1080/13645579.2021.1922805 |journal=International Journal of Social Research Methodology |language=en |volume=25 |issue=5 |pages=639–644 |doi=10.1080/13645579.2021.1922805 |issn=1364-5579}} as studies which show negative results or insignificant results are less likely to be published.{{Cite journal | vauthors = Polanin JR, Tanner-Smith EE, Hennessy EA |date=2016 |title=Estimating the Difference Between Published and Unpublished Effect Sizes: A Meta-Review |url=http://journals.sagepub.com/doi/10.3102/0034654315582067 |journal=Review of Educational Research |language=en |volume=86 |issue=1 |pages=207–236 |doi=10.3102/0034654315582067 |s2cid=145513046 |issn=0034-6543}} For example, pharmaceutical companies have been known to hide negative studies{{Cite journal |last1=Nassir Ghaemi |first1=S. |last2=Shirzadi |first2=Arshia A. |last3=Filkowski |first3=Megan |date=2008-09-10 |title=Publication Bias and the Pharmaceutical Industry: The Case of Lamotrigine in Bipolar Disorder |journal=The Medscape Journal of Medicine |volume=10 |issue=9 |pages=211 |issn=1934-1997 |pmc=2580079 |pmid=19008973}} and researchers may have overlooked unpublished studies such as dissertation studies or conference abstracts that did not reach publication.{{Cite journal |last1=Martin |first1=José Luis R. |last2=Pérez |first2=Víctor |last3=Sacristán |first3=Montse |last4=Álvarez |first4=Enric |date=2005 |title=Is grey literature essential for a better control of publication bias in psychiatry? 
An example from three meta-analyses of schizophrenia |url=https://www.cambridge.org/core/product/identifier/S0924933800066967/type/journal_article |journal=European Psychiatry |language=en |volume=20 |issue=8 |pages=550–553 |doi=10.1016/j.eurpsy.2005.03.011 |pmid=15994063 |issn=0924-9338}} This is not easily solved, as one cannot know how many studies have gone unreported.{{cite journal |doi=10.1037/0033-2909.86.3.638 |year=1979 | vauthors = Rosenthal R |author-link=Robert Rosenthal (psychologist) |title=The "File Drawer Problem" and the Tolerance for Null Results |journal=Psychological Bulletin |volume=86 |issue=3 |pages=638–641|s2cid=36070395 }}
This file drawer problem, characterized by negative or non-significant results being tucked away in a cabinet, can result in a biased distribution of effect sizes, creating a serious base rate fallacy in which the significance of the published studies is overestimated because other studies were either not submitted for publication or were rejected. This should be seriously considered when interpreting the outcomes of a meta-analysis.{{Cite book|year=1990 | vauthors = Hunter JE, Schmidt FL |author-link1=John E. Hunter |author-link2=Frank L. Schmidt |title=Methods of Meta-Analysis: Correcting Error and Bias in Research Findings |place=Newbury Park, California; London; New Delhi |publisher=SAGE Publications}}
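The Rosenthal paper cited above proposed a rough gauge of this problem, the "fail-safe N": the number of unpublished, on-average-null studies that would have to sit in file drawers to pull a combined result below significance. A minimal sketch with invented z-scores (not data from any real review):

```python
# Rosenthal's fail-safe N: how many unpublished studies averaging z = 0
# would reduce the combined result to just non-significant?
# The z-scores below are illustrative, not taken from a real meta-analysis.
z_scores = [2.1, 1.8, 2.5, 1.7, 2.3]  # one z-score per published study
k = len(z_scores)
z_crit = 1.645                        # one-tailed alpha = 0.05

fail_safe_n = sum(z_scores) ** 2 / z_crit ** 2 - k

print(f"fail-safe N = {fail_safe_n:.1f}")  # about 35 hidden null studies
```

A fail-safe N that is small relative to the number of included studies suggests the pooled result could easily be an artifact of selective publication.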
The distribution of effect sizes can be visualized with a funnel plot, which (in its most common version) is a scatter plot of standard error versus the effect size.{{Cite journal |last1=Nakagawa |first1=Shinichi |last2=Lagisz |first2=Malgorzata |last3=Jennions |first3=Michael D. |last4=Koricheva |first4=Julia |last5=Noble |first5=Daniel W. A. |last6=Parker |first6=Timothy H. |last7=Sánchez-Tójar |first7=Alfredo |last8=Yang |first8=Yefeng |last9=O'Dea |first9=Rose E. |date=2022 |title=Methods for testing publication bias in ecological and evolutionary meta-analyses |url=https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.13724 |journal=Methods in Ecology and Evolution |language=en |volume=13 |issue=1 |pages=4–21 |doi=10.1111/2041-210X.13724 |bibcode=2022MEcEv..13....4N |hdl=1885/294436 |s2cid=241159497 |issn=2041-210X|hdl-access=free }} It makes use of the fact that smaller studies (which have larger standard errors) show more scatter in their effect estimates, being less precise, while larger studies show less scatter and form the tip of the funnel. If many negative studies were not published, the remaining positive studies give rise to a funnel plot in which the base is skewed to one side (asymmetry of the funnel plot). In contrast, when there is no publication bias, the effects of the smaller studies have no reason to be skewed to one side, and so a symmetric funnel plot results. 
This also means that if no publication bias is present, there would be no relationship between standard error and effect size.{{cite book | vauthors = Light RJ, Pillemer DB |title=Summing up: the science of reviewing research |date=1984 |publisher=Harvard University Press |location=Cambridge, Massachusetts |isbn=978-0-674-85431-4 |url=https://archive.org/details/summingupscience00ligh }} A negative or positive relation between standard error and effect size would imply that smaller studies that found effects in one direction only were more likely to be published and/or to be submitted for publication.
Apart from the visual funnel plot, statistical methods for detecting publication bias have also been proposed.{{cite journal | vauthors = Vevea JL, Woods CM | title = Publication bias in research synthesis: sensitivity analysis using a priori weight functions | journal = Psychological Methods | volume = 10 | issue = 4 | pages = 428–443 | date = December 2005 | pmid = 16392998 | doi = 10.1037/1082-989X.10.4.428 }} These are controversial because they typically have low power for detecting bias and may also produce false positives under some circumstances.{{cite journal | vauthors = Ioannidis JP, Trikalinos TA | title = The appropriateness of asymmetry tests for publication bias in meta-analyses: a large survey | journal = CMAJ | volume = 176 | issue = 8 | pages = 1091–1096 | date = April 2007 | pmid = 17420491 | pmc = 1839799 | doi = 10.1503/cmaj.060410 }} For instance, small-study effects (bias in smaller studies), wherein methodological differences between smaller and larger studies exist, may cause asymmetry in effect sizes that resembles publication bias. However, small-study effects may be just as problematic for the interpretation of meta-analyses, and the onus is on meta-analytic authors to investigate potential sources of bias.{{Cite journal | vauthors = Hedges LV, Vevea JL |date=1996 |title=Estimating Effect Size Under Publication Bias: Small Sample Properties and Robustness of a Random Effects Selection Model |url=http://journals.sagepub.com/doi/10.3102/10769986021004299 |journal=Journal of Educational and Behavioral Statistics |language=en |volume=21 |issue=4 |pages=299–332 |doi=10.3102/10769986021004299 |s2cid=123680599 |issn=1076-9986}}
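One widely used test of this kind is Egger's regression, which regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero signals funnel-plot asymmetry. A minimal sketch with invented data, not a substitute for dedicated meta-analysis software:

```python
import math

# Egger's regression test for funnel-plot asymmetry. The effect sizes and
# standard errors below are invented for illustration; here the smaller
# (high-SE) studies report larger effects, mimicking publication bias.
effects = [0.50, 0.45, 0.30, 0.28, 0.22, 0.20]
ses = [0.40, 0.30, 0.20, 0.15, 0.10, 0.05]

x = [1.0 / s for s in ses]                   # precision
y = [e / s for e, s in zip(effects, ses)]    # standardized effect

# Ordinary least squares fit of y on x, done by hand.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx

# t-statistic for the intercept (H0: intercept = 0, i.e. no asymmetry).
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s2 = sum(r ** 2 for r in residuals) / (n - 2)
se_intercept = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
t_intercept = intercept / se_intercept

print(f"intercept = {intercept:.2f}, t = {t_intercept:.2f}")
```

With these invented values the intercept is clearly positive, which under the logic of the test points to asymmetry; as the surrounding text notes, such a result can also reflect small-study effects rather than publication bias per se.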
The problem of publication bias is not trivial: it has been suggested that 25% of meta-analyses in the psychological sciences may have suffered from publication bias.{{cite journal |vauthors=Ferguson CJ, Brannick MT |date=March 2012 |title=Publication bias in psychological science: prevalence, methods for identifying and controlling, and implications for the use of meta-analyses |journal=Psychological Methods |volume=17 |issue=1 |pages=120–128 |doi=10.1037/a0024445 |pmid=21787082}} However, the low power of existing tests and problems with the visual appearance of the funnel plot remain an issue, and estimates of publication bias may remain lower than what truly exists.
Most discussions of publication bias focus on journal practices favoring publication of statistically significant findings. However, questionable research practices, such as reworking statistical models until significance is achieved, may also favor statistically significant findings in support of researchers' hypotheses.{{cite journal | vauthors = Simmons JP, Nelson LD, Simonsohn U | title = False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant | journal = Psychological Science | volume = 22 | issue = 11 | pages = 1359–1366 | date = November 2011 | pmid = 22006061 | doi = 10.1177/0956797611417632 | doi-access = free }}{{Cite journal |year=2011 | vauthors = LeBel E, Peters K |title=Fearing the future of empirical psychology: Bem's (2011) evidence of psi as a case study of deficiencies in modal research practice |journal=Review of General Psychology |volume=15 |issue=4 |pages=371–379 |url=http://publish.uwo.ca/~elebel/documents/l&p(2011,rgp).pdf |doi=10.1037/a0025172 |s2cid=51686730 |url-status=dead |archive-url=https://web.archive.org/web/20121124154834/http://publish.uwo.ca/~elebel/documents/l%26p%282011%2Crgp%29.pdf |archive-date=24 November 2012}}
=Problems arising from agenda-driven bias=
The most severe fault in meta-analysis often occurs when the person or persons doing the meta-analysis have an economic, social, or political agenda such as the passage or defeat of legislation.{{Cite news |title=Research into trans medicine has been manipulated |url=https://www.economist.com/united-states/2024/06/27/research-into-trans-medicine-has-been-manipulated |access-date=2024-09-28 |newspaper=The Economist |issn=0013-0613}} People with these types of agendas may be more likely to abuse meta-analysis due to personal bias. For example, researchers favorable to the author's agenda are likely to have their studies cherry-picked while those not favorable will be ignored or labeled as "not credible". In addition, the favored authors may themselves be biased or paid to produce results that support their overall political, social, or economic goals in ways such as selecting small favorable data sets and not incorporating larger unfavorable data sets. The influence of such biases on the results of a meta-analysis is possible because the methodology of meta-analysis is highly malleable.{{cite journal | vauthors = Stegenga J | title = Is meta-analysis the platinum standard of evidence? | journal = Studies in History and Philosophy of Biological and Biomedical Sciences | volume = 42 | issue = 4 | pages = 497–507 | date = December 2011 | pmid = 22035723 | doi = 10.1016/j.shpsc.2011.07.003 | url = https://philpapers.org/rec/STEIMT | author-link = Stegenga J }}
A 2011 study done to disclose possible conflicts of interests in underlying research studies used for medical meta-analyses reviewed 29 meta-analyses and found that conflicts of interests in the studies underlying the meta-analyses were rarely disclosed. The 29 meta-analyses included 11 from general medicine journals, 15 from specialty medicine journals, and three from the Cochrane Database of Systematic Reviews. The 29 meta-analyses reviewed a total of 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) receiving funding from industry (i.e. one or more authors
having financial ties to the pharmaceutical industry). Of the 509 RCTs, 132 reported author conflict of interest disclosures, with 91 studies (69%) disclosing one or more authors having financial ties to industry. The information was, however, seldom reflected in the meta-analyses. Only two (7%) reported RCT funding sources and none reported RCT author-industry ties. The authors concluded "without acknowledgment of COI due to industry funding or author industry financial ties from RCTs included in meta-analyses, readers' understanding and appraisal of the evidence from the meta-analysis may be compromised."{{citation | title=Reporting of Conflicts of Interest in Meta-analyses of Trials of Pharmacological Treatments | journal = Journal of the American Medical Association | vauthors = Roseman M, Milette K, Bero LA, Coyne JC, Lexchin J, Turner EH, Thombs BD | volume = 305 | issue = 10 | pages = 1008–1017 | year = 2011 | doi = 10.1001/jama.2011.257 | pmid = 21386079 | url = https://www.rug.nl/research/portal/en/publications/reporting-of-conflicts-of-interest-in-metaanalyses-of-trials-of-pharmacological-treatments(d4a95ee2-429f-45a4-a917-d794ee954797).html | hdl = 11370/d4a95ee2-429f-45a4-a917-d794ee954797 | s2cid = 11270323 | hdl-access = free }}
For example, in 1998, a US federal judge found that the United States Environmental Protection Agency had abused the meta-analysis process to produce a study claiming cancer risks to non-smokers from environmental tobacco smoke (ETS) with the intent to influence policy makers to pass smoke-free–workplace laws.{{Cite journal |last=Spink |first=Paul |date=1999 |title=Challenging Environmental Tobacco Smoke in the Workplace |url=https://journals.sagepub.com/doi/10.1177/146145299900100402 |journal=Environmental Law Review |language=en |volume=1 |issue=4 |pages=243–265 |doi=10.1177/146145299900100402 |bibcode=1999EnvLR...1..243S |issn=1461-4529}}{{Cite web |last=Will |first=George |date=1998-07-30 |title=Polluted by the anti-tobacco crusade |url=https://www.tampabay.com/archive/1998/07/30/polluted-by-the-anti-tobacco-crusade/ |access-date=2024-09-28 |website=Tampa Bay Times |language=en}}{{Cite journal |last1=Nelson |first1=Jon P. |last2=Kennedy |first2=Peter E. |date=2009 |title=The Use (and Abuse) of Meta-Analysis in Environmental and Natural Resource Economics: An Assessment |url=https://link.springer.com/article/10.1007/s10640-008-9253-5 |journal=Environmental and Resource Economics |language=en |volume=42 |issue=3 |pages=345–377 |doi=10.1007/s10640-008-9253-5 |bibcode=2009EnREc..42..345N |issn=0924-6460}}
=Comparability and validity of included studies=
Meta-analysis may often not be a substitute for an adequately powered primary study, particularly in the biological sciences.{{cite journal | vauthors = Munafò MR, Flint J | title = Meta-analysis of genetic association studies | journal = Trends in Genetics | volume = 20 | issue = 9 | pages = 439–444 | date = September 2004 | pmid = 15313553 | doi = 10.1016/j.tig.2004.06.014 }}
Heterogeneity of methods used may lead to faulty conclusions.{{cite journal | vauthors = Stone DL, Rosopa PJ |title=The Advantages and Limitations of Using Meta-analysis in Human Resource Management Research |journal=Human Resource Management Review |date=1 March 2017 |volume=27 |issue=1 |pages=1–7 |doi=10.1016/j.hrmr.2016.09.001 |language=en |issn=1053-4822}} For instance, differences in the forms of an intervention or the cohorts that are thought to be minor or are unknown to the scientists could lead to substantially different results, including results that distort the meta-analysis's conclusions or are not adequately considered in its data. Conversely, results from meta-analyses may also make certain hypotheses or interventions seem nonviable and preempt further research or approvals, despite certain modifications – such as intermittent administration, personalized criteria and combination measures – leading to substantially different results, including in cases where such modifications have been successfully identified and applied in small-scale studies that were considered in the meta-analysis.{{citation needed|date=January 2022}} Standardization, reproduction of experiments, open data and open protocols may often not mitigate such problems, for instance because relevant factors and criteria could be unknown or not recorded.{{citation needed|date=January 2022}}
There is a debate about the appropriate balance between testing with as few animals or humans as possible and the need to obtain robust, reliable findings. It has been argued that unreliable research is inefficient and wasteful and that studies are not just wasteful when they stop too late but also when they stop too early. In large clinical trials, planned, sequential analyses are sometimes used if there is considerable expense or potential harm associated with testing participants.{{cite journal | vauthors = Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, Munafò MR | title = Power failure: why small sample size undermines the reliability of neuroscience | journal = Nature Reviews. Neuroscience | volume = 14 | issue = 5 | pages = 365–376 | date = May 2013 | pmid = 23571845 | doi = 10.1038/nrn3475 | s2cid = 455476 | doi-access = free }} In applied behavioural science, "megastudies" have been proposed to investigate the efficacy of many different interventions designed in an interdisciplinary manner by separate teams.{{cite journal | vauthors = Milkman KL, Gromet D, Ho H, Kay JS, Lee TW, Pandiloski P, Park Y, Rai A, Bazerman M, Beshears J, Bonacorsi L, Camerer C, Chang E, Chapman G, Cialdini R, Dai H, Eskreis-Winkler L, Fishbach A, Gross JJ, Horn S, Hubbard A, Jones SJ, Karlan D, Kautz T, Kirgios E, Klusowski J, Kristal A, Ladhania R, Loewenstein G, Ludwig J, Mellers B, Mullainathan S, Saccardo S, Spiess J, Suri G, Talloen JH, Taxer J, Trope Y, Ungar L, Volpp KG, Whillans A, Zinman J, Duckworth AL | display-authors = 6 | title = Megastudies improve the impact of applied behavioural science | journal = Nature | volume = 600 | issue = 7889 | pages = 478–483 | date = December 2021 | pmid = 34880497 | doi = 10.1038/s41586-021-04128-4 | pmc = 8822539 | s2cid = 245047340 | bibcode = 2021Natur.600..478M | author40-link = Kevin Volpp }} One such study used a fitness chain to recruit a large number of participants. 
It has been suggested that behavioural interventions are often hard to compare [in meta-analyses and reviews], as "different scientists test different intervention ideas in different samples using different outcomes over different time intervals", causing a lack of comparability of such individual investigations which limits "their potential to inform policy".
=Weak inclusion standards lead to misleading conclusions=
Meta-analyses in education are often not restrictive enough with regard to the methodological quality of the studies they include. For example, studies that include small samples or researcher-made measures lead to inflated effect size estimates.{{Cite journal| vauthors = Cheung AC, Slavin RE |date=2016-06-01|title=How Methodological Features Affect Effect Sizes in Education |journal=Educational Researcher|language=en|volume=45|issue=5|pages=283–292|doi=10.3102/0013189X16656615|s2cid=148531062|issn=0013-189X}} However, this problem also troubles the meta-analysis of clinical trials. The use of different quality assessment tools (QATs) leads to the inclusion of different studies and to conflicting estimates of average treatment effects.{{cite journal | vauthors = Jüni P, Witschi A, Bloch R, Egger M | title = The hazards of scoring the quality of clinical trials for meta-analysis | journal = JAMA | volume = 282 | issue = 11 | pages = 1054–1060 | date = September 1999 | pmid = 10493204 | doi = 10.1001/jama.282.11.1054 | doi-access = free }}{{cite journal | vauthors = Armijo-Olivo S, Fuentes J, Ospina M, Saltaji H, Hartling L | title = Inconsistency in the items included in tools used in general health research and physical therapy to evaluate the methodological quality of randomized controlled trials: a descriptive analysis | journal = BMC Medical Research Methodology | volume = 13 | issue = 1 | pages = 116 | date = September 2013 | pmid = 24044807 | pmc = 3848693 | doi = 10.1186/1471-2288-13-116 | doi-access = free }}
Applications in modern science
File:Integrated Molecular Meta-Analysis of 1,000 Pediatric High-Grade and Diffuse Intrinsic Pontine Glioma - graphical abstract.jpg (graphical abstract of a meta-analysis of diffuse intrinsic pontine glioma and other pediatric gliomas, in which information about the mutations involved as well as generic outcomes were distilled from the underlying primary literature)
Modern statistical meta-analysis does more than just combine the effect sizes of a set of studies using a weighted average. It can test if the outcomes of studies show more variation than the variation that is expected because of the sampling of different numbers of research participants. Additionally, study characteristics such as measurement instrument used, population sampled, or aspects of the studies' design can be coded and used to reduce variance of the estimator (see statistical models above). Thus some methodological weaknesses in studies can be corrected statistically. Other uses of meta-analytic methods include the development and validation of clinical prediction models, where meta-analysis may be used to combine individual participant data from different research centers and to assess the model's generalisability,{{cite journal | vauthors = Debray TP, Riley RD, Rovers MM, Reitsma JB, Moons KG | title = Individual participant data (IPD) meta-analyses of diagnostic and prognostic modeling studies: guidance on their use | journal = PLOS Medicine | volume = 12 | issue = 10 | pages = e1001886 | date = October 2015 | pmid = 26461078 | pmc = 4603958 | doi = 10.1371/journal.pmed.1001886 | doi-access = free }}{{cite journal | vauthors = Debray TP, Moons KG, Ahmed I, Koffijberg H, Riley RD | title = A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis | journal = Statistics in Medicine | volume = 32 | issue = 18 | pages = 3158–3180 | date = August 2013 | pmid = 23307585 | doi = 10.1002/sim.5732 | s2cid = 25308961 | url = https://ris.utwente.nl/ws/files/16298373/A_framework_for_developing.pdf }} 
or even to aggregate existing prediction models.{{cite journal | vauthors = Debray TP, Koffijberg H, Vergouwe Y, Moons KG, Steyerberg EW | title = Aggregating published prediction models with individual participant data: a comparison of different approaches | journal = Statistics in Medicine | volume = 31 | issue = 23 | pages = 2697–2712 | date = October 2012 | pmid = 22733546 | doi = 10.1002/sim.5412 | s2cid = 39439611 | url = https://ris.utwente.nl/ws/files/16299610/Debray_et_al_2012_Statistics_in_Medicine.pdf }}
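The test for excess variation mentioned above is commonly implemented as Cochran's Q statistic, with I² summarizing the share of total variation beyond what sampling error alone would produce. A minimal sketch with invented data:

```python
# Cochran's Q test for heterogeneity, with I^2 expressing how much of the
# total variation exceeds pure sampling error. Effect sizes and variances
# below are invented for illustration only.
effects = [0.10, 0.35, 0.60, 0.15, 0.45]
variances = [0.02, 0.03, 0.04, 0.02, 0.05]

weights = [1.0 / v for v in variances]
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Q: weighted squared deviations of each study from the pooled estimate.
q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1  # under homogeneity, Q ~ chi-squared with k-1 df

# I^2: percentage of variation attributable to between-study heterogeneity.
i_squared = max(0.0, (q - df) / q) * 100

print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.1f}%")
```

Comparing Q against a chi-squared distribution with k−1 degrees of freedom gives a formal test; an I² well above zero (here about 30%) suggests the studies do not all estimate one common effect.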
Meta-analysis can be done with single-subject design as well as group research designs.{{Cite journal |last=Shadish |first=William R. |date=2014 |title=Analysis and meta-analysis of single-case designs: An introduction |url=https://linkinghub.elsevier.com/retrieve/pii/S0022440513001118 |journal=Journal of School Psychology |language=en |volume=52 |issue=2 |pages=109–122 |doi=10.1016/j.jsp.2013.11.009|pmid=24606971 }} This is important because much research has been done with single-subject research designs.{{Cite journal |last1=Zelinsky |first1=Nicole A. M. |last2=Shadish |first2=William |date=2018-05-19 |title=A demonstration of how to do a meta-analysis that combines single-case designs with between-groups experiments: The effects of choice making on challenging behaviors performed by people with disabilities |url=https://www.tandfonline.com/doi/full/10.3109/17518423.2015.1100690 |journal=Developmental Neurorehabilitation |language=en |volume=21 |issue=4 |pages=266–278 |doi=10.3109/17518423.2015.1100690 |pmid=26809945 |s2cid=20442353 |issn=1751-8423}} Considerable dispute exists over the most appropriate meta-analytic technique for single-subject research.{{cite journal |vauthors=Van den Noortgate W, Onghena P | year = 2007 | title = Aggregating Single-Case Results | journal = The Behavior Analyst Today | volume = 8 | issue = 2 | pages = 196–209 | url = https://www.questia.com/read/1G1-170115042/the-aggregation-of-single-case-results-using-hierarchical | doi=10.1037/h0100613}}
Meta-analysis leads to a shift of emphasis from single studies to multiple studies. It emphasizes the practical importance of the effect size instead of the statistical significance of individual studies. This shift in thinking has been termed "meta-analytic thinking". The results of a meta-analysis are often shown in a forest plot.
Results from studies are combined using different approaches. One approach frequently used in meta-analysis in health care research is termed 'inverse variance method'. The average effect size across all studies is computed as a weighted mean, whereby the weights are equal to the inverse variance of each study's effect estimator. Larger studies and studies with less random variation are given greater weight than smaller studies. Other common approaches include the Mantel–Haenszel method{{cite journal | vauthors = Mantel N, Haenszel W | title = Statistical aspects of the analysis of data from retrospective studies of disease | journal = Journal of the National Cancer Institute | volume = 22 | issue = 4 | pages = 719–748 | date = April 1959 | pmid = 13655060 | doi = 10.1093/jnci/22.4.719 | s2cid = 17698270 }} and the Peto method.{{cite book | vauthors = Deeks JJ, Higgins JP, Altman DG | collaboration = Cochrane Statistical Methods Group | chapter = Chapter 10: Analysing data and undertaking meta-analyses: 10.4.2 Peto odds ratio method| chapter-url = https://training.cochrane.org/handbook/current/chapter-10#section-10-4-2 | veditors = Higgins J, Thomas J, Chandler J, Cumpston M, Li T, Page M, Welch V | title = Cochrane Handbook for Systematic Reviews of Interventions | edition = Version 6.2 | date = 2021 | publisher = The Cochrane Collaboration }}
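The inverse variance method reduces to a few lines of arithmetic. A minimal sketch under the fixed-effect model, with invented effect sizes and variances:

```python
import math

# Fixed-effect inverse variance pooling: each study is weighted by the
# inverse of its variance, so precise studies count more. The effect
# sizes (e.g. log odds ratios) and variances below are invented.
effects = [0.35, 0.20, 0.48, 0.10, 0.30]
variances = [0.04, 0.01, 0.09, 0.02, 0.05]

weights = [1.0 / v for v in variances]

# Pooled estimate: inverse-variance weighted mean of the study effects.
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Standard error of the pooled estimate; it shrinks as evidence accumulates.
se_pooled = math.sqrt(1.0 / sum(weights))

print(f"pooled effect = {pooled:.3f} (SE {se_pooled:.3f})")
```

Note how the second study, with the smallest variance, dominates the weighted mean, pulling the pooled estimate toward its value of 0.20; this is exactly the behaviour the text describes for larger, less variable studies.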
Seed-based d mapping (formerly signed differential mapping, SDM) is a statistical technique for meta-analyzing studies on differences in brain activity or structure that used neuroimaging techniques such as fMRI, VBM or PET.
Various high-throughput techniques, such as microarrays, have been used to understand gene expression. MicroRNA expression profiles have been used to identify differentially expressed microRNAs in a particular cell or tissue type or disease condition, or to check the effect of a treatment. A meta-analysis of such expression profiles was performed to derive novel conclusions and to validate known findings.{{cite journal | vauthors = Bargaje R, Hariharan M, Scaria V, Pillai B | title = Consensus miRNA expression profiles derived from interplatform normalization of microarray data | journal = RNA | volume = 16 | issue = 1 | pages = 16–25 | date = January 2010 | pmid = 19948767 | pmc = 2802026 | doi = 10.1261/rna.1688110 }}
Meta-analysis of whole genome sequencing studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Some methods have been developed to enable functionally informed rare variant association meta-analysis in biobank-scale cohorts using efficient approaches for summary statistic storage.{{cite journal |last1=Li |first1=Xihao |last2=Quick |first2=Corbin |last3=Zhou |first3=Hufeng |last4=Gaynor |first4=Sheila M. |last5=Liu |first5=Yaowu |last6=Chen |first6=Han |last7=Selvaraj |first7=Margaret Sunitha |last8=Sun |first8=Ryan |last9=Dey |first9=Rounak |last10=Arnett |first10=Donna K. |last11=Bielak |first11=Lawrence F. |last12=Bis |first12=Joshua C. |last13=Blangero |first13=John |last14=Boerwinkle |first14=Eric |last15=Bowden |first15=Donald W. |last16=Brody |first16=Jennifer A. |last17=Cade |first17=Brian E. |last18=Correa |first18=Adolfo |last19=Cupples |first19=L. Adrienne |last20=Curran |first20=Joanne E. |last21=de Vries |first21=Paul S. |last22=Duggirala |first22=Ravindranath |last23=Freedman |first23=Barry I. |last24=Göring |first24=Harald H. H. |last25=Guo |first25=Xiuqing |last26=Haessler |first26=Jeffrey |last27=Kalyani |first27=Rita R. |last28=Kooperberg |first28=Charles |last29=Kral |first29=Brian G. |last30=Lange |first30=Leslie A. |last31=Manichaikul |first31=Ani |last32=Martin |first32=Lisa W. |last33=McGarvey |first33=Stephen T. |last34=Mitchell |first34=Braxton D. |last35=Montasser |first35=May E. |last36=Morrison |first36=Alanna C. |last37=Naseri |first37=Take |last38=O’Connell |first38=Jeffrey R. |last39=Palmer |first39=Nicholette D. |last40=Peyser |first40=Patricia A. |last41=Psaty |first41=Bruce M. |last42=Raffield |first42=Laura M. |last43=Redline |first43=Susan |last44=Reiner |first44=Alexander P. |last45=Reupena |first45=Muagututi’a Sefuiva |last46=Rice |first46=Kenneth M. |last47=Rich |first47=Stephen S. |last48=Sitlani |first48=Colleen M. 
|last49=Smith |first49=Jennifer A. |last50=Taylor |first50=Kent D. |last51=Vasan |first51=Ramachandran S. |last52=Willer |first52=Cristen J. |last53=Wilson |first53=James G. |last54=Yanek |first54=Lisa R. |last55=Zhao |first55=Wei
|last56=NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium|last57=TOPMed Lipids Working Group
|last58=Rotter |first58=Jerome I. |last59=Natarajan |first59=Pradeep |last60=Peloso |first60=Gina M. |last61=Li |first61=Zilin |last62=Lin |first62=Xihong |title=Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies |journal=Nature Genetics |date=January 2023 |volume=55 |issue=1 |pages=154–164 |doi=10.1038/s41588-022-01225-6|pmid=36564505 |pmc=10084891 |s2cid=255084231 }}
Sweeping meta-analyses can also be used to estimate a network of effects. This allows researchers to examine patterns in the fuller panorama of more accurately estimated results and draw conclusions that consider the broader context (e.g., how personality-intelligence relations vary by trait family).{{Cite book |last1=Stanek |first1=Kevin C. |url=https://umnlibraries.manifoldapp.org/projects/of-anchors-and-sails |title=Of Anchors & Sails: Personality-ability trait constellations |last2=Ones |first2=Deniz S. |publisher=University of Minnesota Libraries Publishing |year=2023 |location=Minneapolis, Minnesota, United States |pages=Chapters 4–7 |doi=10.24926/9781946135988|isbn=978-1-946135-98-8 |s2cid=265335858 }}
Software
See also
{{Portal|Mathematics}}
{{div col}}
- Estimation statistics
- Metascience
- Newcastle–Ottawa scale
- Reporting bias
- Review journal
- Secondary research
- Study heterogeneity
- Systematic review
- Galbraith plot
- Data aggregation
{{div col end}}
Sources
{{Creative Commons text attribution notice|cc=by4|url=https://www.frontiersin.org/articles/10.3389/fpsyg.2015.01549/full|author(s)=Daniel S. Quintana}}
{{Creative Commons text attribution notice|cc=by3|url=https://www.jstatsoft.org/article/view/v036i03|author(s)=Wolfgang Viechtbauer}}
References
}}
{{Statistics|inference}}
{{Medical research studies}}
{{Meta-prefix}}
{{Authority control}}
{{DEFAULTSORT:Meta-Analysis}}