Mega2, the Manipulation Environment for Genetic Analysis
{{Infobox software
| name = Mega2
| logo =
| logo caption =
| logo_size =
| logo_alt =
| screenshot =
| caption =
| screenshot_size =
| screenshot_alt =
| collapsible =
| author = Previous Programmers: Charles P. Kollar, Nandita Mukhopadhyay, Lee Almasy, Mark Schroeder, William P. Mulvihill.
| developer = Daniel E. Weeks, Robert V. Baron, Justin R. Stickel.
| released = {{Start date and age|2000|1|16|df=yes}}
| discontinued =
| latest release version = 5.0.1
| latest release date = {{Start date and age|2018|12|13|df=yes}}
| status =
| programming language = C++
| operating system = Linux, Mac OS X, Microsoft Windows
| platform =
| size =
| language =
| language count =
| language footnote =
| genre = Applied statistical genetics, Bioinformatics
| license = GNU General Public License version 3
| alexa =
| website = {{URL|https://watson.hgen.pitt.edu/register/}}
| standard =
| AsOf =
}}
Mega2 is a data manipulation program for applied statistical genetics. "Mega" is an acronym for Manipulation Environment for Genetic Analysis.
The software allows applied statistical geneticists to convert their data from several input formats to a large number of output formats suitable for analysis by commonly used software packages.{{cite journal|last=Mukhopadhyay|first=N|author2=Almasy L |author3=Schroeder M |author4=Mulvihill WP |author5=Weeks DE |title=Mega2, a data-handling program for facilitating genetic linkage and association analyses|journal=Am J Hum Genet|date=1999|volume=65|page=A436}}{{cite journal|last=Mukhopadhyay|first=N|author2=Almasy L |author3=Schroeder M |author4=Mulvihill WP |author5=Weeks DE |title=Mega2: data-handling for facilitating genetic linkage and association analyses|journal=Bioinformatics|date=2005|volume=21|issue=10|pages=2556–2557|pmid=15746282 |doi=10.1093/bioinformatics/bti364|doi-access=free}}{{cite journal|last=Kollar|first=CP|author2=Baron RV |author3=Mukhopadhyay N |author4=Weeks DE |title=Mega2: enhanced data-handling for facilitating genetic linkage and association analyses|journal=Presented at the 63rd Annual Meeting of the American Society of Human Genetics, Boston|date=October 2013|page=Abstract 1831|url=http://abstracts.ashg.org/cgi-bin/2013/ashg13s.pl?author=kollar&sort=ptimes&sbutton=Detail&absno=130121140&sid=32111}}{{cite journal |vauthors=Baron RV, Kollar C, Mukhopadhyay N, Weeks DE | title=Mega2: validated data-reformatting for linkage and association analyses | journal=Source Code Biol Med | date=2014 | volume=9 | issue=1|pages=26|pmc=4269913 | doi=10.1186/s13029-014-0026-y | pmid=25687422 | doi-access=free }} In a typical human genetics study, the analyst often needs several different software programs to analyze the data, and these programs usually require that the data be formatted to their precise input specifications. Converting the data into these multiple formats by hand can be tedious, time-consuming, and error-prone. By providing validated conversion pipelines, Mega2 can accelerate analyses while reducing errors.
Mega2 produces a common intermediate data representation using SQLite3, which enables the data to be accessed by other programs and languages. In particular, the [https://cran.r-project.org/package=Mega2R Mega2R] R package converts the SQLite3 data into R data frames. Several R functions are provided that illustrate how data can be extracted from the data frames for use with common R analysis packages, such as [https://cran.r-project.org/package=SKAT SKAT] and [https://cran.r-project.org/package=pedgene pedgene]. The key capability is the efficient extraction of genotypes for chosen subsets of markers, which facilitates gene-based association testing by automating the loop over genes in the genome. Other functions convert the data to VCF format and to [https://cran.r-project.org/package=GenABEL GenABEL] format. For more information about the Mega2R package, see [https://watson.hgen.pitt.edu/mega2/mega2r/ here].
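The idea of looping over genes and pulling out only the genotypes for markers in each gene can be sketched against an SQLite database. The sketch below is in Python and is illustrative only: the <code>marker</code> and <code>genotype</code> tables, the column names, and the gene coordinates are hypothetical and are not the actual Mega2 database schema or the Mega2R API.

```python
import sqlite3

# Hypothetical gene ranges: name -> (chromosome, start, end). Not real coordinates.
GENES = {"GENE_A": ("1", 100, 250), "GENE_B": ("1", 900, 1200)}


def build_demo_db():
    """Create a tiny in-memory database with a made-up schema (NOT Mega2's)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE marker (name TEXT PRIMARY KEY, chrom TEXT, pos INTEGER);
        CREATE TABLE genotype (person TEXT, marker TEXT, allele1 TEXT, allele2 TEXT);
        """
    )
    con.executemany(
        "INSERT INTO marker VALUES (?, ?, ?)",
        [("rs1", "1", 120), ("rs2", "1", 200), ("rs3", "1", 1000), ("rs4", "2", 150)],
    )
    con.executemany(
        "INSERT INTO genotype VALUES (?, ?, ?, ?)",
        [
            ("P1", "rs1", "A", "G"), ("P2", "rs1", "A", "A"),
            ("P1", "rs2", "C", "C"), ("P2", "rs2", "C", "T"),
            ("P1", "rs3", "T", "T"), ("P2", "rs3", "T", "C"),
            ("P1", "rs4", "G", "G"), ("P2", "rs4", "G", "A"),
        ],
    )
    return con


def genotypes_in_range(con, chrom, start, end):
    """Return genotype rows for markers inside one chromosomal range."""
    query = """
        SELECT g.person, g.marker, g.allele1, g.allele2
        FROM genotype AS g
        JOIN marker AS m ON m.name = g.marker
        WHERE m.chrom = ? AND m.pos BETWEEN ? AND ?
        ORDER BY m.pos, g.person
    """
    return con.execute(query, (chrom, start, end)).fetchall()


def genotypes_by_gene(con, genes):
    """Loop over genes and collect the genotypes of the markers in each one."""
    return {name: genotypes_in_range(con, *rng) for name, rng in genes.items()}


if __name__ == "__main__":
    con = build_demo_db()
    for gene, rows in genotypes_by_gene(con, GENES).items():
        print(gene, rows)
```

Each per-gene result could then be passed to a gene-based test. In this sketch the join on marker position selects the subset, so only the genotypes needed for one gene are read at a time.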
Mega2 has been used to facilitate genetic analyses of a wide variety of human traits, including hereditary dystonia,{{cite journal |vauthors=Hersheson J, Mencacci NE, Davis M, Macdonald N, Trabzuni D, Ryten M, Pittman A, Paudel R, Kara E, Fawcett K, Plagnol V, Bhatia KP, Medlar AJ, Stanescu HC, Hardy J, Kleta R, Wood NW, Houlden H | title=Mutations in the autoregulatory domain of beta-tubulin 4a cause hereditary dystonia | journal=Ann Neurol | date=2013 | volume=73 | issue=4|pages=546–553 | doi=10.1002/ana.23832 | pmid=23424103 | pmc=3698699}} Ehlers-Danlos syndrome,{{cite journal |vauthors=Baumann M, Giunta C, Krabichler B, Ruschendorf F, Zoppi N, Colombi M, Bittner RE, Quijano-Roy S, Muntoni F, Cirak S, Schreiber G, Zou Y, Hu Y, Romero NB, Carlier RY, Amberger A, Deutschmann A, Straub V, Rohrbach M, Steinmann B, Rostasy K, Karall D, Bonnemann CG, Zschocke J, Fauth C | title=Mutations in FKBP14 cause a variant of Ehlers-Danlos syndrome with progressive kyphoscoliosis, myopathy, and hearing loss | journal=Am J Hum Genet | date=2012 | volume=90 | issue=2|pages=201–216 | doi=10.1016/j.ajhg.2011.12.004 | pmid=22265013 | pmc=3276673}} multiple sclerosis,{{cite journal |vauthors=Dyment DA, Cader MZ, Chao MJ, Lincoln MR, Morrison KM, Disanto G, Morahan JM, De Luca GC, Sadovnick AD, Lepage P, Montpetit A, Ebers GC, Ramagopalan SV | title=Exome sequencing identifies a novel multiple sclerosis susceptibility variant in the TYK2 gene | journal=Neurology | date=2012 | volume=79 | issue=5|pages=406–411 | doi=10.1212/wnl.0b013e3182616fc4 | pmid=22744673 | pmc=3405256}} and gliomas.{{cite journal |vauthors=Shete S, Lau CC, Houlston RS, Claus EB, Barnholtz-Sloan J, Lai R, Il'yasova D, Schildkraut J, Sadetzki S, Johansen C, Bernstein JL, Olson SH, Jenkins RB, Yang P, Vick NA, Wrensch M, Davis FG, McCarthy BJ, Leung EH, Davis C, Cheng R, Hosking FJ, Armstrong GN, Liu Y, Yu RK, Henriksson R, Gliogene C, Melin BS, Bondy ML | title=Genome-wide high-density SNP linkage search for 
glioma susceptibility loci: results from the Gliogene Consortium | journal=Cancer Res | date=2011 | volume=71 | issue=24|pages=7568–7575 | doi=10.1158/0008-5472.can-11-0013 | pmid=22037877 | pmc=3242820}} A list of PubMed Central articles citing Mega2 can be seen [https://www.ncbi.nlm.nih.gov/pubmed?linkname=pubmed_pubmed_citedin&from_uid=15746282 here].
Mega2, which focuses on data reformatting, should not be confused with MEGA (Molecular Evolutionary Genetics Analysis), a program that focuses on molecular evolution and phylogenetics.
Input file formats
Mega2 accepts input data in a variety of widely used file formats. These contain, at a minimum, data about the phenotypes, the marker genotypes, any family structures, and map positions of the markers.
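As an illustration of the minimum content listed above (family structure, phenotype, marker genotypes, and map positions), the following Python sketch parses one pedigree record and one map record. It assumes a PLINK-style <code>.ped</code>/<code>.map</code> layout as an example of a widely used format. The <code>Person</code> class and both function names are hypothetical, and this is not Mega2's own parser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Person:
    family: str       # family (pedigree) identifier
    ident: str        # individual identifier
    father: str       # father's identifier, "0" if unknown
    mother: str       # mother's identifier, "0" if unknown
    sex: int          # 1 = male, 2 = female in the assumed layout
    phenotype: str    # trait value, kept as text here
    genotypes: List[Tuple[str, str]]  # one allele pair per marker


def parse_ped_line(line: str) -> Person:
    """Parse one whitespace-separated pedigree line (assumed PLINK-style)."""
    fields = line.split()
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError("expected 6 leading columns followed by allele pairs")
    alleles = fields[6:]
    pairs = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return Person(fields[0], fields[1], fields[2], fields[3],
                  int(fields[4]), fields[5], pairs)


def parse_map_line(line: str) -> Tuple[str, str, float, int]:
    """Parse one map line: chromosome, marker name, genetic and physical position."""
    chrom, name, genetic_pos, physical_pos = line.split()
    return name, chrom, float(genetic_pos), int(physical_pos)


if __name__ == "__main__":
    person = parse_ped_line("FAM1 3 1 2 1 2 A G C C")
    markers = [parse_map_line("1 rs1 0.0 120"), parse_map_line("1 rs2 0.1 200")]
    # A simple consistency check: one genotype per mapped marker.
    assert len(person.genotypes) == len(markers)
    print(person, markers)
```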
Output file formats
Mega2 supports conversion to the following output formats.
Documentation
The Mega2 documentation is available [https://watson.hgen.pitt.edu/docs/mega2_html/mega2.html here] in HTML format, and [https://watson.hgen.pitt.edu/docs/mega2_html/Mega2_Documentation.pdf here] in PDF format.
References
{{reflist|30em}}
External links
- [https://watson.hgen.pitt.edu/register Download Mega2]
- [https://watson.hgen.pitt.edu/docs/mega2_html/mega2.html Mega2 documentation (HTML)]
- [https://watson.hgen.pitt.edu/docs/mega2_html/Mega2_Documentation.pdf Mega2 documentation (PDF)]
- [https://groups.google.com/forum/#!forum/mega2-users Mega2 users Google Group]
- [https://bitbucket.org/dweeks/mega2 Mega2 bitbucket repository]
- [https://watson.hgen.pitt.edu/mega2/mega2r/ The Mega2R R package]