PLINK (genetic tool-set)
{{Short description|Whole genome association analysis toolset}}
PLINK{{cite journal|author1=Purcell S |author2=Neale B |author3=Todd-Brown K |author4=Thomas L |author5=Ferreira MAR |author6=Bender D |author7=Maller J |author8=Sklar P |author9=de Bakker PIW |author10=Daly MJ |author11=Sham PC |title=PLINK: a toolset for whole-genome association and population-based linkage analysis.|journal=American Journal of Human Genetics|year=2007|volume=81|issue=3 |pmc=1950838|pmid=17701901|doi=10.1086/519795|pages=559–75}} is a free, commonly used, open-source whole-genome association analysis toolset designed by Shaun Purcell. The software is designed to be flexible and to perform a wide range of basic, large-scale genetic analyses.
PLINK currently supports the following functionality:
- data management;
- basic statistics (FST, missing data, tests of Hardy–Weinberg equilibrium, inbreeding coefficient, etc.);
- linkage disequilibrium (LD) calculation;
- identity by descent (IBD) and identity by state (IBS) matrix calculation;
- detection of population stratification, for example by principal component analysis;
- association analysis, such as genome-wide association studies, for both basic case/control studies and quantitative traits;
- tests for epistasis.
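As an illustration of one of the basic statistics listed above, the following is a minimal sketch of a Hardy–Weinberg equilibrium test computed from the genotype counts at a single biallelic variant. It uses the textbook one-degree-of-freedom chi-square approximation and is not PLINK's own implementation, which to the best of our understanding reports an exact test instead. The function name and argument names are hypothetical.

```python
import math


def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square test of Hardy-Weinberg equilibrium (illustrative sketch).

    n_hom1, n_het, n_hom2 are the observed counts of the three genotypes
    at one biallelic variant. Returns (chi2, p_value) with 1 degree of
    freedom. This is the textbook approximation, not PLINK's code.
    """
    n = n_hom1 + n_het + n_hom2
    if n == 0:
        raise ValueError("no genotyped samples")

    # Frequency of the first allele, estimated from the genotype counts.
    p = (2 * n_hom1 + n_het) / (2 * n)
    q = 1.0 - p

    # A monomorphic variant has no expected heterozygotes; nothing to test.
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom1, n_het, n_hom2)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Counts that match the expected proportions exactly, such as 25, 50 and 25, give a statistic of zero, while a complete absence of heterozygotes at an allele frequency of 0.5 gives a very large statistic.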
Input and output files
PLINK has its own formats for text files ({{Mono|.ped}}) and binary files ({{Mono|.bed}}), which serve as input files for most analyses.{{cite web |url=https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/plink19formats.pdf|title=PLINK 1.9 File format reference |author=Christopher Chang |date=2017 |publisher=Biobank UK at University of Oxford |access-date=2022-08-05 |quote=PLINK input and output file formats which are identifiable by file extension}} A {{Mono|.map}} file accompanies a {{Mono|.ped}} file and provides information about variants, while {{Mono|.bim}} and {{Mono|.fam}} files accompany a {{Mono|.bed}} file as part of the binary dataset. Additionally, PLINK accepts VCF, BCF, Oxford, and 23andMe files as input, which are typically converted to the binary {{Mono|.bed}} format before the desired analyses are performed. With certain formats such as VCF, some information, such as phase and dosage, is discarded.
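The binary {{Mono|.bed}} layout can be illustrated with a short decoder. The sketch below assumes the variant-major PLINK 1 layout as described in the PLINK 1.9 file format reference: a three-byte header, then one block per variant in which each sample occupies two bits, starting from the low-order bits of each byte. The sample and variant counts are not stored in the {{Mono|.bed}} file itself and are passed in here as arguments; in practice they come from the line counts of the {{Mono|.fam}} and {{Mono|.bim}} files. The function and constant names are hypothetical.

```python
# Header of a variant-major PLINK 1 .bed file (assumed from the format reference).
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# Two-bit genotype code -> number of copies of the first (.bim A1) allele.
# None marks a missing genotype.
CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples, n_variants):
    """Decode the raw bytes of a .bed file into per-variant genotype lists.

    Returns a list with one entry per variant; each entry is a list with
    one A1 allele count (or None) per sample.
    """
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")

    # Each variant block is padded up to a whole number of bytes.
    bytes_per_variant = (n_samples + 3) // 4
    if len(data) != 3 + bytes_per_variant * n_variants:
        raise ValueError("file size does not match sample/variant counts")

    genotypes = []
    for v in range(n_variants):
        start = 3 + v * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        row = []
        for s in range(n_samples):
            # Sample s sits in byte s // 4, at bit offset 2 * (s % 4).
            code = (block[s // 4] >> (2 * (s % 4))) & 0b11
            row.append(CODE_TO_A1_COUNT[code])
        genotypes.append(row)
    return genotypes
```

Because four genotypes are packed into each byte, this layout is considerably more compact than the text-based {{Mono|.ped}} format, which is one reason other inputs are usually converted to it first.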
PLINK produces a variety of output files depending on the analysis. It can output files for BEAGLE and can recode a {{Mono|.bed}} file into a VCF for analysis in other programs. PLINK is also designed to work in conjunction with R, and can output files to be processed by certain R packages.
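The conversions between VCF and the binary fileset can be sketched as command lines assembled in Python. The example assumes a PLINK 1.9 executable named {{Mono|plink}} on the search path and uses the {{Mono|--vcf}}, {{Mono|--bfile}}, {{Mono|--make-bed}}, {{Mono|--recode vcf}} and {{Mono|--out}} options; the helper names and file names are hypothetical, and the commands are only built here, not executed.

```python
def vcf_to_bed_command(vcf_path, out_prefix, plink="plink"):
    """Build a command that imports a VCF and writes .bed/.bim/.fam files."""
    return [plink, "--vcf", vcf_path, "--make-bed", "--out", out_prefix]


def bed_to_vcf_command(bfile_prefix, out_prefix, plink="plink"):
    """Build a command that recodes a binary fileset back into a VCF."""
    return [plink, "--bfile", bfile_prefix, "--recode", "vcf", "--out", out_prefix]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run(cmd, check=True)
    # to actually invoke PLINK.
    print(" ".join(vcf_to_bed_command("cohort.vcf", "cohort")))
    print(" ".join(bed_to_vcf_command("cohort", "cohort_export")))
```

Keeping each command as a list of arguments rather than a single string avoids shell-quoting problems when file names contain spaces.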
Extensions and current developments
- PLINK 2.0 is a comprehensive update to PLINK, developed by Christopher Chang, that improves the speed of various genome-wide association (GWA) calculations, including identity-by-state (IBS) matrix calculation, LD-based pruning and association analysis.{{Cite journal|last1=Lee|first1=James J.|last2=Purcell|first2=Shaun M.|last3=Vattikuti|first3=Shashaank|last4=Tellier|first4=Laurent CAM|last5=Chow|first5=Carson C.|last6=Chang|first6=Christopher C.|date=2015-12-01|title=Second-generation PLINK: rising to the challenge of larger and richer datasets|journal=GigaScience|language=en|volume=4|issue=1|pages=7|doi=10.1186/s13742-015-0047-8|pmc=4342193|pmid=25722852 |doi-access=free }}
- PLINK/SEQ is an open-source C/C++ library designed for analyzing large-scale whole-genome and whole-exome studies.
- MQFAM is a multivariate test of association that can be applied efficiently to large population-based samples and is implemented in PLINK.
References
External links
- [http://zzz.bwh.harvard.edu/plink/ PLINK 1.07 homepage]
- [https://www.cog-genomics.org/plink PLINK 1.9 homepage]
{{free-software-stub}}
Category:Bioinformatics software