Whole genome sequencing#Analysis

{{Short description|Sequencing all the DNA of an individual at once}}

{{Redirect|Genome sequencing|the sequencing only of DNA|DNA sequencing}}

File:Chromatogram.jpgs are commonly used to sequence portions of genomes.{{cite book|last1=Alberts|first1=Bruce|last2=Johnson|first2=Alexander|last3=Lewis|first3=Julian|last4=Raff|first4=Martin|last5=Roberts|first5=Keith|last6=Walter|first6=Peter|title=Molecular biology of the cell|date=2008|publisher=Garland Science|location=New York|isbn=978-0-8153-4106-2|page=550|edition=5th|chapter=Chapter 8}}]]

File:Human karyotype with bands and sub-bands.png of a human, showing an overview of the human genome, with 22 homologous chromosomes, both the female (XX) and male (XY) versions of the sex chromosome (bottom right), as well as the mitochondrial genome (to scale at bottom left)]]

Whole genome sequencing (WGS), also known as full genome sequencing or just genome sequencing, is the process of determining the entirety of the DNA sequence of an organism's genome at a single time.{{Cite web|date=28 July 2021|title=What are whole exome sequencing and whole genome sequencing?|url=https://medlineplus.gov/genetics/understanding/testing/sequencing/#:~:text=Another%20method%2C%20called%20whole%20genome%20sequencing%2C%20determines%20the%20order%20of%20all%20the%20nucleotides%20in%20an%20individual's%20DNA%20and%20can%20determine%20variations%20in%20any%20part%20of%20the%20genome.|website=MedlinePlus|publisher=United States National Library of Medicine|access-date=25 April 2025|quote=Another method, called whole genome sequencing, determines the order of all the nucleotides in an individual's DNA and can determine variations in any part of the genome.}}{{Cite web|date=30 July 2024|title=Whole genome sequencing|url=https://www.hca.wa.gov/about-hca/programs-and-initiatives/health-technology-assessment/whole-genome-sequencing#:~:text=Whole%20genome%20sequencing%20(WGS;%20also%20called%20genome%20sequencing%20or%20full%20genome%20sequencing)%20is%20a%20laboratory%20procedure%20for%20determining%20an%20organism%E2%80%99s%20entire%20DNA%20sequence%20in%20one%20procedure.|website=Washington State Health Care Authority|publisher=Washington State Department of Health|access-date=25 April 2025|quote=Whole genome sequencing (WGS; also called genome sequencing or full genome sequencing) is a laboratory procedure for determining an organism’s entire DNA sequence in one procedure.}}{{Cite web|date=April 2024|title=Whole Genome Sequencing Criteria for Prior Authorization|url=https://www.hhs.texas.gov/sites/default/files/documents/whole-genome-sequencing-criteria-for-prior-authorization-draft-public-comment.pdf#page=2|publisher=Texas Health and Human Services Commission|access-date=25 April 2025|quote-page=2|quote=Whole Genome Sequencing (WGS) describes the sequencing 
of the entire human genome, including protein-coding regions (exons) and noncoding regions.}} This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast.

Whole genome sequencing has largely been used as a research tool, but was being introduced to clinics in 2014.{{cite journal|last1=Gilissen|title=Genome sequencing identifies major causes of severe intellectual disability|journal=Nature|pmid=24896178|doi=10.1038/nature13394|volume=511|issue=7509|date=July 2014|pages=344–7|bibcode=2014Natur.511..344G|hdl=2066/138095 |s2cid=205238886|hdl-access=free}}{{cite journal|last1=Nones|first1=K|last2=Waddell|first2=N|last3=Wayte|first3=N|last4=Patch|first4=AM|last5=Bailey|first5=P|last6=Newell|first6=F|last7=Holmes|first7=O|last8=Fink|first8=JL|last9=Quinn|first9=MC|last10=Tang|first10=YH|last11=Lampe|first11=G|last12=Quek|first12=K|last13=Loffler|first13=KA|last14=Manning|first14=S|last15=Idrisoglu|first15=S|last16=Miller|first16=D|last17=Xu|first17=Q|last18=Waddell|first18=N|last19=Wilson|first19=PJ|last20=Bruxner|first20=TJ|last21=Christ|first21=AN|last22=Harliwong|first22=I|last23=Nourse|first23=C|last24=Nourbakhsh|first24=E|last25=Anderson|first25=M|last26=Kazakoff|first26=S|last27=Leonard|first27=C|last28=Wood|first28=S|last29=Simpson|first29=PT|last30=Reid|first30=LE|last31=Krause|first31=L|last32=Hussey|first32=DJ|last33=Watson|first33=DI|last34=Lord|first34=RV|last35=Nancarrow|first35=D|last36=Phillips|first36=WA|last37=Gotley|first37=D|last38=Smithers|first38=BM|last39=Whiteman|first39=DC|last40=Hayward|first40=NK|last41=Campbell|first41=PJ|last42=Pearson|first42=JV|last43=Grimmond|first43=SM|last44=Barbour|first44=AP|title=Genomic catastrophes frequently arise in esophageal adenocarcinoma and drive tumorigenesis|journal=Nature Communications|date=29 October 2014|volume=5|page=5224|pmid=25351503|doi=10.1038/ncomms6224|pmc=4596003|bibcode=2014NatCo...5.5224N}}{{cite journal|last1=van 
El|first1=CG|last2=Cornel|first2=MC|last3=Borry|first3=P|last4=Hastings|first4=RJ|last5=Fellmann|first5=F|last6=Hodgson|first6=SV|last7=Howard|first7=HC|last8=Cambon-Thomsen|first8=A|last9=Knoppers|first9=BM|last10=Meijers-Heijboer|first10=H|last11=Scheffer|first11=H|last12=Tranebjaerg|first12=L|last13=Dondorp|first13=W|last14=de Wert|first14=GM|title=Whole-genome sequencing in health care. Recommendations of the European Society of Human Genetics|journal=European Journal of Human Genetics|date=June 2013|volume=21|issue=Suppl 1 |pages=S1–5|pmid=23819146|doi=10.1038/ejhg.2013.46|pmc=3660957}} In the future of personalized medicine, whole genome sequence data may be an important tool to guide therapeutic intervention.{{cite journal|last1=Mooney|first1=Sean|title=Progress towards the integration of pharmacogenomics in practice|journal=Human Genetics|pmid=25238897|doi=10.1007/s00439-014-1484-7|date=Sep 2014|volume=134|issue=5|pmc=4362928|pages=459–65}} Sequencing at the SNP level is also used to pinpoint functional variants from association studies and to improve the knowledge available to researchers in evolutionary biology, and hence may lay the foundation for predicting disease susceptibility and drug response.

Whole genome sequencing should not be confused with DNA profiling, which only determines the likelihood that genetic material came from a particular individual or group, and does not provide additional information on genetic relationships, origin or susceptibility to specific diseases (Kijk magazine, 1 January 2009). In addition, whole genome sequencing should not be confused with methods that sequence specific subsets of the genome, such as whole exome sequencing (1–2% of the genome) or SNP genotyping (< 0.1% of the genome).

History

File:Haemophilus influenzae 01.jpg.]]

File:C. elegans.jpg|Caenorhabditis elegans was the first animal to have its whole genome sequenced.]]

File:Drosophila melanogaster - front (aka).jpg|Drosophila melanogaster{{'s}} whole genome was sequenced in 2000.]]

File:Arabidopsis thaliana inflorescencias.jpg|Arabidopsis thaliana was the first plant genome sequenced.]]

File:54986main mouse med.jpg|The genome of the laboratory mouse Mus musculus was published in 2002.]]

File:Elaeis guineensis MS 3467.jpg (oil palm). This genome was particularly difficult to sequence because it had many repeated sequences which are difficult to organise.{{cite journal|last1=Marx|first1=Vivien|title=Next-generation sequencing: The genome jigsaw|journal=Nature|date=11 September 2013|volume=501|issue=7466|pages=263–268|doi=10.1038/501261a|pmid=24025842|bibcode=2013Natur.501..263M|doi-access=free}}]]

The DNA sequencing methods used in the 1970s and 1980s, such as Maxam–Gilbert sequencing and Sanger sequencing, were manual. Several whole bacteriophage and animal viral genomes were sequenced by these techniques, but the shift to more rapid, automated sequencing methods in the 1990s facilitated the sequencing of the larger bacterial and eukaryotic genomes.{{cite book|last1=Alberts|first1=Bruce|display-authors=etal|title=Molecular biology of the cell|date=2008|publisher=Garland Science|location=New York|isbn=978-0-8153-4106-2|page=551|edition=5th}}

The first virus to have its complete genome sequenced was the Bacteriophage MS2 by 1976.{{cite journal |last1=Fiers |first1=W. |last2=Contreras |first2=R. |last3=Duerinck |first3=F. |last4=Haegeman |first4=G. |last5=Iserentant |first5=D. |last6=Merregaert |first6=J. |last7=Min Jou |first7=W. |last8=Molemans |first8=F. |last9=Raeymaekers |first9=A. |last10=Van den Berghe |first10=A. |last11=Volckaert |first11=G. |last12=Ysebaert |first12=M. |title=Complete nucleotide sequence of bacteriophage MS2 RNA: primary and secondary structure of the replicase gene |journal=Nature |date=8 April 1976 |volume=260 |issue=5551 |pages=500–507 |doi=10.1038/260500a0 |pmid=1264203 |bibcode=1976Natur.260..500F |s2cid=4289674 }} In 1992, yeast chromosome III was the first chromosome of any organism to be fully sequenced.{{cite journal |last1=Oliver |first1=S. G. |last2=van der Aart |first2=Q. J. M. |last3=Agostoni-Carbone |first3=M. L. |last4=Aigle |first4=M. |last5=Alberghina |first5=L. |last6=Alexandraki |first6=D. |last7=Antoine |first7=G. |last8=Anwar |first8=R. |last9=Ballesta |first9=J. P. G. |last10=Benit |first10=P. |last11=Berben |first11=G. |last12=Bergantino |first12=E. |last13=Biteau |first13=N. |last14=Bolle |first14=P. A. |last15=Bolotin-Fukuhara |first15=M. |last16=Brown |first16=A. |last17=Brown |first17=A. J. P. |last18=Buhler |first18=J. M. |last19=Carcano |first19=C. |last20=Carignani |first20=G. |last21=Cederberg |first21=H. |last22=Chanet |first22=R. |last23=Contreras |first23=R. |last24=Crouzet |first24=M. |last25=Daignan-Fornier |first25=B. |last26=Defoor |first26=E. |last27=Delgado |first27=M. |last28=Demolder |first28=J. |last29=Doira |first29=C. |last30=Dubois |first30=E. |last31=Dujon |first31=B. |last32=Dusterhoft |first32=A. |last33=Erdmann |first33=D. |last34=Esteban |first34=M. |last35=Fabre |first35=F. |last36=Fairhead |first36=C. |last37=Faye |first37=G. |last38=Feldmann |first38=H. |last39=Fiers |first39=W. |last40=Francingues-Gaillard |first40=M. C. 
|last41=Franco |first41=L. |last42=Frontali |first42=L. |last43=Fukuhara |first43=H. |last44=Fuller |first44=L. J. |last45=Galland |first45=P. |last46=Gent |first46=M. E. |last47=Gigot |first47=D. |last48=Gilliquet |first48=V. |last49=Glansdorff |first49=N. |last50=Goffeau |first50=A. |last51=Grenson |first51=M. |last52=Grisanti |first52=P. |last53=Grivell |first53=L. A. |last54=de Haan |first54=M. |last55=Haasemann |first55=M. |last56=Hatat |first56=D. |last57=Hoenicka |first57=J. |last58=Hegemann |first58=J. |last59=Herbert |first59=C. J. |last60=Hilger |first60=F. |last61=Hohmann |first61=S. |last62=Hollenberg |first62=C. P. |last63=Huse |first63=K. |last64=Iborra |first64=F. |last65=Indje |first65=K. J. |last66=Isono |first66=K. |last67=Jacq |first67=C. |last68=Jacquet |first68=M. |last69=James |first69=C. M. |last70=Jauniaux |first70=J. C. |last71=Jia |first71=Y. |last72=Jimenez |first72=A. |last73=Kelly |first73=A. |last74=Kleinhans |first74=U. |last75=Kreisl |first75=P |last76=Lanfranchi |first76=G |last77=Lewis |first77=C |last78=vanderLinden |first78=C. G. |last79=Lucchini |first79=G |last80=Lutzenkirchen |first80=K |last81=Maat |first81=M.J. |last82=Mallet |first82=L. |last83=Mannhaupet |first83=G. |last84=Martegani |first84=E. |last85=Mathieu |first85=A. |last86=Maurer |first86=C. T. C. |last87=McConnell |first87=D. |last88=McKee |first88=R. A. |last89=Messenguy |first89=F. |last90=Mewes |first90=H. W. |last91=Molemans |first91=F. |last92=Montague |first92=M. A. |last93=Muzi Falconi |first93=M. |last94=Navas |first94=L. |last95=Newlon |first95=C. S. |last96=Noone |first96=D. |last97=Pallier |first97=C. |last98=Panzeri |first98=L. |last99=Pearson |first99=B. M. |last100=Perea |first100=J. |last101=Philippsen |first101=P. |last102=Pierard |first102=A. |last103=Planta |first103=R. J. |last104=Plevani |first104=P. |last105=Poetsch |first105=B. |last106=Pohl |first106=F. |last107=Purnelle |first107=B. |last108=Ramezani Rad |first108=M. 
|last109=Rasmussen |first109=S. W. |last110=Raynal |first110=A. |last111=Remacha |first111=M. |last112=Richterich |first112=P. |last113=Roberts |first113=A. B. |last114=Rodriguez |first114=F. |last115=Sanz |first115=E. |last116=Schaaff-Gerstenschlager |first116=I. |last117=Scherens |first117=B. |last118=Schweitzer |first118=B. |last119=Shu |first119=Y. |last120=Skala |first120=J. |last121=Slonimski |first121=P. P. |last122=Sor |first122=F. |last123=Soustelle |first123=C. |last124=Spiegelberg |first124=R. |last125=Stateva |first125=L. I. |last126=Steensma |first126=H. Y. |last127=Steiner |first127=S. |last128=Thierry |first128=A. |last129=Thireos |first129=G. |last130=Tzermia |first130=M. |last131=Urrestarazu |first131=L. A. |last132=Valle |first132=G. |last133=Vetter |first133=I. |last134=van Vliet-Reedijk |first134=J. C. |last135=Voet |first135=M. |last136=Volckaert |first136=G. |last137=Vreken |first137=P. |last138=Wang |first138=H. |last139=Warmington |first139=J. R. |last140=von Wettstein |first140=D. |last141=Wicksteed |first141=B. L. |last142=Wilson |first142=C. |last143=Wurst |first143=H. |last144=Xu |first144=G. |last145=Yoshikawa |first145=A. |last146=Zimmermann |first146=F. K. |last147=Sgouros |first147=J. G. 
|display-authors=3 |title=The complete DNA sequence of yeast chromosome III |journal=Nature |date=May 1992 |volume=357 |issue=6373 |pages=38–46 |doi=10.1038/357038a0 |pmid=1574125 |bibcode=1992Natur.357...38O |s2cid=4271784 }} The first organism whose entire genome was fully sequenced was Haemophilus influenzae in 1995.{{cite journal|last1=Fleischmann|first1=R.|last2=Adams|first2=M.|last3=White|first3=O|last4=Clayton|first4=R.|last5=Kirkness|first5=E.|last6=Kerlavage|first6=A.|last7=Bult|first7=C.|last8=Tomb|first8=J.|last9=Dougherty|first9=B.|last10=Merrick|first10=J.|display-authors=etal|title=Whole-genome random sequencing and assembly of Haemophilus influenzae Rd|journal=Science|date=28 July 1995|volume=269|issue=5223|pages=496–512|doi=10.1126/science.7542800|pmid=7542800|bibcode=1995Sci...269..496F}} The genomes of other bacteria and some archaea followed, largely because of their small size. H. influenzae has a genome of 1,830,140 base pairs of DNA. In contrast, eukaryotes, whether unicellular such as Amoeba dubia or multicellular such as humans (Homo sapiens), have much larger genomes (see C-value paradox).{{cite journal|last1=Eddy|first1=Sean R.|title=The C-value paradox, junk DNA and ENCODE|journal=Current Biology|date=November 2012|volume=22|issue=21|pages=R898–R899|doi=10.1016/j.cub.2012.10.002|pmid=23137679|doi-access=free|bibcode=2012CBio...22.R898E }} Amoeba dubia has a genome of 700 billion nucleotide pairs spread across thousands of chromosomes.{{cite journal|last1=Pellicer|first1=Jaume|last2=Fay|first2=Michael F.|last3=Leitch|first3=Ilia J.|title=The largest eukaryotic genome of them all?|journal=Botanical Journal of the Linnean Society|date=15 September 2010|volume=164|issue=1|pages=10–15|doi=10.1111/j.1095-8339.2010.01072.x|doi-access=}} The human genome contains far fewer nucleotide pairs than that of A. dubia (about 3.2 billion in each germ cell, although the exact size of the human genome is still being revised), but it is still far larger than the genomes of individual bacteria.{{cite journal|last1=Human Genome Sequencing Consortium|first1=International|title=Finishing the euchromatic sequence of the human genome|journal=Nature|date=21 October 2004|volume=431|issue=7011|pages=931–945|doi=10.1038/nature03001|pmid=15496913|bibcode=2004Natur.431..931H|doi-access=free}}

The first bacterial and archaeal genomes, including that of H. influenzae, were sequenced by shotgun sequencing. In 1996, the first eukaryotic genome (Saccharomyces cerevisiae) was sequenced. S. cerevisiae, a model organism in biology, has a genome of only around 12 million nucleotide pairs,{{cite journal|last1=Goffeau|first1=A.|last2=Barrell|first2=B. G.|last3=Bussey|first3=H.|last4=Davis|first4=R. W.|last5=Dujon|first5=B.|last6=Feldmann|first6=H.|last7=Galibert|first7=F.|last8=Hoheisel|first8=J. D.|last9=Jacq|first9=C.|last10=Johnston|first10=M.|last11=Louis|first11=E. J.|last12=Mewes|first12=H. W.|last13=Murakami|first13=Y.|last14=Philippsen|first14=P.|last15=Tettelin|first15=H.|last16=Oliver|first16=S. G.|title=Life with 6000 Genes|journal=Science|date=25 October 1996|volume=274|issue=5287|pages=546–567|doi=10.1126/science.274.5287.546|url=https://www.researchgate.net/publication/14356864|pmid=8849441|url-status=live|archive-url=https://web.archive.org/web/20160307151033/https://www.researchgate.net/profile/Edward_Louis/publication/14356864_Life_with_6000_genes/links/0912f50caf85c9a96e000000.pdf|archive-date=7 March 2016|bibcode=1996Sci...274..546G|s2cid=16763139}} and was the first unicellular eukaryote to have its whole genome sequenced. The first multicellular eukaryote, and animal, to have its whole genome sequenced was the nematode worm Caenorhabditis elegans in 1998.{{cite journal|author=The C. elegans Sequencing Consortium|title=Genome Sequence of the Nematode C. elegans: A Platform for Investigating Biology|journal=Science|date=11 December 1998|volume=282|issue=5396|pages=2012–2018|doi=10.1126/science.282.5396.2012|pmid=9851916|bibcode=1998Sci...282.2012.}} Eukaryotic genomes are sequenced by several methods, including shotgun sequencing of short DNA fragments and sequencing of larger DNA clones from DNA libraries such as bacterial artificial chromosomes (BACs) and yeast artificial chromosomes (YACs).{{cite book |first1=Bruce |last1=Alberts |title= Molecular Biology of the Cell |date=2008 |publisher= Garland Science |location=New York|isbn=978-0-8153-4106-2|page=552|edition=5th}}

In 1999, the entire DNA sequence of human chromosome 22, the second shortest human autosome, was published.{{cite journal|last1=Dunham|first1=I.|title=The DNA sequence of human chromosome 22|volume=402|issue=6761|doi=10.1038/990031|journal=Nature|pages=489–495|pmid=10591208|date=December 1999|bibcode=1999Natur.402..489D|doi-access=free}} By the year 2000, the second animal and second invertebrate (yet first insect) genome was sequenced – that of the fruit fly Drosophila melanogaster – a popular choice of model organism in experimental research.{{cite journal|title=The Genome Sequence of Drosophila melanogaster|journal=Science|date=2000-03-24|volume=287|issue=5461|pages=2185–2195|doi=10.1126/science.287.5461.2185|pmid=10731132|author=Adams MD|author2=Celniker SE|author3=Holt RA|display-authors=etal|bibcode=2000Sci...287.2185.|citeseerx=10.1.1.549.8639}} The first plant genome – that of the model organism Arabidopsis thaliana – was also fully sequenced by 2000.{{cite journal|title=Analysis of the genome sequence of the flowering plant Arabidopsis thaliana|journal=Nature|date=2000-12-14|pages=796–815|pmid=11130711|doi=10.1038/35048692|volume=408|issue=6814|bibcode=2000Natur.408..796T|last1=The Arabidopsis Genome Initiative|doi-access=free}} By 2001, a draft of the entire human genome sequence was published.{{cite journal|title=The Sequence of the Human Genome|journal=Science|date=2001-02-16|volume=291|issue=5507|pages=1304–1351|doi=10.1126/science.1058040|pmid=11181995|author=Venter JC|author2= Adams MD|author3=Myers EW|display-authors=etal|bibcode=2001Sci...291.1304V|doi-access=}} The genome of the laboratory mouse Mus musculus was completed in 2002.{{cite journal|title=Initial sequencing and comparative analysis of the mouse genome|journal=Nature|date=2002-10-31|volume=420|issue=6915|doi=10.1038/nature01262|pages=520–562|pmid=12466850|author=Waterston RH|author2=Lindblad-Toh K|author3=Birney E|display-authors=etal|bibcode=2002Natur.420..520W|doi-access=free}}

In 2004, the Human Genome Project published an incomplete version of the human genome.{{cite journal|date=2004-10-21|title=Finishing the euchromatic sequence of the human genome|journal=Nature|volume=431|issue=7011|pages=931–945|doi=10.1038/nature03001|pmid=15496913|bibcode=2004Natur.431..931H|author1=International Human Genome Sequencing Consortium|doi-access=free}} In 2008, a group from Leiden, the Netherlands, reported the sequencing of the first female human genome, that of Marjolein Kriek.

Thousands of genomes have since been wholly or partially sequenced.

Experimental details

= Cells used for sequencing =

Almost any biological sample containing a full copy of the DNA—even a very small amount of DNA or ancient DNA—can provide the genetic material necessary for full genome sequencing. Such samples may include saliva, epithelial cells, bone marrow, hair (as long as the hair contains a hair follicle), seeds, plant leaves, or anything else that has DNA-containing cells.

The genome sequence of a single cell selected from a mixed population of cells can be determined using techniques of single cell genome sequencing. This has important advantages in environmental microbiology in cases where a single cell of a particular microorganism species can be isolated from a mixed population by microscopy on the basis of its morphological or other distinguishing characteristics. In such cases the normally necessary steps of isolation and growth of the organism in culture may be omitted, thus allowing the sequencing of a much greater spectrum of organism genomes.{{cite journal|last=Braslavsky|first=Ido|title=Sequence information can be obtained from single DNA molecules|journal=Proc Natl Acad Sci USA|date=2003|volume=100|issue=7|pages=3960–3984|doi=10.1073/pnas.0230489100|pmid=12651960|pmc=153030|display-authors=etal|bibcode=2003PNAS..100.3960B|doi-access=free}}

Single cell genome sequencing is being tested as a method of preimplantation genetic diagnosis, wherein a cell from the embryo created by in vitro fertilization is taken and analyzed before embryo transfer into the uterus.{{cite magazine |url=http://www.genomeweb.com/sequencing/single-cell-sequencing-makes-strides-clinic-cancer-and-pgd-first-applications |url-access=subscription |title=Single-cell Sequencing Makes Strides in the Clinic with Cancer and PGD First Applications |magazine=Clinical Sequencing News |first=Monica |last=Heger |date=October 2, 2013}} After implantation, cell-free fetal DNA can be taken by simple venipuncture from the mother and used for whole genome sequencing of the fetus.{{Cite journal | pmid = 24428465| date = 2014| last1 = Yurkiewicz| first1 = I. R.| title = Prenatal whole-genome sequencing--is the quest to know a fetus's future ethical?| journal = New England Journal of Medicine| volume = 370| issue = 3| pages = 195–7| last2 = Korf| first2 = B. R.| last3 = Lehmann| first3 = L. S.| doi = 10.1056/NEJMp1215536}}

= Early techniques =

File:ABI PRISM 3100 Genetic Analyzer 3.jpg

Sequencing of nearly an entire human genome was first accomplished in 2000 partly through the use of shotgun sequencing technology. While full genome shotgun sequencing for small (4000–7000 base pair) genomes was already in use in 1979,{{cite journal | author = Staden R | title = A strategy of DNA sequencing employing computer programs | journal = Nucleic Acids Res. | volume = 6 | issue = 7 | pages = 2601–10 |date=June 1979 | pmid = 461197 | pmc = 327874 | doi = 10.1093/nar/6.7.2601| url = }} broader application benefited from pairwise end sequencing, known colloquially as double-barrel shotgun sequencing. As sequencing projects began to take on longer and more complicated genomes, multiple groups began to realize that useful information could be obtained by sequencing both ends of a fragment of DNA. Although sequencing both ends of the same fragment and keeping track of the paired data was more cumbersome than sequencing a single end of two distinct fragments, the knowledge that the two sequences were oriented in opposite directions and were about the length of a fragment apart from each other was valuable in reconstructing the sequence of the original target fragment.
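The paired-end constraint described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the method of any particular assembler: the `Placement` class, the `mates_consistent` function, the `tolerance` parameter and the coordinates are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where one read of a pair lands on a candidate assembly (hypothetical model)."""
    start: int      # 0-based leftmost coordinate of the read
    length: int     # read length in bases
    forward: bool   # True if the read aligns to the forward strand


def mates_consistent(a: Placement, b: Placement,
                     fragment_len: int, tolerance: int) -> bool:
    """Return True if two mate placements agree with the paired-end constraint.

    The mates must have opposite orientations, with the forward read on the
    left, and the span they cover must be close to the expected fragment length.
    """
    if a.forward == b.forward:
        return False  # mates from one fragment are oriented in opposite directions
    left, right = (a, b) if a.forward else (b, a)
    if left.start > right.start:
        return False  # the reads point away from each other
    span = right.start + right.length - left.start
    return abs(span - fragment_len) <= tolerance


# Two 500-base reads from the ends of an (assumed) 2,000-base fragment.
fwd = Placement(start=100, length=500, forward=True)
rev = Placement(start=1600, length=500, forward=False)
print(mates_consistent(fwd, rev, fragment_len=2000, tolerance=200))
```

A candidate layout that places the two mates on the same strand, or much further apart than the fragment length, would be rejected by this check, which is the kind of information a single-end read cannot provide.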

The first published description of the use of paired ends was in 1990 as part of the sequencing of the human HPRT locus,{{cite journal | author = Edwards A | author2 = Voss H | author3 = Rice P| author4 = Civitello A | author5 = Stegemann J| author6 = Schwager C | author7 = Zimmermann J| author8 = Erfle H | author9 = Caskey CT| author10 = Ansorge W | title = Automated DNA sequencing of the human HPRT locus | journal = Genomics | volume = 6 | issue = 4 | pages = 593–608 |date=April 1990 | pmid = 2341149 | doi = 10.1016/0888-7543(90)90493-E}} although the use of paired ends was limited to closing gaps after the application of a traditional shotgun sequencing approach. The first theoretical description of a pure pairwise end sequencing strategy, assuming fragments of constant length, was in 1991.{{cite journal | last = Edwards | first = A |author2=Caskey, T | title = Closure strategies for random DNA sequencing | journal = Methods: A Companion to Methods in Enzymology | volume = 3 | issue = 1 | pages = 41–47 | date = 1991| doi = 10.1016/S1046-2023(05)80162-8 }} In 1995, the use of fragments of varying sizes was introduced,{{cite journal | author = Roach JC | author2 = Boysen C | author3 = Wang K| author4 = Hood L | title = Pairwise end sequencing: a unified approach to genomic mapping and sequencing | journal = Genomics | volume = 26 | issue = 2 | pages = 345–53 |date=March 1995 | pmid = 7601461 | doi = 10.1016/0888-7543(95)80219-C }} demonstrating that a pure pairwise end-sequencing strategy would be possible on large targets. 
The strategy was subsequently adopted by The Institute for Genomic Research (TIGR) to sequence the entire genome of the bacterium Haemophilus influenzae in 1995,{{cite journal | author = Fleischmann RD | author2 = Adams MD | author3 = White O| author4 = Clayton RA | author5 = Kirkness EF| author6 = Kerlavage AR | author7 = Bult CJ| author8 = Tomb JF | author9 = Dougherty BA| author10 = Merrick JM | title = Whole-genome random sequencing and assembly of Haemophilus influenzae Rd | journal = Science | volume = 269 | issue = 5223 | pages = 496–512 |date=July 1995 | pmid = 7542800 | doi = 10.1126/science.7542800 |bibcode = 1995Sci...269..496F | last11 = McKenney | last12 = Sutton | last13 = Fitzhugh | last14 = Fields | last15 = Gocyne | last16 = Scott | last17 = Shirley | last18 = Liu | last19 = Glodek | last20 = Kelley | last21 = Weidman | last22 = Phillips | last23 = Spriggs | last24 = Hedblom | last25 = Cotton | last26 = Utterback | last27 = Hanna | last28 = Nguyen | last29 = Saudek | last30 = Brandon | display-authors = 29 }} and then by Celera Genomics to sequence the entire fruit fly genome in 2000,{{cite journal | last = Adams | first = MD | title = The genome sequence of Drosophila melanogaster | journal = Science | volume = 287 | issue = 5461 | pages = 2185–95 | date = 2000| pmid = 10731132 | doi = 10.1126/science.287.5461.2185 | bibcode=2000Sci...287.2185.|display-authors=etal| citeseerx = 10.1.1.549.8639 }} and subsequently the entire human genome. Applied Biosystems, now called Life Technologies, manufactured the automated capillary sequencers utilized by both Celera Genomics and The Human Genome Project.

= Current techniques =

While capillary sequencing was the first approach to successfully sequence a nearly full human genome, it is still too expensive and takes too long for commercial purposes. Since 2005, capillary sequencing has been progressively displaced by high-throughput (formerly "next-generation") sequencing technologies such as Illumina dye sequencing, pyrosequencing, and SMRT sequencing.{{cite journal | author = Mukhopadhyay R | title = DNA sequencers: the next generation | journal = Anal. Chem. | volume = 81| issue = 5| pages = 1736–40|date=February 2009 | pmid = 19193124 | doi = 10.1021/ac802712u }} All of these technologies continue to employ the basic shotgun strategy, namely, parallelization and template generation via genome fragmentation.
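The shotgun strategy mentioned above, in which a genome is broken into many fragments that are sequenced in parallel, can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a model of any real platform: the function names, the fixed read length and the uniform random sampling are all illustrative choices, and real reads do not come labelled with their start position (it is kept here only to measure coverage).

```python
import random


def shotgun_reads(genome: str, read_len: int, n_reads: int, seed: int = 0):
    """Sample fixed-length reads from random positions, as (start, sequence) pairs."""
    last_start = len(genome) - read_len
    if last_start < 0:
        raise ValueError("read_len exceeds genome length")
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # inclusive on both ends
        reads.append((start, genome[start:start + read_len]))
    return reads


def mean_coverage(genome_len: int, read_len: int, n_reads: int) -> float:
    """Average number of reads covering each base: N * L / G."""
    return n_reads * read_len / genome_len


def covered_fraction(genome_len: int, reads) -> float:
    """Fraction of genome positions covered by at least one read."""
    covered = [False] * genome_len
    for start, seq in reads:
        for i in range(start, start + len(seq)):
            covered[i] = True
    return sum(covered) / genome_len


toy_genome = "ACGT" * 250                      # 1,000-base toy sequence
reads = shotgun_reads(toy_genome, read_len=100, n_reads=50)
print(mean_coverage(len(toy_genome), 100, 50))  # 5.0
print(covered_fraction(len(toy_genome), reads))
```

Because fragments are sampled at random, some positions are covered many times and others may be missed, which is why sequencing projects aim for a mean coverage well above one.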

Other technologies have emerged, including Nanopore technology. Although the sequencing accuracy of Nanopore technology is lower than that of the technologies above, its reads are on average much longer.{{Cite journal |last1=Sevim |first1=Volkan |last2=Lee |first2=Juna |last3=Egan |first3=Robert |last4=Clum |first4=Alicia |last5=Hundley |first5=Hope |last6=Lee |first6=Janey |last7=Everroad |first7=R. Craig |last8=Detweiler |first8=Angela M. |last9=Bebout |first9=Brad M. |last10=Pett-Ridge |first10=Jennifer |last11=Göker |first11=Markus |last12=Murray |first12=Alison E. |last13=Lindemann |first13=Stephen R. |last14=Klenk |first14=Hans-Peter |last15=O'Malley |first15=Ronan |date=2019-11-26 |title=Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies |journal=Scientific Data |language=en |volume=6 |issue=1 |pages=285 |doi=10.1038/s41597-019-0287-z |pmid=31772173 |pmc=6879543 |bibcode=2019NatSD...6..285S |issn=2052-4463}} These long reads are especially valuable in de novo whole-genome sequencing applications.{{Cite journal |last1=Wang |first1=Yunhao |last2=Zhao |first2=Yue |last3=Bollas |first3=Audrey |last4=Wang |first4=Yuru |last5=Au |first5=Kin Fai |date=November 2021 |title=Nanopore sequencing technology, bioinformatics and applications |journal=Nature Biotechnology |language=en |volume=39 |issue=11 |pages=1348–1365 |doi=10.1038/s41587-021-01108-x |pmid=34750572 |pmc=8988251 |issn=1546-1696}}

= Analysis =

In principle, full genome sequencing can provide the raw nucleotide sequence of an individual organism's DNA at a single point in time. However, further analysis must be performed to provide the biological or medical meaning of this sequence, such as how this knowledge can be used to help prevent disease. Methods for analyzing sequencing data are being developed and refined.

Because sequencing generates large amounts of data (for example, each human diploid genome contains approximately six billion base pairs), its output is stored electronically, and storing and analyzing it requires substantial computing power and storage capacity.
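A rough sense of the scale involved can be obtained with simple arithmetic. The sketch below is illustrative only: the two-bit encoding (one of four bases per position, ignoring ambiguity codes and quality scores) and the 30-fold depth are assumptions chosen for the example, and the function names are hypothetical.

```python
def packed_size_bytes(n_bases: int, bits_per_base: int = 2) -> int:
    """Bytes needed to store n_bases at bits_per_base bits each, rounded up."""
    return (n_bases * bits_per_base + 7) // 8


def raw_read_bases(genome_bases: int, depth: int) -> int:
    """Total bases in the raw reads if every position is read `depth` times on average."""
    return genome_bases * depth


DIPLOID_BASES = 6_000_000_000  # approximate figure given in the text

# Finished sequence only: 2 bits per base versus one text character (8 bits) per base.
print(packed_size_bytes(DIPLOID_BASES))      # 1500000000 bytes, i.e. 1.5 GB
print(packed_size_bytes(DIPLOID_BASES, 8))   # 6000000000 bytes, i.e. 6 GB

# Raw data before assembly, at an assumed (illustrative) depth of 30.
print(raw_read_bases(DIPLOID_BASES, 30))     # 180000000000 bases
```

The finished sequence itself is therefore modest in size; most of the storage and computing burden comes from the redundant raw reads and the per-base quality information that accompanies them, which this sketch does not model.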

While analysis of WGS data can be slow, it is possible to speed up this step by using dedicated hardware.{{cite web |last=Strickland |first=Eliza |url=https://spectrum.ieee.org/tech-talk/biomedical/diagnostics/new-genetic-technologies-diagnose-critically-ill-infants-within-26-hours |title=New Genetic Technologies Diagnose Critically Ill Infants Within 26 Hours – IEEE Spectrum |website=IEEE |date=2015-10-14 |access-date=2016-11-11 |url-status=dead |archive-url=https://web.archive.org/web/20151116160203/https://spectrum.ieee.org/tech-talk/biomedical/diagnostics/new-genetic-technologies-diagnose-critically-ill-infants-within-26-hours |archive-date=2015-11-16 }}

Commercialization

File:Historic cost of sequencing a human genome.svg]]

A number of public and private companies are competing to develop a full genome sequencing platform that is commercially robust for both research and clinical use,{{cite web|url=http://www.genengnews.com/articles/chitem.aspx?aid=939&chid=1 |archive-url=https://web.archive.org/web/20061017014410/http://www.genengnews.com/articles/chitem.aspx?aid=939&chid=1 |url-status=dead |archive-date=2006-10-17 |title=Article : Race to Cut Whole Genome Sequencing Costs Genetic Engineering & Biotechnology News — Biotechnology from Bench to Business |publisher=Genengnews.com |access-date=2009-02-23}} including Illumina,{{cite web |url=http://www.eyeondna.com/2008/02/11/whole-genome-sequencing-costs-continue-to-drop/ |title=Whole Genome Sequencing Costs Continue to Drop |publisher=Eyeondna.com |access-date=2009-02-23 |url-status=live|archive-url=https://web.archive.org/web/20090325014530/http://www.eyeondna.com/2008/02/11/whole-genome-sequencing-costs-continue-to-drop/ |archive-date=2009-03-25 }} Knome,{{cite news |url=http://www.scientificamerican.com/article.cfm?id=personal-genome-sequencing&print=true |title=Genome Sequencing for the Rest of Us |publisher=Scientific American |date=2010-06-28 |access-date=2010-08-13 |first=Katherine |last=Harmon |url-status=live |archive-url=https://web.archive.org/web/20110319102356/http://www.scientificamerican.com/article.cfm?id=personal-genome-sequencing&print=true |archive-date=2011-03-19 }} Sequenom,{{cite web |author=San Diego/Orange County Technology News |url=http://www.freshnews.com/news/biotech-biomedical/article_39927.html |archive-url=https://web.archive.org/web/20081205063117/http://www.freshnews.com/news/biotech-biomedical/article_39927.html |url-status=dead |archive-date=2008-12-05 |title=Sequenom to Develop Third-Generation Nanopore-Based Single Molecule Sequencing Technology |publisher=Freshnews.com |access-date=2009-02-24 }}
454 Life Sciences,{{cite web|url=http://www.genengnews.com/articles/chitem.aspx?aid=658&chid=2 |archive-url=https://web.archive.org/web/20061017003652/http://www.genengnews.com/articles/chitem.aspx?aid=658&chid=2 |url-status=dead |archive-date=2006-10-17 |title=Article : Whole Genome Sequencing in 24 Hours Genetic Engineering & Biotechnology News — Biotechnology from Bench to Business |publisher=Genengnews.com |access-date=2009-02-23}} Pacific Biosciences,{{cite web |url=https://venturebeat.com/2008/02/10/pacific-bio-lifts-the-veil-on-its-high-speed-genome-sequencing-effort/ |title=Pacific Bio lifts the veil on its high-speed genome-sequencing effort |publisher=VentureBeat |access-date=2009-02-23 |url-status=live |archive-url=https://web.archive.org/web/20090220050530/http://venturebeat.com/2008/02/10/pacific-bio-lifts-the-veil-on-its-high-speed-genome-sequencing-effort/ |archive-date=2009-02-20 |date=2008-02-10 }} Complete Genomics,{{cite web |url=http://www.bio-itworld.com/headlines/2008/oct06/complete-genomics-dna-nanoballs.html |title=Bio-IT World |publisher=Bio-IT World |date=2008-10-06 |access-date=2009-02-23 |url-status=live |archive-url=https://web.archive.org/web/20090217054635/http://www.bio-itworld.com/headlines/2008/oct06/complete-genomics-dna-nanoballs.html |archive-date=2009-02-17 }}
Helicos Biosciences,{{cite web |url=http://www.xconomy.com/boston/2008/04/22/with-new-machine-helicos-brings-personal-genome-sequencing-a-step-closer/ |title=With New Machine, Helicos Brings Personal Genome Sequencing A Step Closer |publisher=Xconomy |date=2008-04-22 |access-date=2011-01-28 |url-status=live |archive-url=https://web.archive.org/web/20110102080728/http://www.xconomy.com/boston/2008/04/22/with-new-machine-helicos-brings-personal-genome-sequencing-a-step-closer/ |archive-date=2011-01-02 }} GE Global Research (General Electric), Affymetrix, IBM, Intelligent Bio-Systems,{{cite web |url=http://nextbigfuture.com/2008/03/genome-sequencing-costs-continue-to.html |title=Whole genome sequencing costs continue to fall: $300 million in 2003, $1 million 2007, $60,000 now, $5000 by year end |publisher=Nextbigfuture.com |date=2008-03-25 |access-date=2011-01-28 |url-status=live |archive-url=https://web.archive.org/web/20101220100859/http://nextbigfuture.com/2008/03/genome-sequencing-costs-continue-to.html |archive-date=2010-12-20 }} Life Technologies, Oxford Nanopore Technologies,{{cite web|url=http://www.technologyreview.com/biomedicine/22112// |archive-url=https://wayback.archive-it.org/all/20110329020237/http://www.technologyreview.com/biomedicine/22112/ |url-status=dead |archive-date=2011-03-29 |title=Han Cao's nanofluidic chip could cut DNA sequencing costs dramatically |publisher=Technology Review}} and the Beijing Genomics Institute.{{cite web|url=https://www.genomeweb.com/sequencing-technology/bgi-launches-desktop-sequencer-china-plans-register-platform-cfda|title=BGI Launches Desktop Sequencer in China; Plans to Register Platform With CFDA|author=Julia Karow|date=26 October 2015|publisher=GenomeWeb|access-date=2 December 2018}}{{cite web|url=https://www.360dx.com/sequencing/bgi-launches-new-desktop-sequencer-china-registers-larger-version-cfda?trendmd-shared=0|title=BGI Launches New Desktop Sequencer in China, Registers Larger Version With CFDA|date=11 
November 2016|work=360Dx|publisher=GenomeWeb|access-date=2 December 2018}}{{cite web|url=https://www.genomeweb.com/sequencing/bgi-launches-new-sequencer-customers-report-data-earlier-instruments|title=BGI Launches New Sequencer as Customers Report Data From Earlier Instruments|author=Monica Heger|date=26 October 2018|publisher=GenomeWeb|access-date=2 December 2018}} These companies are heavily financed and backed by venture capitalists, hedge funds, and investment banks.{{cite web |date=2008-07-14 |author=John Carroll |url=http://www.fiercebiotech.com/story/pacific-biosciences-garners-100m-sequencing-tech/2008-07-14 |title=Pacific Biosciences gains $100M for sequencing tech |publisher=FierceBiotech |access-date=2009-02-23 |url-status=live |archive-url=https://web.archive.org/web/20090501083957/http://www.fiercebiotech.com/story/pacific-biosciences-garners-100m-sequencing-tech/2008-07-14 |archive-date=2009-05-01 }}{{cite news|url=http://sanjose.bizjournals.com/sanjose/stories/2009/02/09/story1.html?b=1234155600^1773923 |title=Complete Genomics brings radical reduction in cost |work= Silicon Valley / San Jose Business Journal |date= 2009-02-08|access-date=2009-02-23 |first=Lisa |last=Sibley}}

Until the late 2010s, a commonly referenced commercial target for sequencing cost was $1,000{{Nbsp}}USD; private companies have since been working toward a new target of only $100.{{cite news|url=https://www.ft.com/content/017a3a50-f6f1-11e7-a4c9-bbdefa4f210b|title=Cheaper DNA sequencing unlocks secrets of rare diseases|author=Sarah Neville|date=5 March 2018|publisher=Financial Times|access-date=2 December 2018}}

= Incentive =

In October 2006, the X Prize Foundation, working in collaboration with the J. Craig Venter Science Foundation, established the Archon X Prize for Genomics,{{cite web |last=Carlson |first=Rob |url=http://synthesis.cc/2007/01/a-few-thoughts-on-rapid-genome-sequencing-and-the-archon-prize.html |title=A Few Thoughts on Rapid Genome Sequencing and The Archon Prize — synthesis |publisher=Synthesis.cc |date=2007-01-02 |access-date=2009-02-23 |url-status=live |archive-url=https://web.archive.org/web/20090808010556/http://synthesis.cc/2007/01/a-few-thoughts-on-rapid-genome-sequencing-and-the-archon-prize.html |archive-date=2009-08-08 }} intending to award $10 million to "the first team that can build a device and use it to sequence 100 human genomes within 10 days or less, with an accuracy of no more than one error in every 1,000,000 bases sequenced, with sequences accurately covering at least 98% of the genome, and at a recurring cost of no more than $1,000 per genome".[https://web.archive.org/web/20080421165009/http://genomics.xprize.org/genomics/archon-x-prize-for-genomics/prize-overview "PRIZE Overview: Archon X PRIZE for Genomics"]. 
The Archon X Prize for Genomics was cancelled in 2013, before its official start date.{{cite news|url= http://www.huffingtonpost.com/peter-diamandis/outpaced-by-innovation-ca_b_3795710.html|title= Outpaced by Innovation: Canceling an XPRIZE|work= Huffington Post|first= Peter|last= Diamandis|url-status= live|archive-url= https://web.archive.org/web/20130825060941/http://www.huffingtonpost.com/peter-diamandis/outpaced-by-innovation-ca_b_3795710.html|archive-date= 2013-08-25}}{{cite web|last1=Aldhous|first1=Peter|title=X Prize for genomes cancelled before it begins|url=https://www.newscientist.com/article/dn24105-x-prize-for-genomes-cancelled-before-it-begins/|url-status=live|archive-url=https://web.archive.org/web/20160921040227/https://www.newscientist.com/article/dn24105-x-prize-for-genomes-cancelled-before-it-begins/|archive-date=2016-09-21}}
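
The prize criteria quoted above amount to a set of numeric thresholds. As a minimal sketch (the `Attempt` class and its field names are hypothetical and were not part of the prize rules), they could be checked as follows:

```python
from dataclasses import dataclass

# Thresholds taken from the prize wording quoted in the text.
MIN_GENOMES = 100                 # genomes to be sequenced
MAX_DAYS = 10                     # within 10 days or less
MAX_ERROR_RATE = 1 / 1_000_000    # no more than one error per 1,000,000 bases
MIN_COVERAGE = 0.98               # at least 98% of the genome
MAX_COST_USD = 1_000              # recurring cost per genome


@dataclass
class Attempt:
    """Hypothetical summary of one team's run (illustrative field names)."""
    genomes: int
    days: float
    errors_per_base: float
    coverage: float
    cost_per_genome_usd: float


def meets_criteria(a: Attempt) -> bool:
    """Return True only if every quoted threshold is satisfied."""
    return (
        a.genomes >= MIN_GENOMES
        and a.days <= MAX_DAYS
        and a.errors_per_base <= MAX_ERROR_RATE
        and a.coverage >= MIN_COVERAGE
        and a.cost_per_genome_usd <= MAX_COST_USD
    )


def required_throughput() -> float:
    """Minimum average genomes per day implied by 100 genomes in 10 days."""
    return MIN_GENOMES / MAX_DAYS
```

Under these thresholds a qualifying device would have needed to average at least ten genomes per day.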

= History =

In 2007, Applied Biosystems started selling a new type of sequencer called the SOLiD System.{{cite web |url=http://www.gizmag.com/go/8248/ |title=SOLiD System — a next-gen DNA sequencing platform announced |publisher=Gizmag.com |date=2007-10-27 |access-date=2009-02-24 |url-status=live |archive-url=https://web.archive.org/web/20080719103516/http://www.gizmag.com/go/8248/ |archive-date=2008-07-19 }} The technology allowed users to sequence 60 gigabases per run.{{cite web |url=http://www.dddmag.com/article-The-1000-Genome-Coming-Soon-41310.aspx |title=The $1000 Genome: Coming Soon? |publisher=Dddmag.com |date=2010-04-01 |access-date=2011-01-28 |url-status=live |archive-url=https://web.archive.org/web/20110415094531/http://www.dddmag.com/article-The-1000-Genome-Coming-Soon-41310.aspx |archive-date=2011-04-15 }}

In June 2009, Illumina announced that they were launching their own Personal Full Genome Sequencing Service at a depth of 30× for $48,000 per genome.{{cite web |url=http://www.everygenome.com |title=Individual genome sequencing — Illumina, Inc. |publisher=Everygenome.com |access-date=2011-01-28 |url-status=dead |archive-url=https://web.archive.org/web/20111019160401/http://www.everygenome.com/ |archive-date=2011-10-19 }}{{cite web|url=http://scienceblogs.com/geneticfuture/2009/06/illumina_launches_personal_gen.php |title=Illumina launches personal genome sequencing service for $48,000 : Genetic Future |publisher=Scienceblogs.com |access-date=2011-01-28 |url-status=dead |archive-url=https://web.archive.org/web/20090616110221/http://scienceblogs.com/geneticfuture/2009/06/illumina_launches_personal_gen.php |archive-date=June 16, 2009 }} In August, the founder of Helicos Biosciences, Stephen Quake, stated that using the company's Single Molecule Sequencer he sequenced his own full genome for less than $50,000.{{cite news | url=https://www.nytimes.com/2009/08/11/science/11gene.html | work=The New York Times | title=Cost of Decoding a Genome Is Lowered | first=Nicholas | last=Wade | date=2009-08-11 | access-date=2010-05-03 | url-status=live | archive-url=https://web.archive.org/web/20130521160641/http://www.nytimes.com/2009/08/11/science/11gene.html?_r=1&hp | archive-date=2013-05-21 }} In November, Complete Genomics published a peer-reviewed paper in Science demonstrating its ability to sequence a complete human genome for $1,700.{{cite web|url=http://abcnews.go.com/Technology|title=Technology Index|website=ABC News|access-date=29 April 2018|url-status=bot: unknown|archive-url=http://arquivo.pt/wayback/20160515151839/http://abcnews.go.com/Technology|archive-date=15 May 2016}}{{cite journal |vauthors=Drmanac R, Sparks AB, Callow MJ, etal | year = 2010 | title = Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays | journal = Science | 
volume = 327 | issue = 5961| pages = 78–81 | bibcode = 2010Sci...327...78D | doi = 10.1126/science.1181498 | pmid = 19892942 | s2cid = 17309571 | doi-access = free }}
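
A sequencing depth such as the 30× quoted above is an average: the total number of bases sequenced divided by the size of the genome. The sketch below illustrates that relationship, assuming a haploid genome of roughly 3,200 megabases; the function names are illustrative, and the per-run throughput passed to `runs_needed` is whatever a given instrument delivers.

```python
import math

# Assumption: haploid human genome of about 3,200 megabases.
HAPLOID_GENOME_BASES = 3_200_000_000


def bases_required(depth: float, genome_bases: int = HAPLOID_GENOME_BASES) -> float:
    """Total bases that must be sequenced to reach a given average depth."""
    return depth * genome_bases


def mean_depth(total_bases: float, genome_bases: int = HAPLOID_GENOME_BASES) -> float:
    """Average depth obtained from a given amount of sequence."""
    return total_bases / genome_bases


def runs_needed(depth: float, bases_per_run: float,
                genome_bases: int = HAPLOID_GENOME_BASES) -> int:
    """Whole instrument runs needed to reach the depth at a given throughput."""
    return math.ceil(bases_required(depth, genome_bases) / bases_per_run)
```

For example, `bases_required(30)` gives 96 billion bases, so an instrument producing 60 gigabases per run would need two runs to reach 30× under these assumptions.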

In May 2011, Illumina lowered its Full Genome Sequencing service to $5,000 per human genome, or $4,000 if ordering 50 or more.{{Cite web | url=http://www.bio-itworld.com/news/05/09/2011/Illumina-announces-five-thousand-dollar-genome.html | archive-url=https://web.archive.org/web/20110517091932/http://www.bio-itworld.com/news/05/09/2011/Illumina-announces-five-thousand-dollar-genome.html | url-status=dead | archive-date=2011-05-17 | title=Illumina Announces $5,000 Genome Pricing – Bio-IT World}}

As of 2009, Helicos Biosciences, Pacific Biosciences, Complete Genomics, Illumina, Sequenom, ION Torrent Systems, Halcyon Molecular, NABsys, IBM, and GE Global were all competing to commercialize full genome sequencing.{{Cite journal|journal=Genome Web|url=http://www.genomeweb.com/sequencing/nhgri-awards-more-50m-low-cost-dna-sequencing-tech-development?page=show|title=NHGRI Awards More than $50M for Low-Cost DNA Sequencing Tech Development|date=2009|url-status=live|archive-url=https://web.archive.org/web/20110703232436/http://www.genomeweb.com/sequencing/nhgri-awards-more-50m-low-cost-dna-sequencing-tech-development?page=show|archive-date=2011-07-03}}

With sequencing costs declining, a number of companies began claiming that their equipment would soon achieve the $1,000 genome: these companies included Life Technologies in January 2012,{{cite web |title= Life Technologies Introduces the Benchtop Ion Proton™ Sequencer; Designed to Decode a Human Genome in One Day for $1,000 |type= press release |url= http://www.lifetechnologies.com/us/en/home/about-us/news-gallery/press-releases/2012/life-techologies-itroduces-the-bechtop-io-proto.html.html |access-date=August 30, 2012 |url-status=dead |archive-url=https://web.archive.org/web/20121223070004/http://www.lifetechnologies.com/us/en/home/about-us/news-gallery/press-releases/2012/life-techologies-itroduces-the-bechtop-io-proto.html.html |archive-date= December 23, 2012 }} Oxford Nanopore Technologies in February 2012,{{cite news |author=ANDREW POLLACK |url=https://www.nytimes.com/2012/02/18/health/oxford-nanopore-unveils-tiny-dna-sequencing-device.html |title=Oxford Nanopore Unveils Tiny DNA Sequencing Device – The New York Times |newspaper=The New York Times |date=2012-02-17 |access-date=2016-11-11 |url-status=live |archive-url=https://web.archive.org/web/20130107022605/http://www.nytimes.com/2012/02/18/health/oxford-nanopore-unveils-tiny-dna-sequencing-device.html |archive-date=2013-01-07 }} and Illumina in February 2014.{{cite news | title=Illumina Sequencer Enables $1,000 Genome | type=paper | page=18 | volume=34 | issue=4 | date=15 February 2014 | work=Gen. Eng. Biotechnol. News | department=News: Genomics & Proteomics }}{{cite journal |last1=Check Hayden |first1=Erika |title=Is the $1,000 genome for real? 
|journal=Nature |date=15 January 2014 |pages=nature.2014.14530 |doi=10.1038/nature.2014.14530 |s2cid=211730238 }} In 2015, the NHGRI estimated the cost of obtaining a whole-genome sequence at around $1,500.{{cite web|title=The Cost of Sequencing a Human Genome|url=https://www.genome.gov/27565109/the-cost-of-sequencing-a-human-genome/|website=www.genome.gov|url-status=live|archive-url=https://web.archive.org/web/20161125224738/https://www.genome.gov/27565109/the-cost-of-sequencing-a-human-genome/|archive-date=2016-11-25}} In 2016, Veritas Genetics began selling whole genome sequencing for $999, including a report on some of the information in the sequence.{{Cite web | url=https://www.genomeweb.com/sequencing-technology/999-whole-genome-sequencing-service-veritas-embarks-goal-democratize-dna | title=With $999 Whole-Genome Sequencing Service, Veritas Embarks on Goal to Democratize DNA Information| date=6 March 2016}} In 2017, BGI began offering WGS for $600.{{cite news|url=https://www.wired.com/2017/05/chinese-genome-giant-sets-sights-uitimate-sequencer/|title=A Chinese Genome Giant Sets Its Sights on the Ultimate Sequencer|author=Megan Molteni|date=18 May 2017|publisher=Wired|access-date=2 December 2018}} In summer 2019, Veritas Genetics cut its price for WGS to $599.{{Cite web|url=https://www.cnbc.com/2019/07/01/for-600-veritas-genetics-sequences-6point4-billion-letters-of-your-dna.html|title=23andMe competitor Veritas Genetics slashes price of whole genome sequencing 40% to $600|last=Andrews|first=Joe|date=2019-07-01|website=CNBC|language=en|access-date=2019-09-02}}

However, in 2015, some noted that effective use of whole genome sequencing can cost considerably more than $1,000.{{Cite journal|pmc = 4527943|year = 2015|last1 = Phillips|first1 = K. A|title = Is the "$1000 Genome" Really $1000? Understanding the Full Benefits and Costs of Genomic Sequencing|journal = Technology and Health Care|volume = 23|issue = 3|pages = 373–379|last2 = Pletcher|first2 = M. J|last3 = Ladabaum|first3 = U|pmid = 25669213|doi = 10.3233/THC-150900}} Also, as of 2017, parts of the human genome reportedly had still not been fully sequenced.{{Cite web | url=https://www.veritasgenetics.com/our-thinking/whole-story | title=Blog: True Size of a Human Genome | Veritas Genetics| date=28 July 2017}}{{Cite web | url=https://www.statnews.com/2017/06/20/human-genome-not-fully-sequenced/ | title=Psst, the human genome was never completely sequenced| date=2017-06-20 |work=statnews.com}}

Comparison with other technologies

= DNA microarrays =

Full genome sequencing provides orders of magnitude more information on a genome than DNA arrays, the previous leader in genotyping technology.

For humans, DNA arrays currently provide genotypic information on up to one million genetic variants,{{cite web|url=http://www.gladstone.ucsf.edu/gladstone/site/genomicscore/section/1919 |title=Genomics Core |publisher=Gladstone.ucsf.edu |access-date=2009-02-23 |url-status=dead |archive-url=https://web.archive.org/web/20100630111736/http://www.gladstone.ucsf.edu/gladstone/site/genomicscore/section/1919 |archive-date=June 30, 2010 }}{{cite journal | author = Nishida N | author2 = Koike A | author3 = Tajima A| author4 = Ogasawara Y | author5 = Ishibashi Y| author6 = Uehara Y | author7 = Inoue I| author8 = Tokunaga K | title = Evaluating the performance of Affymetrix SNP Array 6.0 platform with 400 Japanese individuals | journal = BMC Genomics | volume = 9 | issue = 1| page = 431 | date = 2008 | pmid = 18803882 | pmc = 2566316 | doi = 10.1186/1471-2164-9-431 |doi-access=free}}{{cite web |last=Petrone |first=Justin |url=http://www.genomeweb.com/arrays/illumina-decode-build-1m-snp-chip-q2-launch-coincide-release-affys-60-snp-array |title=Illumina, DeCode Build 1M SNP Chip; Q2 Launch to Coincide with Release of Affy's 6.0 SNP Array | BioArray News | Arrays |date=16 January 2007 |publisher=GenomeWeb |access-date=2009-02-23 |url-status=live |archive-url=https://web.archive.org/web/20110716115318/http://www.genomeweb.com/arrays/illumina-decode-build-1m-snp-chip-q2-launch-coincide-release-affys-60-snp-array |archive-date=2011-07-16 }} while full genome sequencing provides information on all six billion bases in the diploid human genome, or about 6,000 times more data. Because of this, full genome sequencing is considered a disruptive innovation to the DNA array markets: the accuracy of both ranges from 99.98% to 99.999% (in non-repetitive DNA regions), and the sequencing consumables cost of $5,000 per 6 billion base pairs is competitive (for some applications) with that of DNA arrays ($500 per 1 million base pairs).
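
The comparison above can be reproduced from the quoted figures alone. The following sketch is illustrative only; the constant and function names are not drawn from any cited source. With the diploid count of six billion bases the data ratio works out to 6,000, whereas a haploid count of about three billion bases would give 3,000.

```python
# Figures as quoted in the text; names are illustrative.
ARRAY_SITES = 1_000_000        # variants genotyped by a DNA array
WGS_SITES = 6_000_000_000      # bases covered by full genome sequencing (diploid)
ARRAY_COST_USD = 500           # consumables cost per array
WGS_COST_USD = 5_000           # consumables cost per sequenced genome


def data_ratio(wgs_sites: int = WGS_SITES, array_sites: int = ARRAY_SITES) -> float:
    """How many times more sites sequencing reports than an array."""
    return wgs_sites / array_sites


def cost_per_site(cost_usd: float, sites: int) -> float:
    """Consumables cost per genotyped variant or sequenced base, in USD."""
    return cost_usd / sites


def per_site_cost_advantage() -> float:
    """Ratio of array cost per site to sequencing cost per site."""
    return (cost_per_site(ARRAY_COST_USD, ARRAY_SITES)
            / cost_per_site(WGS_COST_USD, WGS_SITES))
```

On these numbers sequencing costs ten times more per sample but roughly 600 times less per site, which is why it is competitive only for applications that use the additional sites.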

Applications

= Mutation frequencies =

Whole genome sequencing has been used to establish the mutation frequency across whole human genomes. Between generations (parent to child), humans acquire about 70 new mutations per generation.{{cite journal |author=Roach JC |author2=Glusman G |author3=Smit AF|title=Analysis of genetic inheritance in a family quartet by whole-genome sequencing |journal=Science |volume=328 |issue=5978 |pages=636–9 |date=April 2010 |pmid=20220176 |pmc=3037280 |doi=10.1126/science.1186802 |display-authors=etal|bibcode=2010Sci...328..636R }}{{cite journal |author=Campbell CD |author2=Chong JX |author3=Malig M |title=Estimating the human mutation rate using autozygosity in a founder population |journal=Nat. Genet. |volume=44 |issue=11 |pages=1277–81 |date=November 2012 |pmid=23001126 |pmc=3483378 |doi=10.1038/ng.2418 |display-authors=etal}} An even lower level of variation was found by whole genome sequencing of blood cells from a pair of 100-year-old monozygotic (identical) twins.{{cite journal |author=Ye K |author2=Beekman M |author3=Lameijer EW|author4=Zhang Y |author5=Moed MH |author6=van den Akker EB |author7=Deelen J|author8=Houwing-Duistermaat JJ |author9=Kremer D|author10=Anvar SY |author11=Laros JF |author12=Jones D |author13=Raine K|author14=Blackburne B |author15=Potluri S|author16=Long Q |author17=Guryev V |author18=van der Breggen R |author19=Westendorp RG |author20='t Hoen PA |author21=den Dunnen J |author22=van Ommen GJ |author23=Willemsen G|author24=Pitts SJ |author25=Cox DR |author26=Ning Z |author27=Boomsma DI |author28=Slagboom PE |title=Aging as accelerated accumulation of somatic variants: whole-genome sequencing of centenarian and middle-aged monozygotic twin pairs |journal=Twin Res Hum Genet |volume=16 |issue=6 |pages=1026–32 |date=December 2013 |pmid=24182360 |doi=10.1017/thg.2013.73 |doi-access=free }} Only 8 somatic differences were found, though somatic variation occurring in less than 20% of blood cells would have gone undetected.

Specifically in the protein-coding regions of the human genome, an estimated 0.35 mutations per generation change a protein sequence between parent and child (fewer than one mutated protein per generation).{{cite journal |author=Keightley PD |title=Rates and fitness consequences of new mutations in humans |journal=Genetics |volume=190 |issue=2 |pages=295–304 |date=February 2012 |pmid=22345605 |pmc=3276617 |doi=10.1534/genetics.111.134668 }}

In cancer, mutation frequencies are much higher, due to genome instability. This frequency can further depend on patient age, exposure to DNA-damaging agents (such as UV irradiation or components of tobacco smoke) and the activity or inactivity of DNA repair mechanisms (Mustjoki S, Young NS. Somatic Mutations in "Benign" Disease. N Engl J Med. 2021 May 27;384(21):2039-2052. doi: 10.1056/NEJMra2101920. PMID: 34042390). Furthermore, mutation frequency can vary between cancer types: in germline cells, the mutation rate is approximately 0.023 mutations per megabase, but this number is much higher in breast cancer (1.18–1.66 somatic mutations per Mb), in lung cancer (17.7) or in melanomas (≈33).{{cite journal |author=Tuna M |author2=Amos CI |title=Genomic sequencing in cancer |journal=Cancer Lett. |volume=340 |issue=2 |pages=161–70 |date=November 2013 |pmid=23178448 |doi=10.1016/j.canlet.2012.11.004 |pmc=3622788 }} Since the haploid human genome consists of approximately 3,200 megabases,{{cite web|url=http://sandwalk.blogspot.com/2011/03/how-big-is-human-genome.html|title=Sandwalk: How Big Is the Human Genome?|first=Laurence A.|last=Moran|date=24 March 2011|website=sandwalk.blogspot.com|access-date=29 April 2018|url-status=live|archive-url=https://web.archive.org/web/20171201131353/http://sandwalk.blogspot.com/2011/03/how-big-is-human-genome.html|archive-date=1 December 2017}} this translates into about 74 mutations (mostly in noncoding regions) in germline DNA per generation, but 3,776–5,312 somatic mutations per haploid genome in breast cancer, 56,640 in lung cancer and 105,600 in melanomas.
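
The per-genome figures above follow from multiplying each per-megabase rate by the haploid genome size. A short sketch of that arithmetic, using only the numbers quoted in the text (the dictionary and function names are illustrative):

```python
# Haploid genome size in megabases, as quoted in the text.
HAPLOID_GENOME_MB = 3_200

# Mutations per megabase as quoted; (low, high) bounds are equal
# where the text gives a single value.
RATES_PER_MB = {
    "germline": (0.023, 0.023),
    "breast cancer": (1.18, 1.66),
    "lung cancer": (17.7, 17.7),
    "melanoma": (33.0, 33.0),
}


def mutations_per_genome(rate_per_mb: float,
                         genome_mb: int = HAPLOID_GENOME_MB) -> float:
    """Scale a per-megabase mutation rate to a whole haploid genome."""
    return rate_per_mb * genome_mb


def summary() -> dict:
    """Rounded (low, high) mutation counts per haploid genome for each category."""
    return {
        name: (round(mutations_per_genome(low)), round(mutations_per_genome(high)))
        for name, (low, high) in RATES_PER_MB.items()
    }


if __name__ == "__main__":
    for name, (low, high) in summary().items():
        print(f"{name}: {low:,}" if low == high else f"{name}: {low:,} to {high:,}")
```

The germline value comes out at 73.6 before rounding, which the text reports as about 74.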

The distribution of somatic mutations across the human genome is very uneven,{{cite journal |last1=Hodgkinson |first1=Alan |last2=Chen |first2=Ying |last3=Eyre-Walker |first3=Adam |title=The large-scale distribution of somatic mutations in cancer genomes |journal=Human Mutation |date=January 2012 |volume=33 |issue=1 |pages=136–143 |doi=10.1002/humu.21616 |pmid=21953857 |s2cid=19353116 }} such that the gene-rich, early-replicating regions receive fewer mutations than gene-poor, late-replicating heterochromatin, likely due to differential DNA repair activity.{{cite journal |last1=Supek |first1=Fran |last2=Lehner |first2=Ben |title=Differential DNA mismatch repair underlies mutation rate variation across the human genome |journal=Nature |date=May 2015 |volume=521 |issue=7550 |pages=81–84 |doi=10.1038/nature14173 |pmid=25707793 |pmc=4425546 |bibcode=2015Natur.521...81S }} In particular, the histone modification H3K9me3 is associated with high,{{cite journal |last1=Schuster-Böckler |first1=Benjamin |last2=Lehner |first2=Ben |title=Chromatin organization is a major influence on regional mutation rates in human cancer cells |journal=Nature |date=August 2012 |volume=488 |issue=7412 |pages=504–507 |doi=10.1038/nature11273 |pmid=22820252 |bibcode=2012Natur.488..504S |s2cid=205229634 }} and H3K36me3 with low mutation frequencies.{{cite journal |last1=Supek |first1=Fran |last2=Lehner |first2=Ben |title=Clustered Mutation Signatures Reveal that Error-Prone DNA Repair Targets Mutations to Active Genes |journal=Cell |date=July 2017 |volume=170 |issue=3 |pages=534–547.e23 |doi=10.1016/j.cell.2017.07.003 |pmid=28753428 |hdl=10230/35343 |doi-access=free |hdl-access=free }}

= Genome-wide association studies =

In research, whole-genome sequencing can be used in a genome-wide association study (GWAS), a project aiming to determine the genetic variant or variants associated with a disease or some other phenotype.{{cite journal|last1=Yano|first1=K|last2=Yamamoto|first2=E|last3=Aya|first3=K|last4=Takeuchi|first4=H|last5=Lo|first5=PC|last6=Hu|first6=L|last7=Yamasaki|first7=M|last8=Yoshida|first8=S|last9=Kitano|first9=H|last10=Hirano|first10=K|last11=Matsuoka|first11=M|title=Genome-wide association study using whole-genome sequencing rapidly identifies new genes influencing agronomic traits in rice.|journal=Nature Genetics|date=August 2016|volume=48|issue=8|pages=927–34|pmid=27322545|doi=10.1038/ng.3596|s2cid=22427006}}

= Diagnostic use =

{{Further|Personal genomics|predictive medicine|elective genetic and genomic testing}}

In 2009, Illumina released its first whole genome sequencers that were approved for clinical as opposed to research-only use and doctors at academic medical centers began quietly using them to try to diagnose what was wrong with people whom standard approaches had failed to help.{{cite journal |last1=Borrell |first1=Brendan |title=US clinics quietly embrace whole-genome sequencing |journal=Nature |date=14 September 2010 |pages=news.2010.465 |doi=10.1038/news.2010.465 |doi-access= }} In 2009, a team from Stanford led by Euan Ashley performed clinical interpretation of a full human genome, that of bioengineer Stephen Quake.{{cite journal |last1=Ashley |first1=EA |last2=Butte |first2=AJ |last3=Wheeler |first3=MT |last4=Chen |first4=R |last5=Klein |first5=TE |last6=Dewey |first6=FE |last7=Dudley |first7=JT |last8=Ormond |first8=KE |last9=Pavlovic |first9=A |last10=Morgan |first10=AA |last11=Pushkarev |first11=D |last12=Neff |first12=NF |last13=Hudgins |first13=L |last14=Gong |first14=L |last15=Hodges |first15=LM |last16=Berlin |first16=DS |last17=Thorn |first17=CF |last18=Sangkuhl |first18=K |last19=Hebert |first19=JM |last20=Woon |first20=M |last21=Sagreiya |first21=H |last22=Whaley |first22=R |last23=Knowles |first23=JW |last24=Chou |first24=MF |last25=Thakuria |first25=JV |last26=Rosenbaum |first26=AM |last27=Zaranek |first27=AW |last28=Church |first28=GM |last29=Greely |first29=HT |last30=Quake |first30=SR |last31=Altman |first31=RB |title=Clinical assessment incorporating a personal genome. |journal=Lancet |date=1 May 2010 |volume=375 |issue=9725 |pages=1525–35 |doi=10.1016/S0140-6736(10)60452-7 |pmid=20435227|pmc=2937184 |author-link5=Teri Klein}} In 2010, Ashley's team reported whole genome molecular autopsy{{cite journal |last1=Dewey |first1=Frederick E. |last2=Wheeler |first2=Matthew T. |last3=Cordero |first3=Sergio |last4=Perez |first4=Marco V. |last5=Pavlovic |first5=Aleks |last6=Pushkarev |first6=Dmitry |last7=Freeman |first7=James V. 
|last8=Quake |first8=Steve R. |last9=Ashley |first9=Euan A. |title=Molecular Autopsy for Sudden Cardiac Death Using Whole Genome Sequencing |journal=Journal of the American College of Cardiology |date=April 2011 |volume=57 |issue=14 |pages=E1159 |doi=10.1016/S0735-1097(11)61159-5|doi-access= }} and in 2011, extended the interpretation framework to a fully sequenced family, the West family, who were the first family to be sequenced on the Illumina platform.{{cite journal |last1=Dewey |first1=Frederick E. |last2=Chen |first2=Rong |last3=Cordero |first3=Sergio P. |last4=Ormond |first4=Kelly E. |last5=Caleshu |first5=Colleen |last6=Karczewski |first6=Konrad J. |last7=Whirl-Carrillo |first7=Michelle |last8=Wheeler |first8=Matthew T. |last9=Dudley |first9=Joel T. |last10=Byrnes |first10=Jake K. |last11=Cornejo |first11=Omar E. |last12=Knowles |first12=Joshua W. |last13=Woon |first13=Mark |last14=Sangkuhl |first14=Katrin |last15=Gong |first15=Li |last16=Thorn |first16=Caroline F. |last17=Hebert |first17=Joan M. |last18=Capriotti |first18=Emidio |last19=David |first19=Sean P. |last20=Pavlovic |first20=Aleksandra |last21=West |first21=Anne |last22=Thakuria |first22=Joseph V. |last23=Ball |first23=Madeleine P. |last24=Zaranek |first24=Alexander W. |last25=Rehm |first25=Heidi L. |last26=Church |first26=George M. |last27=West |first27=John S. |last28=Bustamante |first28=Carlos D. |last29=Snyder |first29=Michael |last30=Altman |first30=Russ B. |last31=Klein |first31=Teri E. |last32=Butte |first32=Atul J. |last33=Ashley |first33=Euan A. 
|title=Phased Whole-Genome Genetic Risk in a Family Quartet Using a Major Allele Reference Sequence |journal=PLOS Genetics |date=15 September 2011 |volume=7 |issue=9 |pages=e1002280 |doi=10.1371/journal.pgen.1002280|pmid=21935354 |pmc=3174201 |doi-access=free }} The price to sequence a genome at that time was $19,500{{Nbsp}}USD, which was billed to the patient but usually paid for out of a research grant; one person at that time had applied for reimbursement from their insurance company. In one case, a child had needed around 100 surgeries by the time he was three years old, and his doctor turned to whole genome sequencing to determine the problem; it took a team of around 30 people, including 12 bioinformatics experts, three sequencing technicians, five physicians, two genetic counsellors and two ethicists, to identify a rare mutation in the XIAP gene that was causing widespread problems.{{cite web |url=http://www.jsonline.com/features/health/111224104.html |title=One In A Billion: A boy's life, a medical mystery |website=Jsonline.com |access-date=2016-11-11 |url-status=live |archive-url=https://web.archive.org/web/20131005132647/http://www.jsonline.com/features/health/111224104.html |archive-date=2013-10-05 }}{{cite journal |vauthors=Mayer AN, Dimmock DP, Arca MJ, etal |title=A timely arrival for genomic medicine |journal=Genet. Med. |volume=13 |issue=3 |pages=195–6 |date=March 2011 |pmid=21169843 |doi=10.1097/GIM.0b013e3182095089|s2cid=10802499 |doi-access=free }}

Due to the cost reductions described above, whole genome sequencing has become a realistic application in DNA diagnostics. In 2013, the 3Gb-TEST consortium obtained funding from the European Union to prepare the health care system for these innovations in DNA diagnostics.{{cite web |url=http://cordis.europa.eu/project/rcn/109460_en.html |title=Introducing diagnostic applications of '3Gb-testing' in human genetics |url-status=live |archive-url=https://web.archive.org/web/20141110021815/http://cordis.europa.eu/project/rcn/109460_en.html |archive-date=2014-11-10 }}{{cite journal|title=Beyond public health genomics: proposals from an international working group|pmid=25168910 | doi=10.1093/eurpub/cku142 | date=August 2014|journal=Eur J Public Health|volume=24|issue=6 |pages=877–879|pmc=4245010|vauthors=Boccia S, Mc Kee M, Adany R, Boffetta P, Burton H, Cambon-Thomsen A, Cornel MC, Gray M, Jani A, Knoppers BM, Khoury MJ, Meslin EM, Van Duijn CM, Villari P, Zimmern R, Cesario A, Puggina A, Colotto M, Ricciardi W}} Quality assessment schemes, health technology assessment and guidelines have to be in place. The 3Gb-TEST consortium has identified the analysis and interpretation of sequence data as the most complicated step in the diagnostic process.{{cite web|url=http://rd-connect.eu/?wysija-page=1&controller=email&action=view&email_id=13&wysijap=subscriptions|title=RD-Connect News 18 July 2014|website=Rd-connect.eu|access-date=2016-11-11|url-status=live|archive-url=https://web.archive.org/web/20161010031842/http://rd-connect.eu/?wysija-page=1&controller=email&action=view&email_id=13&wysijap=subscriptions|archive-date=10 October 2016}} At its meeting in Athens in September 2014, the consortium coined the word genotranslation for this crucial step, which leads to a so-called genoreport. Guidelines are needed to determine the required content of these reports.{{cn|date=March 2024}}

Genomes2People (G2P), an initiative of Brigham and Women's Hospital and Harvard Medical School, was created in 2011 to examine the integration of genomic sequencing into clinical care of adults and children.{{cite web|url=http://www.frontlinegenomics.com/interview/5409/genomes2people-roadmap-genomic-medicine/|title=Genomes2People: A Roadmap for Genomic Medicine|website=www.frontlinegenomics.com|access-date=29 April 2018|url-status=live|archive-url=https://web.archive.org/web/20170214004336/http://www.frontlinegenomics.com/interview/5409/genomes2people-roadmap-genomic-medicine/|archive-date=14 February 2017}} G2P's director, Robert C. Green, had previously led the REVEAL (Risk EValuation and Education for Alzheimer's Disease) study, a series of clinical trials exploring patient reactions to the knowledge of their genetic risk for Alzheimer's.{{cite web|url=http://hbhegenetics.sph.umich.edu/research-project/risk-evaluation-and-education-alzheimers-disease-study|title=The Risk Evaluation and Education for Alzheimer's Disease (REVEAL) Study – HBHE Genetics Research Group|website=hbhegenetics.sph.umich.edu|access-date=29 April 2018|url-status=live|archive-url=https://web.archive.org/web/20170929073819/http://hbhegenetics.sph.umich.edu/research-project/risk-evaluation-and-education-alzheimers-disease-study|archive-date=29 September 2017}}{{cite journal|url=https://clinicaltrials.gov/ct2/show/NCT00089882|title=Risk Evaluation and Education for Alzheimer's Disease (REVEAL) II – Full Text View – ClinicalTrials.gov|website=clinicaltrials.gov|date=22 July 2009|access-date=29 April 2018|url-status=live|archive-url=https://web.archive.org/web/20170214003156/https://clinicaltrials.gov/ct2/show/NCT00089882|archive-date=14 February 2017}}

In 2018, researchers at Rady Children's Hospital Institute for Genomic Medicine in San Diego reported that rapid whole-genome sequencing (rWGS) could diagnose genetic disorders in time to change acute medical or surgical management (clinical utility) and improve outcomes in acutely ill infants. In a retrospective cohort study of acutely ill inpatient infants at a regional children's hospital from July 2016 to March 2017, forty-two families received rWGS for etiologic diagnosis of genetic disorders. The diagnostic sensitivity was 43% (18 of 42 infants) for rWGS and 10% (4 of 42 infants) for standard genetic tests (P = .0005). The rate of clinical utility of rWGS (31%, 13 of 42 infants) was significantly greater than that of standard genetic tests (2%, 1 of 42; P = .0015). Eleven (26%) infants with diagnostic rWGS avoided morbidity, one had a 43% reduction in likelihood of mortality, and one started palliative care. In six of the eleven infants, the changes in management reduced inpatient cost by $800,000 to $2,000,000. The findings replicated a prior study of the clinical utility of rWGS in acutely ill inpatient infants, demonstrated improved outcomes and net healthcare savings, and supported consideration of rWGS as a first-tier test in this setting.{{cite journal |last1=Farnaes |first1=Lauge |last2=Hildreth |first2=Amber |last3=Sweeney |first3=Nathaly M. |last4=Clark |first4=Michelle M. |last5=Chowdhury |first5=Shimul |last6=Nahas |first6=Shareef |last7=Cakici |first7=Julie A. |last8=Benson |first8=Wendy |last9=Kaplan |first9=Robert H. |last10=Kronick |first10=Richard |last11=Bainbridge |first11=Matthew N. |last12=Friedman |first12=Jennifer |last13=Gold |first13=Jeffrey J. |last14=Ding |first14=Yan |last15=Veeraraghavan |first15=Narayanan |last16=Dimmock |first16=David |last17=Kingsmore |first17=Stephen F. 
|title=Rapid whole-genome sequencing decreases infant morbidity and cost of hospitalization |journal=npj Genomic Medicine |date=December 2018 |volume=3 |issue=1 |pages=10 |doi=10.1038/s41525-018-0049-4 |pmid=29644095 |pmc=5884823 }}
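
The percentages quoted for the rWGS cohort follow directly from the reported counts out of 42 infants. As a minimal arithmetic check, the Python sketch below recomputes them; the dictionary labels and the helper name `percent` are illustrative and do not come from the study, and the P values are not recomputed because the statistical test used is not stated here.

```python
# Counts reported for the cohort of 42 infants.
# The labels and the helper name are illustrative, not taken from the study.
TOTAL_INFANTS = 42

counts = {
    "diagnosed by rWGS": 18,
    "diagnosed by standard tests": 4,
    "clinical utility, rWGS": 13,
    "clinical utility, standard tests": 1,
    "avoided morbidity": 11,
}


def percent(count, total=TOTAL_INFANTS):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


percentages = {label: percent(n) for label, n in counts.items()}

for label, value in percentages.items():
    print(f"{label}: {counts[label]} of {TOTAL_INFANTS} = {value}%")
```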

A 2018 review of 36 publications found that the cost of whole genome sequencing ranged from $1,906{{Nbsp}}USD to $24,810{{Nbsp}}USD, and that diagnostic yield varied widely, from 17% to 73%, depending on the patient group.{{cite journal |last1=Schwarze |first1=K |last2=Buchanan |first2=J |last3=Taylor |first3=Jc |last4=Wordsworth |first4=S |title=Are whole Exome and whole Genome Sequencing Approaches Cost-Effective? A Systematic Review of the Literature |journal=Value in Health |date=May 2018 |volume=21 |pages=S100 |doi=10.1016/j.jval.2018.04.677 |doi-access=free }}

= Rare variant association study =

Whole genome sequencing studies enable the assessment of associations between complex traits and both coding and noncoding rare variants (minor allele frequency (MAF) < 1%) across the genome. Single-variant analyses typically have low power to identify associations with rare variants, so variant set tests have been proposed to jointly test the effects of given sets of multiple rare variants.{{cite journal |last1=Lee |first1=Seunggeung |last2=Abecasis |first2=Gonçalo R. |last3=Boehnke |first3=Michael |last4=Lin |first4=Xihong |title=Rare-Variant Association Analysis: Study Designs and Statistical Tests |journal=The American Journal of Human Genetics |date=July 2014 |volume=95 |issue=1 |pages=5–23 |doi=10.1016/j.ajhg.2014.06.009 |pmid=24995866 |pmc=4085641 }} SNP annotations help to prioritize rare functional variants, and incorporating these annotations can boost the power of rare variant association analysis in whole genome sequencing studies.{{cite journal |last1=Li |first1=Xihao |last2=Li |first2=Zilin |last3=Zhou |first3=Hufeng |last4=Gaynor |first4=Sheila M. |last5=Liu |first5=Yaowu |last6=Chen |first6=Han |last7=Sun |first7=Ryan |last8=Dey |first8=Rounak |last9=Arnett |first9=Donna K. |last10=Aslibekyan |first10=Stella |display-authors=3 |title=Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale |journal=Nature Genetics |date=September 2020 |volume=52 |issue=9 |pages=969–983 |doi=10.1038/s41588-020-0676-4 |pmid=32839606 |pmc=7483769 }} Some tools have been specifically developed to provide all-in-one rare variant association analysis for whole-genome sequencing data, including integration of genotype data and their functional annotations, association analysis, result summary and visualization.{{cite journal |last1=Li |first1=Zilin |last2=Li |first2=Xihao |last3=Zhou |first3=Hufeng |last4=Gaynor |first4=Sheila M. 
|last5=Selvaraj |first5=Margaret Sunitha |last6=Arapoglou |first6=Theodore |last7=Quick |first7=Corbin |last8=Liu |first8=Yaowu |last9=Chen |first9=Han |last10=Sun |first10=Ryan |last11=Dey |first11=Rounak |last12=Arnett |first12=Donna K. |last13=Auer |first13=Paul L. |last14=Bielak |first14=Lawrence F. |last15=Bis |first15=Joshua C. |last16=Blackwell |first16=Thomas W. |last17=Blangero |first17=John |last18=Boerwinkle |first18=Eric |last19=Bowden |first19=Donald W. |last20=Brody |first20=Jennifer A. |last21=Cade |first21=Brian E. |last22=Conomos |first22=Matthew P. |last23=Correa |first23=Adolfo |last24=Cupples |first24=L. Adrienne |last25=Curran |first25=Joanne E. |last26=de Vries |first26=Paul S. |last27=Duggirala |first27=Ravindranath |last28=Franceschini |first28=Nora |last29=Freedman |first29=Barry I. |last30=Göring |first30=Harald H. H. |last31=Guo |first31=Xiuqing |last32=Kalyani |first32=Rita R. |last33=Kooperberg |first33=Charles |last34=Kral |first34=Brian G. |last35=Lange |first35=Leslie A. |last36=Lin |first36=Bridget M. |last37=Manichaikul |first37=Ani |last38=Manning |first38=Alisa K. |last39=Martin |first39=Lisa W. |last40=Mathias |first40=Rasika A. |last41=Meigs |first41=James B. |last42=Mitchell |first42=Braxton D. |last43=Montasser |first43=May E. |last44=Morrison |first44=Alanna C. |last45=Naseri |first45=Take |last46=O'Connell |first46=Jeffrey R. |last47=Palmer |first47=Nicholette D. |last48=Peyser |first48=Patricia A. |last49=Psaty |first49=Bruce M. |last50=Raffield |first50=Laura M. |last51=Redline |first51=Susan |last52=Reiner |first52=Alexander P. |last53=Reupena |first53=Muagututi'a Sefuiva |last54=Rice |first54=Kenneth M. |last55=Rich |first55=Stephen S. |last56=Smith |first56=Jennifer A. |last57=Taylor |first57=Kent D. |last58=Taub |first58=Margaret A. |last59=Vasan |first59=Ramachandran S. |last60=Weeks |first60=Daniel E. |last61=Wilson |first61=James G. |last62=Yanek |first62=Lisa R. 
|last63=Zhao |first63=Wei |last64=Abe |first64=Namiko |last65=Abecasis |first65=Gonçalo |last66=Aguet |first66=Francois |last67=Albert |first67=Christine |last68=Almasy |first68=Laura |last69=Alonso |first69=Alvaro |last70=Ament |first70=Seth |last71=Anderson |first71=Peter |last72=Anugu |first72=Pramod |last73=Applebaum-Bowden |first73=Deborah |last74=Ardlie |first74=Kristin |last75=Arking |first75=Dan |last76=Ashley-Koch |first76=Allison |last77=Aslibekyan |first77=Stella |last78=Assimes |first78=Tim |last79=Avramopoulos |first79=Dimitrios |last80=Ayas |first80=Najib |last81=Balasubramanian |first81=Adithya |last82=Barnard |first82=John |last83=Barnes |first83=Kathleen |last84=Barr |first84=R. Graham |last85=Barron-Casella |first85=Emily |last86=Barwick |first86=Lucas |last87=Beaty |first87=Terri |last88=Beck |first88=Gerald |last89=Becker |first89=Diane |last90=Becker |first90=Lewis |last91=Beer |first91=Rebecca |last92=Beitelshees |first92=Amber |last93=Benjamin |first93=Emelia |last94=Benos |first94=Takis |last95=Bezerra |first95=Marcos |last96=Blue |first96=Nathan |last97=Bowler |first97=Russell |last98=Broeckel |first98=Ulrich |last99=Broome |first99=Jai |last100=Brown |first100=Deborah |last101=Bunting |first101=Karen |last102=Burchard |first102=Esteban |last103=Bustamante |first103=Carlos |last104=Buth |first104=Erin |last105=Cardwell |first105=Jonathan |last106=Carey |first106=Vincent |last107=Carrier |first107=Julie |last108=Carson |first108=April |last109=Carty |first109=Cara |last110=Casaburi |first110=Richard |last111=Casas Romero |first111=Juan P. 
|last112=Casella |first112=James |last113=Castaldi |first113=Peter |last114=Chaffin |first114=Mark |last115=Chang |first115=Christy |last116=Chang |first116=Yi-Cheng |last117=Chasman |first117=Daniel |last118=Chavan |first118=Sameer |last119=Chen |first119=Bo-Juen |last120=Chen |first120=Wei-Min |last121=Chen |first121=Yii-Der Ida |last122=Cho |first122=Michael |last123=Choi |first123=Seung Hoan |last124=Chuang |first124=Lee-Ming |last125=Chung |first125=Mina |last126=Chung |first126=Ren-Hua |last127=Clish |first127=Clary |last128=Comhair |first128=Suzy |last129=Cornell |first129=Elaine |last130=Crandall |first130=Carolyn |last131=Crapo |first131=James |last132=Curtis |first132=Jeffrey |last133=Custer |first133=Brian |last134=Damcott |first134=Coleen |last135=Darbar |first135=Dawood |last136=David |first136=Sean |last137=Davis |first137=Colleen |last138=Daya |first138=Michelle |last139=de Andrade |first139=Mariza |last140=Fuentes |first140=Lisa de las |last141=DeBaun |first141=Michael |last142=Deka |first142=Ranjan |last143=DeMeo |first143=Dawn |last144=Devine |first144=Scott |last145=Dinh |first145=Huyen |last146=Doddapaneni |first146=Harsha |last147=Duan |first147=Qing |last148=Dugan-Perez |first148=Shannon |last149=Durda |first149=Jon Peter |last150=Dutcher |first150=Susan K. |last151=Eaton |first151=Charles |last152=Ekunwe |first152=Lynette |last153=El Boueiz |first153=Adel |last154=Ellinor |first154=Patrick |last155=Emery |first155=Leslie |last156=Erzurum |first156=Serpil |last157=Farber |first157=Charles |last158=Farek |first158=Jesse |last159=Fingerlin |first159=Tasha |last160=Flickinger |first160=Matthew |last161=Fornage |first161=Myriam |last162=Frazar |first162=Chris |last163=Fu |first163=Mao |last164=Fullerton |first164=Stephanie M. 
|last165=Fulton |first165=Lucinda |last166=Gabriel |first166=Stacey |last167=Gan |first167=Weiniu |last168=Gao |first168=Shanshan |last169=Gao |first169=Yan |last170=Gass |first170=Margery |last171=Geiger |first171=Heather |last172=Gelb |first172=Bruce |last173=Geraci |first173=Mark |last174=Germer |first174=Soren |last175=Gerszten |first175=Robert |last176=Ghosh |first176=Auyon |last177=Gibbs |first177=Richard |last178=Gignoux |first178=Chris |last179=Gladwin |first179=Mark |last180=Glahn |first180=David |last181=Gogarten |first181=Stephanie |last182=Gong |first182=Da-Wei |last183=Graw |first183=Sharon |last184=Gray |first184=Kathryn J. |last185=Grine |first185=Daniel |last186=Gross |first186=Colin |last187=Gu |first187=C. Charles |last188=Guan |first188=Yue |last189=Gupta |first189=Namrata |last190=Hall |first190=Michael |last191=Han |first191=Yi |last192=Hanly |first192=Patrick |last193=Harris |first193=Daniel |last194=Hawley |first194=Nicola L. |last195=He |first195=Jiang |last196=Heavner |first196=Ben |last197=Heckbert |first197=Susan |last198=Hernandez |first198=Ryan |last199=Herrington |first199=David |last200=Hersh |first200=Craig |last201=Hidalgo |first201=Bertha |last202=Hixson |first202=James |last203=Hobbs |first203=Brian |last204=Hokanson |first204=John |last205=Hong |first205=Elliott |last206=Hoth |first206=Karin |last207=Hsiung |first207=Chao |last208=Hu |first208=Jianhong |last209=Hung |first209=Yi-Jen |last210=Huston |first210=Haley |last211=Hwu |first211=Chii Min |last212=Irvin |first212=Marguerite Ryan |last213=Jackson |first213=Rebecca |last214=Jain |first214=Deepti |last215=Jaquish |first215=Cashell |last216=Johnsen |first216=Jill |last217=Johnson |first217=Andrew |last218=Johnson |first218=Craig |last219=Johnston |first219=Rich |last220=Jones |first220=Kimberly |last221=Kang |first221=Hyun Min |last222=Kaplan |first222=Robert |last223=Kardia |first223=Sharon |last224=Kelly |first224=Shannon |last225=Kenny |first225=Eimear |last226=Kessler 
|first226=Michael |last227=Khan |first227=Alyna |last228=Khan |first228=Ziad |last229=Kim |first229=Wonji |last230=Kimoff |first230=John |last231=Kinney |first231=Greg |last232=Konkle |first232=Barbara |last233=Kramer |first233=Holly |last234=Lange |first234=Christoph |last235=Lange |first235=Ethan |last236=Laurie |first236=Cathy |last237=Laurie |first237=Cecelia |last238=LeBoff |first238=Meryl |last239=Lee |first239=Jiwon |last240=Lee |first240=Sandra |last241=Lee |first241=Wen-Jane |last242=LeFaive |first242=Jonathon |last243=Levine |first243=David |last244=Lewis |first244=Joshua |last245=Li |first245=Xiaohui |last246=Li |first246=Yun |last247=Lin |first247=Henry |last248=Lin |first248=Honghuang |last249=Liu |first249=Simin |last250=Liu |first250=Yongmei |last251=Liu |first251=Yu |last252=Loos |first252=Ruth J. F. |last253=Lubitz |first253=Steven |last254=Lunetta |first254=Kathryn |last255=Luo |first255=James |last256=Magalang |first256=Ulysses |last257=Mahaney |first257=Michael |last258=Make |first258=Barry |last259=Manson |first259=JoAnn |last260=Marton |first260=Melissa |last261=Mathai |first261=Susan |last262=May |first262=Susanne |last263=McArdle |first263=Patrick |last264=McDonald |first264=Merry-Lynn |last265=McFarland |first265=Sean |last266=McGoldrick |first266=Daniel |last267=McHugh |first267=Caitlin |last268=McNeil |first268=Becky |last269=Mei |first269=Hao |last270=Menon |first270=Vipin |last271=Mestroni |first271=Luisa |last272=Metcalf |first272=Ginger |last273=Meyers |first273=Deborah A. |last274=Mignot |first274=Emmanuel |last275=Mikulla |first275=Julie |last276=Min |first276=Nancy |last277=Minear |first277=Mollie |last278=Minster |first278=Ryan L. |last279=Moll |first279=Matt |last280=Momin |first280=Zeineen |last281=Montgomery |first281=Courtney |last282=Muzny |first282=Donna |last283=Mychaleckyj |first283=Josyf C. |last284=Nadkarni |first284=Girish |last285=Naik |first285=Rakhi |last286=Nekhai |first286=Sergei |last287=Nelson |first287=Sarah C. 
|last288=Neltner |first288=Bonnie |last289=Nessner |first289=Caitlin |last290=Nickerson |first290=Deborah |last291=Nkechinyere |first291=Osuji |last292=North |first292=Kari |last293=O'Connor |first293=Tim |last294=Ochs-Balcom |first294=Heather |last295=Okwuonu |first295=Geoffrey |last296=Pack |first296=Allan |last297=Paik |first297=David T. |last298=Pankow |first298=James |last299=Papanicolaou |first299=George |last300=Parker |first300=Cora |last301=Peralta |first301=Juan Manuel |last302=Perez |first302=Marco |last303=Perry |first303=James |last304=Peters |first304=Ulrike |last305=Phillips |first305=Lawrence S. |last306=Pleiness |first306=Jacob |last307=Pollin |first307=Toni |last308=Post |first308=Wendy |last309=Becker |first309=Julia Powers |last310=Boorgula |first310=Meher Preethi |last311=Preuss |first311=Michael |last312=Qasba |first312=Pankaj |last313=Qiao |first313=Dandi |last314=Qin |first314=Zhaohui |last315=Rafaels |first315=Nicholas |last316=Rajendran |first316=Mahitha |last317=Rao |first317=D. C. |last318=Rasmussen-Torvik |first318=Laura |last319=Ratan |first319=Aakrosh |last320=Reed |first320=Robert |last321=Reeves |first321=Catherine |last322=Regan |first322=Elizabeth |last323=Robillard |first323=Rebecca |last324=Robine |first324=Nicolas |last325=Roden |first325=Dan |last326=Roselli |first326=Carolina |last327=Ruczinski |first327=Ingo |last328=Runnels |first328=Alexi |last329=Russell |first329=Pamela |last330=Ruuska |first330=Sarah |last331=Ryan |first331=Kathleen |last332=Sabino |first332=Ester Cerdeira |last333=Saleheen |first333=Danish |last334=Salimi |first334=Shabnam |last335=Salvi |first335=Sejal |last336=Salzberg |first336=Steven |last337=Sandow |first337=Kevin |last338=Sankaran |first338=Vijay G. 
|last339=Santibanez |first339=Jireh |last340=Schwander |first340=Karen |last341=Schwartz |first341=David |last342=Sciurba |first342=Frank |last343=Seidman |first343=Christine |last344=Seidman |first344=Jonathan |last345=Sériès |first345=Frédéric |last346=Sheehan |first346=Vivien |last347=Sherman |first347=Stephanie L. |last348=Shetty |first348=Amol |last349=Shetty |first349=Aniket |last350=Sheu |first350=Wayne Hui-Heng |last351=Shoemaker |first351=M. Benjamin |last352=Silver |first352=Brian |last353=Silverman |first353=Edwin |last354=Skomro |first354=Robert |last355=Smith |first355=Albert Vernon |last356=Smith |first356=Josh |last357=Smith |first357=Nicholas |last358=Smith |first358=Tanja |last359=Smoller |first359=Sylvia |last360=Snively |first360=Beverly |last361=Snyder |first361=Michael |last362=Sofer |first362=Tamar |last363=Sotoodehnia |first363=Nona |last364=Stilp |first364=Adrienne M. |last365=Storm |first365=Garrett |last366=Streeten |first366=Elizabeth |last367=Su |first367=Jessica Lasky |last368=Sung |first368=Yun Ju |last369=Sylvia |first369=Jody |last370=Szpiro |first370=Adam |last371=Taliun |first371=Daniel |last372=Tang |first372=Hua |last373=Taub |first373=Margaret |last374=Taylor |first374=Matthew |last375=Taylor |first375=Simeon |last376=Telen |first376=Marilyn |last377=Thornton |first377=Timothy A. 
|last378=Threlkeld |first378=Machiko |last379=Tinker |first379=Lesley |last380=Tirschwell |first380=David |last381=Tishkoff |first381=Sarah |last382=Tiwari |first382=Hemant |last383=Tong |first383=Catherine |last384=Tracy |first384=Russell |last385=Tsai |first385=Michael |last386=Vaidya |first386=Dhananjay |last387=Van Den Berg |first387=David |last388=VandeHaar |first388=Peter |last389=Vrieze |first389=Scott |last390=Walker |first390=Tarik |last391=Wallace |first391=Robert |last392=Walts |first392=Avram |last393=Wang |first393=Fei Fei |last394=Wang |first394=Heming |last395=Wang |first395=Jiongming |last396=Watson |first396=Karol |last397=Watt |first397=Jennifer |last398=Weinstock |first398=Joshua |last399=Weir |first399=Bruce |last400=Weiss |first400=Scott T. |last401=Weng |first401=Lu-Chen |last402=Wessel |first402=Jennifer |last403=Williams |first403=Kayleen |last404=Williams |first404=L. Keoki |last405=Wilson |first405=Carla |last406=Winterkorn |first406=Lara |last407=Wong |first407=Quenna |last408=Wu |first408=Joseph |last409=Xu |first409=Huichun |last410=Yang |first410=Ivana |last411=Yu |first411=Ketian |last412=Zekavat |first412=Seyedeh Maryam |last413=Zhang |first413=Yingze |last414=Zhao |first414=Snow Xueyan |last415=Zhu |first415=Xiaofeng |last416=Ziv |first416=Elad |last417=Zody |first417=Michael |last418=Zoellner |first418=Sebastian |last419=Atkinson |first419=Elizabeth |last420=Ballantyne |first420=Christie |last421=Bao |first421=Wei |last422=Bhattacharya |first422=Romit |last423=Bielak |first423=Larry |last424=Bis |first424=Joshua |last425=Bodea |first425=Corneliu |last426=Brody |first426=Jennifer |last427=Cade |first427=Brian |last428=Calvo |first428=Sarah |last429=Carlson |first429=Jenna |last430=Chang |first430=I-Shou |last431=Cho |first431=So Mi |last432=de Vries |first432=Paul |last433=Diallo |first433=Ana F. 
|last434=Do |first434=Ron |last435=Dron |first435=Jacqueline |last436=Elliott |first436=Amanda |last437=Finucane |first437=Hilary |last438=Floyd |first438=Caitlin |last439=Ganna |first439=Andrea |last440=Gong |first440=Dawei |last441=Graham |first441=Sarah |last442=Haas |first442=Mary |last443=Haring |first443=Bernhard |last444=Heemann |first444=Scott |last445=Himes |first445=Blanca |last446=Jarvik |first446=Gail |last447=Jiang |first447=Jicai |last448=Joehanes |first448=Roby |last449=Joseph |first449=Paule Valery |last450=Jun |first450=Goo |last451=Kalyani |first451=Rita |last452=Kanai |first452=Masahiro |last453=Kathiresan |first453=Sekar |last454=Khera |first454=Amit |last455=Khetarpal |first455=Sumeet |last456=Klarin |first456=Derek |last457=Koyama |first457=Satoshi |last458=Kral |first458=Brian |last459=Lange |first459=Leslie |last460=Lemaitre |first460=Rozenn |last461=Li |first461=Changwei |last462=Lu |first462=Yingchang |last463=Martin |first463=Lisa |last464=Mathias |first464=Rasika |last465=Mathur |first465=Ravi |last466=McGarvey |first466=Stephen |last467=McLenithan |first467=John |last468=Miller |first468=Amy |last469=Mootha |first469=Vamsi |last470=Moran |first470=Andrew |last471=Nakao |first471=Tetsushi |last472=O'Connell |first472=Jeff |last473=O'Donnell |first473=Christopher |last474=Palmer |first474=Nicholette |last475=Paruchuri |first475=Kaavya |last476=Patel |first476=Aniruddh |last477=Peloso |first477=Gina |last478=Pettinger |first478=Mary |last479=Peyser |first479=Patricia |last480=Pirruccello |first480=James |last481=Psaty |first481=Bruce |last482=Reiner |first482=Alex |last483=Rich |first483=Stephen |last484=Rosenthal |first484=Samantha |last485=Rotter |first485=Jerome |last486=Smith |first486=Jennifer |last487=Sunyaev |first487=Shamil R. 
|last488=Surakka |first488=Ida |last489=Sztalryd |first489=Carole |last490=Trinder |first490=Mark |last491=Uddin |first491=Md Mesbah |last492=Urbut |first492=Sarah |last493=Van Buren |first493=Eric |last494=Verbanck |first494=Marie |last495=Von Holle |first495=Ann |last496=Wang |first496=Yuxuan |last497=Wiggins |first497=Kerri |last498=Wilkins |first498=John |last499=Willer |first499=Cristen |last500=Wilson |first500=James |last501=Wolford |first501=Brooke |last502=Yanek |first502=Lisa |last503=Yu |first503=Zhi |last504=Zaghloul |first504=Norann |last505=Zhang |first505=Jingwen |last506=Zhou |first506=Ying |last507=Rotter |first507=Jerome I. |last508=Willer |first508=Cristen J. |last509=Natarajan |first509=Pradeep |last510=Peloso |first510=Gina M. |last511=Lin |first511=Xihong |display-authors=3|title=A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies |journal=Nature Methods |date=December 2022 |volume=19 |issue=12 |pages=1599–1611 |doi=10.1038/s41592-022-01640-x|pmid=36303018 |pmc=10008172 |s2cid=243873361 }}{{cite journal |title=STAARpipeline: an all-in-one rare-variant tool for biobank-scale whole-genome sequencing data |journal=Nature Methods |date=December 2022 |volume=19 |issue=12 |pages=1532–1533 |doi=10.1038/s41592-022-01641-w|pmid=36316564 |s2cid=253246835 }}
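
As a rough illustration of the rare-variant threshold and of the idea behind a variant set, the Python sketch below computes minor allele frequencies from a toy genotype matrix, keeps variants with MAF below 1%, and collapses them into a per-sample count in the style of a simple burden score. This is only a sketch under stated assumptions: genotypes are coded as 0, 1 or 2 copies of the alternate allele, the alternate allele is assumed to be the minor allele, the data and the function names (`minor_allele_frequencies`, `burden_scores`) are hypothetical, and it is not the annotation-weighted method of the tools cited above.

```python
def minor_allele_frequencies(genotypes):
    """Return the MAF of each variant.

    `genotypes` is a list of per-sample rows; each entry is the number of
    alternate alleles (0, 1 or 2) carried at that variant.
    """
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])
    mafs = []
    for j in range(n_variants):
        alt_freq = sum(row[j] for row in genotypes) / (2 * n_samples)
        mafs.append(min(alt_freq, 1 - alt_freq))
    return mafs


def burden_scores(genotypes, maf_threshold=0.01):
    """Sum alternate-allele counts over rare variants for each sample.

    Assumes the alternate allele is the minor allele at every rare variant.
    Returns the per-sample scores and the indices of the rare variants.
    """
    mafs = minor_allele_frequencies(genotypes)
    rare = [j for j, maf in enumerate(mafs) if 0 < maf < maf_threshold]
    scores = [sum(row[j] for j in rare) for row in genotypes]
    return scores, rare


# Toy data (hypothetical): 100 samples, 3 variants.
genotypes = [[0, 0, 0] for _ in range(100)]
genotypes[0][0] = 1            # variant 0: a single heterozygous carrier
genotypes[5][2] = 1            # variant 2: a single heterozygous carrier
for i in range(10, 50):        # variant 1: 40 heterozygous carriers (common)
    genotypes[i][1] = 1

scores, rare_variants = burden_scores(genotypes)
print("rare variant indices:", rare_variants)
print("samples with a nonzero burden:", [i for i, s in enumerate(scores) if s > 0])
```

In an actual association analysis, such per-sample scores would then be tested against the trait, typically with covariate adjustment; that step is omitted here.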

Meta-analysis of whole genome sequencing studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Some methods have been developed to enable functionally informed rare variant association analysis in biobank-scale cohorts using efficient approaches for summary statistic storage.{{cite journal |last1=Li |first1=Xihao |last2=Quick |first2=Corbin |last3=Zhou |first3=Hufeng |last4=Gaynor |first4=Sheila M. |last5=Liu |first5=Yaowu |last6=Chen |first6=Han |last7=Selvaraj |first7=Margaret Sunitha |last8=Sun |first8=Ryan |last9=Dey |first9=Rounak |last10=Arnett |first10=Donna K.

|last56=NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium|last57=TOPMed Lipids Working Group

|last58=Rotter |first58=Jerome I. |last59=Natarajan |first59=Pradeep |last60=Peloso |first60=Gina M. |last61=Li |first61=Zilin |last62=Lin |first62=Xihong |title=Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies |journal=Nature Genetics |date=January 2023 |volume=55 |issue=1 |pages=154–164 |doi=10.1038/s41588-022-01225-6|pmid=36564505 |pmc=10084891 |s2cid=255084231 }}

=Oncology=

In oncology, whole genome sequencing makes it possible to analyze, quantify and characterize circulating tumor DNA (ctDNA) in the bloodstream, which brings both opportunities and challenges for the scientific community. This serves as a basis for early cancer diagnosis, treatment selection and relapse monitoring, as well as for determining the mechanisms of resistance, metastasis and phylogenetic patterns in the evolution of cancer. It can also help in selecting individualized treatments for cancer patients and in observing how existing drugs perform over the course of treatment. Deep whole genome sequencing involves a subclonal reconstruction based on ctDNA in plasma that allows for complete epigenomic and genomic profiling, showing the expression of circulating tumor DNA in each case.{{Cite journal |last1=Herberts |first1=Cameron |last2=Annala |first2=Matti |last3=Sipola |first3=Joonatan |last4=Ng |first4=Sarah W. S. |last5=Chen |first5=Xinyi E. |last6=Nurminen |first6=Anssi |last7=Korhonen |first7=Olga V. |last8=Munzur |first8=Aslı D. |last9=Beja |first9=Kevin |last10=Schönlau |first10=Elena |last11=Bernales |first11=Cecily Q. |last12=Ritch |first12=Elie |last13=Bacon |first13=Jack V. W. |last14=Lack |first14=Nathan A. |last15=Nykter |first15=Matti |date= August 2022 |title=Deep whole-genome ctDNA chronology of treatment-resistant prostate cancer |url=https://www.nature.com/articles/s41586-022-04975-9 |journal=Nature |language=en |volume=608 |issue=7921 |pages=199–208 |doi=10.1038/s41586-022-04975-9 |pmid=35859180 |bibcode=2022Natur.608..199H |s2cid=250730778 |issn=1476-4687|url-access=subscription }}

=Newborn screening=

In 2013, Green and a team of researchers launched the BabySeq Project to study the ethical and medical consequences of sequencing a newborn's DNA.{{cite web|url=http://www.wbur.org/news/2013/09/05/sequencing-baby-dna |title=Boston Researchers To Sequence Newborn Babies' DNA |website=wbur.org |date=2013-09-05}}{{cite journal|title=The BabySeq project: implementing genomic sequencing in newborns |journal=BMC Pediatrics |date=2018-07-09 |doi=10.1186/s12887-018-1200-1|last1=Holm |first1=Ingrid A. |last2=Agrawal |first2=Pankaj B. |last3=Ceyhan-Birsoy |first3=Ozge |last4=Christensen |first4=Kurt D. |last5=Fayer |first5=Shawn |last6=Frankel |first6=Leslie A. |last7=Genetti |first7=Casie A. |last8=Krier |first8=Joel B. |last9=Lamay |first9=Rebecca C. |last10=Levy |first10=Harvey L. |last11=McGuire |first11=Amy L. |last12=Parad |first12=Richard B. |last13=Park |first13=Peter J. |last14=Pereira |first14=Stacey |last15=Rehm |first15=Heidi L. |last16=Schwartz |first16=Talia S. |last17=Waisbren |first17=Susan E. |last18=Yu |first18=Timothy W. |last19=Green |first19=Robert C. |last20=Beggs |first20=Alan H. |volume=18 |issue=1 |page=225 |pmid=29986673 |pmc=6038274 |doi-access=free }}

The use of whole genome and exome sequencing as a newborn screening tool was debated in 2015{{Cite journal |last1=Howard |first1=Heidi Carmen |last2=Knoppers |first2=Bartha Maria |last3=Cornel |first3=Martina C. |last4=Wright Clayton |first4=Ellen |last5=Sénécal |first5=Karine |last6=Borry |first6=Pascal |date=2015-01-28 |title=Whole-genome sequencing in newborn screening? A statement on the continued importance of targeted approaches in newborn screening programmes |journal=European Journal of Human Genetics |language=en |volume=23 |issue=12 |pages=1593–1600 |doi=10.1038/ejhg.2014.289 |issn=1476-5438 |pmc=4795188 |pmid=25626707}} and discussed further in 2021.{{Cite journal |last1=Woerner |first1=Audrey C. |last2=Gallagher |first2=Renata C. |last3=Vockley |first3=Jerry |last4=Adhikari |first4=Aashish N. |date=2021-07-19 |title=The Use of Whole Genome and Exome Sequencing for Newborn Screening: Challenges and Opportunities for Population Health |journal=Frontiers in Pediatrics |volume=9 |pages=663752 |doi=10.3389/fped.2021.663752 |doi-access=free |issn=2296-2360 |pmc=8326411 |pmid=34350142}}

In 2021, the NIH funded BabySeq2, an implementation study that expanded the BabySeq project to enroll 500 infants from diverse families and track the effects of their genomic sequencing on their pediatric care.{{cite journal|url=https://jamanetwork.com/journals/jamapediatrics/article-abstract/2783320 |title=The Effect of BabySeq on Pediatric and Genomic Research—More Than Baby Steps |journal=JAMA Pediatrics |date=2021-08-23 |doi=10.1001/jamapediatrics.2021.2826|last1=Tarini |first1=Beth A. |volume=175 |issue=11 |pages=1107–1108 |pmid=34424259 |s2cid=237267536 |url-access=subscription }}

In 2023, an editorial in The Lancet argued that in the UK "focusing on improving screening by upgrading targeted gene panels might be more sensible in the short term. Whole genome sequencing in the long term deserves thorough examination and universal caution."{{Cite journal |author=The Lancet |date=2023-07-22 |title=Genomic newborn screening: current concerns and challenges |url=https://doi.org/10.1016/S0140-6736(23)01513-1 |journal=The Lancet |volume=402 |issue=10398 |pages=265 |doi=10.1016/s0140-6736(23)01513-1 |pmid=37481265 |issn=0140-6736|url-access=subscription }}

Ethical concerns

The introduction of whole genome sequencing may have ethical implications.{{cite journal|last=Sijmons|first=R.H.|author2=Van Langen, I.M|title=A clinical perspective on ethical issues in genetic testing|journal=Accountability in Research: Policies and Quality Assurance|date=2011|volume=18|issue=3|pages=148–162|doi=10.1080/08989621.2011.575033|pmid=21574071|bibcode=2013ARPQ...20..143D|s2cid=24935558}} On one hand, genetic testing can potentially diagnose preventable diseases, both in the individual undergoing genetic testing and in their relatives. On the other hand, genetic testing has potential downsides such as genetic discrimination, loss of anonymity, and psychological impacts such as discovery of non-paternity.{{cite arXiv|author=Ayday E |author2=De Cristofaro E |author3=Hubaux JP |author4=Tsudik G|title=The Chills and Thrills of Whole Genome Sequencing |date=2015 |eprint=1306.1264 |class=cs.CR }}

Some ethicists insist that the privacy of individuals undergoing genetic testing must be protected, which is of particular concern when minors undergo genetic testing.{{cite journal | doi = 10.1038/ejhg.2009.25 | last1 = Borry | first1 = Pascal | last2 = Evers-Kiebooms | first2 = Gerry | last3 = Cornel | date = 2009 | first3 = Martha C. | last4 = Clarke | first4 = Angus | last5 = Dierickx | first5 = Kris | title = Genetic testing in asymptomatic minors Background considerations towards ESHG Recommendations | journal = European Journal of Human Genetics | volume = 17 | issue = 6| pages = 711–9 | pmid = 19277061 | pmc = 2947094}} Illumina's CEO, Jay Flatley, incorrectly predicted in February 2009 that "by 2019 it will have become routine to map infants' genes when they are born".{{cite news |last=Henderson |first=Mark |url=http://www.timesonline.co.uk/tol/news/uk/science/article5689052.ece |title=Genetic mapping of babies by 2019 will transform preventive medicine |publisher=Times Online |date=2009-02-09 |access-date=2009-02-23 |location=London |url-status=dead |archive-url=https://web.archive.org/web/20090511075525/http://www.timesonline.co.uk/tol/news/uk/science/article5689052.ece |archive-date=2009-05-11 }} This potential use of genome sequencing is highly controversial, as it runs counter to ethical norms for predictive genetic testing of asymptomatic minors that are well established in the fields of medical genetics and genetic counseling.{{cite journal |author=McCabe LL |author2=McCabe ER |title=Postgenomic medicine. 
Presymptomatic testing for prediction and prevention |journal=Clin Perinatol |volume=28 |issue=2 |pages=425–34 |date=June 2001 |pmid=11499063 |doi= 10.1016/S0095-5108(05)70094-4}}{{cite journal |author=Nelson RM |author2=Botkjin JR |author3=Kodish ED |title=Ethical issues with genetic testing in pediatrics |journal=Pediatrics |volume=107 |issue=6 |pages=1451–5 |date=June 2001 |pmid=11389275 |doi= 10.1542/peds.107.6.1451|display-authors=etal|doi-access= |s2cid=9993840 }}{{cite journal |author=Borry P |author2=Fryns JP |author3=Schotsmans P |author4=Dierickx K |title=Carrier testing in minors: a systematic review of guidelines and position papers |journal=Eur. J. Hum. Genet. |volume=14 |issue=2 |pages=133–8 |date=February 2006 |pmid=16267502 |doi=10.1038/sj.ejhg.5201509 |doi-access=free }}{{cite journal |author=Borry P |author2=Stultiens L |author3=Nys H |author4=Cassiman JJ |author5=Dierickx K |title=Presymptomatic and predictive genetic testing in minors: a systematic review of guidelines and position papers |journal=Clin. Genet. |volume=70 |issue=5 |pages=374–81 |date=November 2006 |pmid=17026616 |doi=10.1111/j.1399-0004.2006.00692.x |s2cid=7066285 |url=https://lirias.kuleuven.be/handle/123456789/246877 |url-access=subscription }} The traditional guidelines for genetic testing have been developed over the course of several decades since it first became possible to test for genetic markers associated with disease, prior to the advent of cost-effective, comprehensive genetic screening.{{cn|date=March 2024}}

When an individual undergoes whole genome sequencing, they reveal information not only about their own DNA sequences but also about the probable DNA sequences of their close genetic relatives. This information can further reveal useful predictive information about relatives' present and future health risks.{{cite journal|last=McGuire|first=Amy, L|author2=Caulfield, Timothy |title=Science and Society: Research ethics and the challenge of whole-genome sequencing|journal=Nature Reviews Genetics|date=2008|volume=9|issue=2|pages=152–156|doi=10.1038/nrg2302|pmid=18087293|pmc=2225443}} Hence, there are important questions about what obligations, if any, are owed to the family members of individuals who undergo genetic testing. In Western/European society, tested individuals are usually encouraged to share important information on any genetic diagnoses with their close relatives, since the importance of the diagnosis for offspring and other close relatives is usually one of the reasons for seeking genetic testing in the first place. Nevertheless, a major ethical dilemma can develop when patients refuse to share information about a diagnosis of a serious genetic disorder that is highly preventable and that poses a high risk to relatives carrying the same disease mutation. Under such circumstances, the clinician may suspect that the relatives would rather know of the diagnosis, and hence the clinician can face a conflict of interest with respect to patient-doctor confidentiality.

Privacy concerns can also arise when whole genome sequencing is used in scientific research studies. Researchers often need to put information on patients' genotypes and phenotypes into public scientific databases, such as locus-specific databases. Although only anonymous patient data are submitted to locus-specific databases, patients might still be identifiable by their relatives when the finding is a rare disease or a rare missense mutation. Public discussion around the introduction of advanced forensic techniques (such as advanced familial searching using public DNA ancestry websites and DNA phenotyping approaches) has been limited, disjointed, and unfocused. As forensic genetics and medical genetics converge toward genome sequencing, issues surrounding genetic data become increasingly connected, and additional legal protections may need to be established.{{Cite journal|last1=Curtis|first1=Caitlin|last2=Hereward|first2=James|last3=Mangelsdorf|first3=Marie|last4=Hussey|first4=Karen|last5=Devereux|first5=John|date=18 December 2018|title=Protecting trust in medical genetics in the new era of forensics|journal=Genetics in Medicine|volume=21|issue=7|pages=1483–1485|doi=10.1038/s41436-018-0396-7|pmid=30559376|pmc=6752261}}

Public human genome sequences

= First people with public genome sequences =

The first nearly complete human genomes sequenced were two Americans of predominantly Northwestern European ancestry in 2007 (J. Craig Venter at 7.5-fold coverage,{{cite news|url=https://www.nytimes.com/2007/09/04/science/04vent.html|title=In the Genome Race, the Sequel Is Personal|last=Wade|first=Nicholas|date=September 4, 2007|work=The New York Times|access-date=February 22, 2009|url-status=live|archive-url=https://web.archive.org/web/20090411102805/http://www.nytimes.com/2007/09/04/science/04vent.html|archive-date=April 11, 2009}}{{cite journal|title=Access : All about Craig: the first 'full' genome sequence|journal=Nature|doi=10.1038/449006a|pmid=17805257|volume=449|issue=7158|pages=6–7|bibcode=2007Natur.449....6L|year=2007|last1=Ledford|first1=Heidi|doi-access=free}}{{cite journal|date=September 2007|title=The diploid genome sequence of an individual human|journal=PLOS Biol.|volume=5|issue=10|pages=e254|doi=10.1371/journal.pbio.0050254 |doi-access=free |pmc=1964779|pmid=17803354|vauthors=Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N, Huang J, Kirkness EF, Denisov G, Lin Y, MacDonald JR, Pang AW, Shago M, Stockwell TB, Tsiamouri A, Bafna V, Bansal V, Kravitz SA, Busam DA, Beeson KY, McIntosh TC, Remington KA, Abril JF, Gill J, Borman J, Rogers YH, Frazier ME, Scherer SW, Strausberg RL, Venter JC}} and James Watson at 7.4-fold).{{cite web|url=http://www.iht.com/articles/2007/06/01/america/dna.php|title=DNA pioneer Watson gets own genome map|last=Wade|first=Wade|date=June 1, 2007|publisher=International Herald Tribune|access-date=February 22, 2009|url-status=dead|archive-url=https://web.archive.org/web/20080927034838/http://www.iht.com/articles/2007/06/01/america/dna.php|archive-date=September 27, 2008}}{{cite news|url=https://www.nytimes.com/2007/05/31/science/31cnd-gene.html|title=Genome of DNA Pioneer Is Deciphered|last=Wade|first=Nicholas|date=May 31, 2007|work=The New York Times|access-date=February 21, 
2009|url-status=live|archive-url=https://web.archive.org/web/20110620031137/http://www.nytimes.com/2007/05/31/science/31cnd-gene.html|archive-date=June 20, 2011}}{{cite journal|date=2008|title=The complete genome of an individual by massively parallel DNA sequencing|journal=Nature|volume=452|issue=7189|pages=872–6|bibcode=2008Natur.452..872W|doi=10.1038/nature06884|pmid=18421352|author=Wheeler DA|author2=Srinivasan M|author3=Egholm M|author4=Shen Y|author5=Chen L|author6=McGuire A|author7=He W|author8=Chen YJ|author9=Makhijani V|author10=Roth GT|author11=Gomes X|author12=Tartaro K|author13=Niazi F|author14=Turcotte CL|author15=Irzyk GP|author16=Lupski JR|author17=Chinault C|author18=Song XZ|author19=Liu Y|author20=Yuan Y|author21=Nazareth L|author22=Qin X|author23=Muzny DM|author24=Margulies M|author25=Weinstock GM|author26=Gibbs RA|author27=Rothberg JM|doi-access=free}} This was followed in 2008 by sequencing of an anonymous Han Chinese man (at 36-fold),{{cite journal|last2=Wang|first2=Wei|last3=Li|first3=Ruiqiang|last4=Li|first4=Yingrui|last5=Tian|first5=Geng|last6=Goodman|first6=Laurie|last7=Fan|first7=Wei|last8=Zhang|first8=Junqing|last9=Li|first9=Jun|date=2008|title=The diploid genome sequence of an Asian individual|journal=Nature|volume=456|issue=7218|pages=60–65|bibcode=2008Natur.456...60W|doi=10.1038/nature07484|pmc=2716080|pmid=18987735|author=Wang J|first10=Juanbin|first11=Yiran|first12=Binxiao|first13=Heng|first14=Yao|first15=Xiaodong|first16=Huiqing|first17=Zhenglin|first18=Dong|first19=Yiqing|first20=Yujie|first21=Zhenzhen|first22=Hancheng|first23=Ines|first24=Michael|first25=John|first26=Xin|first27=Jing|first28=Jinjie|first29=Yan|first30=Junjie|last10=Zhang, Juanbin|last11=Guo, Yiran|last12=Feng, Binxiao|last13=Li, Heng|last14=Lu, Yao|last15=Fang, Xiaodong|last16=Liang, Huiqing|last17=Du, Zhenglin|last18=Li, Dong|last19=Zhao, Yiqing|last20=Hu, Yujie|last21=Yang, Zhenzhen|last22=Zheng, Hancheng|last23=Hellmann, Ines|last24=Inouye, Michael|last25=Pool, 
John|last26=Yi, Xin|last27=Zhao, Jing|last28=Duan, Jinjie|last29=Zhou, Yan|last30=Qin, Junjie|display-authors=29}} a Yoruban man from Nigeria (at 30-fold),{{cite journal|date=2008|title=Accurate whole human genome sequencing using reversible terminator chemistry|journal=Nature|volume=456|issue=7218|pages=53–9|bibcode=2008Natur.456...53B|doi=10.1038/nature07517|pmc=2581791|pmid=18987734|author=Bentley DR|author2=Balasubramanian S|display-authors=etal}} a female clinical geneticist (Marjolein Kriek) from the Netherlands (at 7 to 8-fold), and a female leukemia patient in her mid-50s (at 33 and 14-fold coverage for tumor and normal tissues).{{cite journal|date=2008|title=DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome|journal=Nature|volume=456|issue=7218|pages=66–72|bibcode=2008Natur.456...66L|doi=10.1038/nature07485|pmc=2603574|pmid=18987736|author=Ley TJ|author2=Mardis ER|author3=Ding L|author4=Fulton B|author5=McLellan MD|author6=Chen K|author7=Dooling D|author8=Dunford-Shore BH|author9=McGrath S|author10=Hickenbotham M|author11=Cook L|author12=Abbott R|author13=Larson DE|author14=Koboldt DC|author15=Pohl C|author16=Smith S|author17=Hawkins A|author18=Abbott S|author19=Locke D|author20=Hillier LW|author21=Miner T|author22=Fulton L|author23=Magrini V|author24=Wylie T|author25=Glasscock J|author26=Conyers J|author27=Sander N|author28=Shi X|author29=Osborne JR|author30=Minx P|author31=Gordon D|author32=Chinwalla A|author33=Zhao Y|author34=Ries RE|author35=Payton JE|author36=Westervelt P|author37=Tomasson MH|author38=Watson M|author39=Baty J|author40=Ivanovich J|author41=Heath S|author42=Shannon WD|author43=Nagarajan R|author44=Walter MJ|author45=Link DC|author46=Graubert TA|author47=DiPersio JF|author48=Wilson RK|display-authors=29}} Steve Jobs was among the first 20 people to have their whole genome sequenced, reportedly for the cost of $100,000.{{cite 
news|url=https://www.nytimes.com/2011/10/21/technology/book-offers-new-details-of-jobs-cancer-fight.html|title=New Book Details Jobs's Fight Against Cancer|last=Lohr|first=Steve|date=2011-10-20|work=The New York Times|url-status=live|archive-url=https://web.archive.org/web/20170928010308/http://www.nytimes.com/2011/10/21/technology/book-offers-new-details-of-jobs-cancer-fight.html?_r=1&hp|archive-date=2017-09-28}} {{As of|2012|06}}, there were 69 nearly complete human genomes publicly available.{{cite web|url=http://www.completegenomics.com/news-events/press-releases/archive/Complete-Genomics-Adds-29-High-Coverage-Complete-Human-Genome-Sequencing-Datasets-to-its-Public-Genomic-Repository--119298369.html|title=Complete Human Genome Sequencing Datasets to its Public Genomic Repository|url-status=dead|archive-url=https://web.archive.org/web/20120610192353/http://www.completegenomics.com/news-events/press-releases/archive/Complete-Genomics-Adds-29-High-Coverage-Complete-Human-Genome-Sequencing-Datasets-to-its-Public-Genomic-Repository--119298369.html|archive-date=June 10, 2012}} In November 2013, a Spanish family made their personal genomics data publicly available under a Creative Commons public domain license. The work was led by Manuel Corpas and the data obtained by direct-to-consumer genetic testing with 23andMe and the Beijing Genomics Institute. This is believed to be the first such Public Genomics dataset for a whole family.{{cite bioRxiv | last1 = Corpas | first1 = Manuel | last2 = Cariaso | first2 = Mike | last3 = Coletta | first3 = Alain | last4 = Weiss | first4 = David | last5 = Harrison | first5 = Andrew P | last6 = Moran | first6 = Federico | last7 = Yang | first7 = Huanming | title = A Complete Public Domain Family Genomics Dataset | date = November 12, 2013 | biorxiv = 10.1101/000216 | name-list-style = vanc }}

= Databases =

According to Science, the major databases of whole genomes are:

{| class="wikitable"
! Biobank
! Completed whole genomes
! Release/access information
|-
| UK Biobank
| 500,000
| The first 200,000 genomes were made available through a Web platform in November 2021; the remaining 300,000 were released in November 2023, making it the largest public dataset of whole genomes. The genomes are linked to anonymized medical information and are made more accessible for biomedical research than prior, less comprehensive datasets.{{cite news |title=200,000 whole genomes made available for biomedical studies by U.K. effort |url=https://www.science.org/content/article/200-000-whole-genomes-made-available-biomedical-studies-uk-effort |access-date=11 December 2021 |work=www.science.org |language=en}}{{cite web |title=Whole Genome Sequencing data on 200,000 UK Biobank participants available now |url=https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/news/whole-genome-sequencing-data-on-200-000-uk-biobank-participants-available-now |website=www.ukbiobank.ac.uk |date=17 November 2021 |access-date=11 December 2021}}{{Cite journal |last=Callaway |first=Ewen |date=2023-11-30 |title=World's biggest set of human genome sequences opens to scientists |url=https://www.nature.com/articles/d41586-023-03763-3 |journal=Nature |language=en |volume=624 |issue=7990 |pages=16–17 |doi=10.1038/d41586-023-03763-3|pmid=38036674 |url-access=subscription }}
|-
| Trans-Omics for Precision Medicine
| 161,000
| National Institutes of Health (NIH) requires project-specific consent
|-
| Million Veteran Program
| 125,000
| Non–Veterans Affairs researchers get access in 2022
|-
| Genomics England's 100,000 Genomes
| 120,000
| Researchers must join collaboration
|-
| All of Us
| 90,000
| NIH expects to release by early 2022
|}

Genomic coverage

In terms of genomic coverage and accuracy, whole genome sequencing can broadly be classified into either of the following:{{cite web|url=https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost|title=The Cost of Sequencing a Human Genome|website=National Human Genome Research Institute|author=Kris A. Wetterstrand, M.S.}} Last updated: November 1, 2021

A draft sequence, covering approximately 90% of the genome at approximately 99.9% accuracy
A finished sequence, covering more than 95% of the genome at approximately 99.99% accuracy

Producing a truly high-quality finished sequence by this definition is very expensive. Thus, most human "whole genome sequencing" results are draft sequences, with accuracy sometimes above and sometimes below the draft threshold given above.
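
To give a sense of scale for these thresholds, the following is a minimal illustrative sketch of the arithmetic, not a method taken from the cited source. It assumes a round haploid human genome size of 3.1 billion base pairs and treats the stated coverage and accuracy figures as exact; the function name <code>summarize</code> and the constant <code>GENOME_SIZE_BP</code> are hypothetical. For the finished category, 95% is used as the lower bound of "more than 95%".

```python
from fractions import Fraction

# Assumption: a round 3.1 billion base pairs for one haploid human genome.
GENOME_SIZE_BP = 3_100_000_000


def summarize(breadth: str, accuracy: str, genome_size: int = GENOME_SIZE_BP) -> dict:
    """Return covered bases, uncovered bases and expected erroneous bases.

    breadth and accuracy are decimal strings (e.g. "0.90", "0.999") so that
    Fraction represents them exactly and no floating-point rounding occurs.
    """
    covered = genome_size * Fraction(breadth)       # bases included in the sequence
    errors = covered * (1 - Fraction(accuracy))     # covered bases expected to be wrong
    return {
        "covered_bp": int(covered),
        "uncovered_bp": genome_size - int(covered),
        "expected_errors": int(errors),
    }


# Draft: about 90% of the genome at about 99.9% accuracy.
draft = summarize("0.90", "0.999")
# Finished: at least 95% of the genome at about 99.99% accuracy.
finished = summarize("0.95", "0.9999")

print("draft:", draft)
print("finished:", finished)
```

Under these assumptions, a draft sequence leaves about 310 million bases uncovered and contains on the order of 2.8 million erroneous bases, while a finished sequence at the 95% lower bound leaves about 155 million bases uncovered and contains roughly 0.3 million errors. Note that "coverage" here means the fraction of the genome represented, which is distinct from the "fold" coverage (sequencing depth) quoted for individual genomes.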

References

External links

[https://web.archive.org/web/20081203165136/http://jimwatsonsequence.cshl.edu/cgi-perl/gbrowse/jwsequence/ James Watson's Personal Genome Sequence]
[https://web.archive.org/web/20100313132651/http://www.sciencemag.org/products/posters/SequencingPoster.pdf AAAS/Science: Genome Sequencing Poster]

Category:Molecular biology

Category:DNA sequencing

Category:Biotechnology

Category:Genomics

Category:Bioinformatics

Category:DNA

Category:Gene tests

Category:Molecular genetics