Massive parallel sequencing

{{Short description|DNA sequencing using the concept of massively parallel processing}}

Massive parallel sequencing or massively parallel sequencing is any of several high-throughput approaches to DNA sequencing using the concept of massively parallel processing; it is also called next-generation sequencing (NGS) or second-generation sequencing. Some of these technologies emerged between 1993 and 1998{{Cite journal |last1=Nyren |first1=P. |last2=Pettersson |first2=B. |last3=Uhlen |first3=M. |date=January 1993 |title=Solid Phase DNA Minisequencing by an Enzymatic Luminometric Inorganic Pyrophosphate Detection Assay |url=https://linkinghub.elsevier.com/retrieve/pii/S0003269783710249 |journal=Analytical Biochemistry |language=en |volume=208 |issue=1 |pages=171–175 |doi=10.1006/abio.1993.1024|pmid=8382019 |url-access=subscription }}{{cite journal |name-list-style=amp |vauthors=Ronaghi M, Karamohamed S, Pettersson B, Uhlén M, Nyrén P |date=November 1996 |title=Real-time DNA sequencing using detection of pyrophosphate release |journal=Analytical Biochemistry |volume=242 |issue=1 |pages=84–89 |doi=10.1006/abio.1996.0432 |pmid=8923969}}{{cite patent|country=EP|number=0972081|title=Method of nucleic acid amplification|pubdate=2007-06-13|assign1=Solexa Ltd. | inventor = Farinelli L, Kawashima E, Mayer P }}{{cite patent|country=EP|number=0975802|title=Method of nucleic acid sequencing|pubdate=2004-06-23 | inventor = Kawashima E, Farinelli L, Mayer P }} and have been commercially available since 2005. These technologies use miniaturized and parallelized platforms to sequence 1 million to 43 billion short reads (50 to 400 bases each) per instrument run.

Many NGS platforms differ in engineering configurations and sequencing chemistry. They share the technical paradigm of massive parallel sequencing via spatially separated, clonally amplified DNA templates or single DNA molecules in a flow cell. This design is very different from that of Sanger sequencing—also known as capillary sequencing or first-generation sequencing—which is based on electrophoretic separation of chain-termination products produced in individual sequencing reactions.{{cite journal | vauthors = Voelkerding KV, Dames SA, Durtschi JD | title = Next-generation sequencing: from basic research to diagnostics | journal = Clinical Chemistry | volume = 55 | issue = 4 | pages = 641–658 | date = April 2009 | pmid = 19246620 | doi = 10.1373/clinchem.2008.112789 | name-list-style = amp | doi-access = free }} This methodology allows sequencing to be completed on a larger scale.{{cite journal | vauthors = Ballard D, Winkler-Galicki J, Wesoły J | title = Massive parallel sequencing in forensics: advantages, issues, technicalities, and prospects | journal = International Journal of Legal Medicine | volume = 134 | issue = 4 | pages = 1291–1303 | date = July 2020 | pmid = 32451905 | pmc = 7295846 | doi = 10.1007/s00414-020-02294-0 }}

History

In the 1990s, Applied Biosystems dominated DNA sequencing technology with its automated capillary electrophoresis Sanger sequencing machines. However, the early 2000s saw many new companies entering the market, driven by the goal of reducing genome sequencing costs below $1000 following the enthusiasm generated by the Human Genome Project.{{cite journal | vauthors = Giani AM, Gallo GR, Gianfranceschi L, Formenti G | title = Long walk to genomics: History and current approaches to genome sequencing and assembly | journal = Computational and Structural Biotechnology Journal | volume = 18 | pages = 9–19 | date = 2020 | doi = 10.1016/j.csbj.2019.11.002 | pmid = 31890139 | pmc = 6926122 }} Many of these new methods were first developed with funding from the National Institutes of Health (NIH) under the 'Technology Development for the $1,000 Genome' program,{{Cite web | title= Program Announcement Concept: Technology Development for the $1000 Genome | url= http://www.genome.gov/11008124#al-4 | access-date= April 1, 2025 | archive-url= https://web.archive.org/web/20250301060733/https://www.genome.gov/11008124/concept-papers-for-two-DNA-sequencing-technology-development-programs-2003#al-4 | archive-date= March 1, 2025 }} launched during Francis Collins’ tenure as director of the National Human Genome Research Institute.{{cite journal | last= Mardis|first= E. | title= A decade’s perspective on DNA sequencing technology |journal= Nature | volume = 470 | pages = 198–203 | date = 2011 | doi=10.1038/nature09796 }}

The first next-generation sequencers were based on pyrosequencing, originally developed by Pyrosequencing AB and later commercialized by 454 Life Sciences. In 2003, 454 Life Sciences launched the GS20, the first NGS DNA sequencer. This system provided reads approximately 400–500 bp long with 99% accuracy, enabling sequencing of about 25 million bases in a four-hour run at significantly lower costs compared to Sanger sequencing.{{cite journal | vauthors = Margulies M, Egholm M, Altman W et al. | title = Genome sequencing in microfabricated high-density picolitre reactors | journal = Nature | volume = 437 | pages = 376–380 | date = 2005 | issue = 7057 | doi = 10.1038/nature03959 | pmid = 16056220 | bibcode = 2005Natur.437..376M | pmc = 1464427 }} The sequencing machines developed by 454 represented a paradigm shift by enabling the mass parallelisation of sequencing reactions, which significantly boosted the amount of DNA sequenced per run, making 454 Life Sciences the first major success in commercial NGS technology.{{cite journal | first1=James M. |last1=Heather|first2=Benjamin |last2=Chain | title = The sequence of sequencers: The history of sequencing DNA | journal = Genomics | volume = 107 | issue=1 | pages = 1–8 | date = 2016 | doi = 10.1016/j.ygeno.2015.11.003 |pmid=26554401 |pmc=4727787 }}

Also in 2003, Solexa began developing a competing method known as Sequencing by Synthesis (SBS). In 2004, Solexa acquired colony sequencing (bridge amplification) technology from Manteia, producing densely clustered DNA fragments ("polonies") immobilized on flow cells. These dense clusters generated stronger fluorescent signals, improving accuracy and reducing optical costs. In 2005, Solexa integrated an engineered DNA polymerase and reversible terminator nucleotides, allowing repeated cycles of sequencing and imaging. The first commercial sequencer based on this technology, the Genome Analyzer, was launched in 2006, providing shorter reads (about 35 bp) but higher throughput (up to 1 Gbp per run) and paired-end sequencing capability (i.e. each DNA fragment could be sequenced from both ends).{{cite journal | vauthors = Bentley D, Balasubramanian S, Swerdlow H et al. | title = Accurate whole human genome sequencing using reversible terminator chemistry | journal = Nature | volume = 456 | pages = 53–59 | date = 2008 | issue = 7218 | doi = 10.1038/nature07517 | pmid = 18987734 | bibcode = 2008Natur.456...53B | pmc = 2581791 }}

In 2007, 454 Life Sciences was acquired by Roche{{cite journal |last=Check Hayden | first=E. | title=Roche chases stake in medical sequencing | journal = Nature | volume = 484 | pages = 152 | date = 2012 | doi = 10.1038/484152a }} and Solexa by Illumina;{{cite journal |first=Shankar | last=Balasubramanian| title=Solexa Sequencing: Decoding Genomes on a Population Scale | journal = Clinical Chemistry | volume = 61 | issue = 1 | date = 2015 | pages = 21–24 | doi = 10.1373/clinchem.2014.221747 }} that same year, Applied Biosystems introduced SOLiD, a ligation-based sequencing platform.{{cite journal | vauthors = McKernan KJ, Peckham HE, Costa GL et al. | title = Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding | journal = Genome Res. | volume = 19 | pages = 1527–1541 | date = 2009 | issue = 9 | doi = 10.1101/gr.091868.109 | pmid = 19546169 | pmc = 2752135 }} However, SOLiD encountered issues sequencing palindromic regions{{cite journal | vauthors = Huang YF, Chen SC, Chiang YS et al. | title = Palindromic sequence impedes sequencing-by-ligation mechanism | journal = BMC Syst Biol | volume = 6 |issue=Suppl 2 | pages = S10 | date = 2012 | doi = 10.1186/1752-0509-6-S2-S10 | doi-access = free | pmid = 23281822 | pmc = 3521181 }} and was eventually discontinued. In 2011, Ion Torrent introduced another alternative, measuring proton (pH) changes during nucleotide incorporation using semiconductor-based sensors.{{cite journal | vauthors = Rothberg JM, Hinz H, Rearick TM et al. | title = An integrated semiconductor device enabling non-optical genome sequencing | journal = Nature | volume = 475 | pages = 348–352 | date = 2011 | issue = 7356 | doi = 10.1038/nature10242 | pmid = 21776081 | doi-access = free }} Ion Torrent systems rapidly produced 100 bp reads but frequently struggled with accurately sequencing homopolymers,{{cite journal | vauthors = Loman NJ, Misra RV, Dallman TJ et al. | title = Performance comparison of benchtop high-throughput sequencing platforms | journal = Nat Biotechnol | volume = 30 | pages = 434–439 | date = 2012 | issue = 5 | doi = 10.1038/nbt.2198 | pmid = 22522955 }} ultimately leading to their abandonment.

Due to limitations in competing methods, Illumina’s SBS technology eventually dominated the sequencing market. By 2012, expectations that 454 would gain a substantial share of the sequencing market had not been realized, and Roche’s 2007 acquisition was increasingly viewed as underperforming; that same year, Roche made an unsuccessful attempt to acquire Illumina. In October 2013, Roche announced that it would shut down 454 and stop supporting the platform by mid-2016.{{cite web|url=http://www.genomeweb.com/sequencing/following-roches-decision-shut-down-454-customers-make-plans-move-other-platform|title=Following Roche's Decision to Shut Down 454, Customers Make Plans to Move to Other Platforms|date=October 22, 2013 }} By 2014, Illumina controlled approximately 70% of DNA sequencer sales and generated over 90% of sequencing data.{{cite news|last1=Zimmerman|first1=Eilene|title=50 Smartest Companies: Illumina|url=http://www.technologyreview.com/featuredstory/524531/why-illumina-is-no-1/|access-date=25 August 2014|work=MIT Technology Review|publisher=Massachusetts Institute of Technology|date=18 February 2014|archive-date=11 December 2015|archive-url=https://web.archive.org/web/20151211020908/http://www.technologyreview.com/featuredstory/524531/why-illumina-is-no-1/|url-status=dead}}{{cite web|last1=Regalado|first1=Antonio|title=EmTech: Illumina Says 228,000 Human Genomes Will Be Sequenced This Year|url=http://www.technologyreview.com/news/531091/emtech-illumina-says-228000-human-genomes-will-be-sequenced-this-year/|website=MIT Technology Review|publisher=Massachusetts Institute of Technology|access-date=26 September 2014|archive-date=26 December 2015|archive-url=https://web.archive.org/web/20151226182622/http://www.technologyreview.com/news/531091/emtech-illumina-says-228000-human-genomes-will-be-sequenced-this-year/|url-status=dead}} That year, Illumina introduced the HiSeq X Ten platform, significantly increasing throughput and claiming the long-targeted goal of sequencing human genomes at roughly $1000 each.{{cite magazine |last=Clark |first=Liat |date=15 Jan 2014 |title=Illumina announces landmark $1,000 human genome sequencing |url=https://www.wired.co.uk/article/1000-dollar-genome |magazine=Wired |access-date=4 Nov 2019 |archive-date=4 November 2019 |archive-url=https://web.archive.org/web/20191104155904/https://www.wired.co.uk/article/1000-dollar-genome |url-status=live }} Illumina surpassed this milestone in 2017 with the release of NovaSeq, a system capable of generating over 3000 Gbp per run.

NGS platforms

DNA sequencing with commercially available NGS platforms is generally conducted with the following steps. First, DNA sequencing libraries are generated by clonal amplification by PCR in vitro. Second, the DNA is sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through chain-termination chemistry. Third, the spatially segregated, amplified DNA templates are sequenced simultaneously in a massively parallel fashion without the requirement for a physical separation step. These steps are followed in most NGS platforms, but each utilizes a different strategy.{{cite journal | vauthors = Anderson MW, Schrijver I | title = Next generation DNA sequencing and the future of genomic medicine | journal = Genes | volume = 1 | issue = 1 | pages = 38–69 | date = May 2010 | pmid = 24710010 | pmc = 3960862 | doi = 10.3390/genes1010038 | doi-access = free }}

NGS parallelization of the sequencing reactions generates hundreds of megabases to gigabases of nucleotide sequence reads in a single instrument run. This has enabled a drastic increase in available sequence data and fundamentally changed genome sequencing approaches in the biomedical sciences.{{cite journal | vauthors = Tucker T, Marra M, Friedman JM | title = Massively parallel sequencing: the next big thing in genetic medicine | journal = American Journal of Human Genetics | volume = 85 | issue = 2 | pages = 142–154 | date = August 2009 | pmid = 19679224 | pmc = 2725244 | doi = 10.1016/j.ajhg.2009.06.022 | name-list-style = amp }}

Newly emerging NGS technologies and instruments have further contributed to a significant decrease in sequencing costs, which are approaching $1000 per genome.{{cite journal | vauthors = von Bubnoff A | title = Next-generation sequencing: the race is on | journal = Cell | volume = 132 | issue = 5 | pages = 721–723 | date = March 2008 | pmid = 18329356 | doi = 10.1016/j.cell.2008.02.028 | s2cid = 8413828 | doi-access = free }}{{cite web|url=http://www.genome.gov/27527585 |title=2008 Release: NHGRI Seeks DNA Sequencing Technologies Fit for Routine Laboratory and Medical Use |publisher=Genome.gov |access-date=2012-08-05}}

As of 2014, massively parallel sequencing platforms are commercially available and their features are summarized in the table. As the pace of NGS technologies is advancing rapidly, technical specifications and pricing are in flux.

[[File:HiSeq 2000.JPG|HiSeq 2000 sequencing machine]]

{| class="wikitable" style="text-align:center;"
|+ NGS platforms
|-
! style="width:10em" | Platform
! style="width:10em" | Template preparation
! style="width:10em" | Chemistry
! style="width:10em" | Max read length (bases)
! style="width:10em" | Run times (days)
! style="width:10em" | Max Gb per run
|-
| Roche 454
| Clonal-emPCR
| Pyrosequencing
| 400‡
| 0.42
| 0.40-0.60
|-
| GS FLX Titanium
| Clonal-emPCR
| Pyrosequencing
| 400‡
| 0.42
| 0.035
|-
| Illumina MiSeq
| Clonal Bridge Amplification
| Reversible Dye Terminator
| 2x300
| 0.17-2.7
| 15
|-
| Illumina HiSeq
| Clonal Bridge Amplification
| Reversible Dye Terminator
| 2x150
| 0.3-11{{Cite web |url=http://systems.illumina.com/systems/hiseq_2500_1500/performance_specifications.html |title=Specifications for HiSeq 2500 |access-date=2014-11-06 |archive-url=https://web.archive.org/web/20141206044410/http://systems.illumina.com/systems/hiseq_2500_1500/performance_specifications.html |archive-date=2014-12-06 |url-status=dead }}
| 1000{{cite web |url=http://genomics.ed.ac.uk/blog/hiseq-v4-here-and-it-delivers |title=HiSeq v4 is here… and it delivers | Edinburgh Genomics |access-date=2014-11-06 |url-status=dead |archive-url=https://web.archive.org/web/20141106114253/http://genomics.ed.ac.uk/blog/hiseq-v4-here-and-it-delivers |archive-date=2014-11-06 }}
|-
| Illumina Genome Analyzer IIX
| Clonal Bridge Amplification
| Reversible Dye Terminator
| 2x150
| 2-14
| 95
|-
| Life Technologies SOLiD4
| Clonal-emPCR
| Oligonucleotide 8-mer Chained Ligation{{cite journal | vauthors = McKernan KJ, Peckham HE, Costa GL, McLaughlin SF, Fu Y, Tsung EF, Clouser CR, Duncan C, Ichikawa JK, Lee CC, Zhang Z, Ranade SS, Dimalanta ET, Hyland FC, Sokolsky TD, Zhang L, Sheridan A, Fu H, Hendrickson CL, Li B, Kotler L, Stuart JR, Malek JA, Manning JM, Antipova AA, Perez DS, Moore MP, Hayashibara KC, Lyons MR, Beaudoin RE, Coleman BE, Laptewicz MW, Sannicandro AE, Rhodes MD, Gottimukkala RK, Yang S, Bafna V, Bashir A, MacBride A, Alkan C, Kidd JM, Eichler EE, Reese MG, De La Vega FM, Blanchard AP | display-authors = 6 | title = Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding | journal = Genome Research | volume = 19 | issue = 9 | pages = 1527–1541 | date = September 2009 | pmid = 19546169 | pmc = 2752135 | doi = 10.1101/gr.091868.109 }}
| 20-45
| 4-7
| 35-50
|-
| Life Technologies Ion Proton{{cite web | title= Ion Torrent | url= http://www.allseq.com/knowledgebank/sequencing-platforms/life-technologies-ion-torrent/ | access-date= 1 Jan 2014 | archive-url= https://web.archive.org/web/20131230044834/http://www.allseq.com/knowledgebank/sequencing-platforms/life-technologies-ion-torrent | archive-date= 30 December 2013 | url-status= dead }}
| Clonal-emPCR
| Native dNTPs, proton detection
| 200
| 0.5
| 100
|-
| Complete Genomics
| Gridded DNA-nanoballs
| Oligonucleotide 9-mer Unchained Ligation{{cite journal | vauthors = Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, Carnevali P, Nazarenko I, Nilsen GB, Yeung G, Dahl F, Fernandez A, Staker B, Pant KP, Baccash J, Borcherding AP, Brownley A, Cedeno R, Chen L, Chernikoff D, Cheung A, Chirita R, Curson B, Ebert JC, Hacker CR, Hartlage R, Hauser B, Huang S, Jiang Y, Karpinchyk V, Koenig M, Kong C, Landers T, Le C, Liu J, McBride CE, Morenzoni M, Morey RE, Mutch K, Perazich H, Perry K, Peters BA, Peterson J, Pethiyagoda CL, Pothuraju K, Richter C, Rosenbaum AM, Roy S, Shafto J, Sharanhovich U, Shannon KW, Sheppy CG, Sun M, Thakuria JV, Tran A, Vu D, Zaranek AW, Wu X, Drmanac S, Oliphant AR, Banyai WC, Martin B, Ballinger DG, Church GM, Reid CA | display-authors = 6 | title = Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays | journal = Science | volume = 327 | issue = 5961 | pages = 78–81 | date = January 2010 | pmid = 19892942 | doi = 10.1126/science.1181498 | s2cid = 17309571 | bibcode = 2010Sci...327...78D | doi-access = free }}{{cite journal | vauthors = Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, Wang MD, Zhang K, Mitra RD, Church GM | display-authors = 6 | title = Accurate multiplex polony sequencing of an evolved bacterial genome | journal = Science | volume = 309 | issue = 5741 | pages = 1728–1732 | date = September 2005 | pmid = 16081699 | doi = 10.1126/science.1117389 | s2cid = 11405973 | bibcode = 2005Sci...309.1728S | doi-access = free }}{{cite journal | vauthors = Peters BA, Kermani BG, Sparks AB, Alferov O, Hong P, Alexeev A, Jiang Y, Dahl F, Tang YT, Haas J, Robasky K, Zaranek AW, Lee JH, Ball MP, Peterson JE, Perazich H, Yeung G, Liu J, Chen L, Kennemer MI, Pothuraju K, Konvicka K, Tsoupko-Sitnikov M, Pant KP, Ebert JC, Nilsen GB, Baccash J, Halpern AL, Church GM, Drmanac R | display-authors = 6 | title = Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells | journal = Nature | volume = 487 | issue = 7406 | pages = 190–195 | date = July 2012 | pmid = 22785314 | pmc = 3397394 | doi = 10.1038/nature11236 | bibcode = 2012Natur.487..190P }}
| 7x10
| 11
| 3000
|-
| Helicos Biosciences Heliscope
| Single Molecule
| Reversible Dye Terminator
| 35‡
| 8
| 25
|-
| Pacific Biosciences SMRT
| Single Molecule
| Phospholinked Fluorescent Nucleotides
| 10,000 (N50); 30,000+ (max){{Cite press release|url=https://www.globenewswire.com/news-release/2013/10/03/577891/16261/en/Pacific-Biosciences-Introduces-New-Chemistry-With-Longer-Read-Lengths-to-Detect-Novel-Features-in-DNA-Sequence-and-Advance-Genome-Studies-of-Large-Organisms.html|title=Pacific Biosciences Introduces New Chemistry With Longer Read Lengths to Detect Novel Features in DNA Sequence and Advance Genome Studies of Large Organisms|first=Pacific Biosciences of California|last=Inc|date=October 3, 2013|website=GlobeNewswire News Room}}
| 0.08
| 0.5{{cite web | vauthors = Nederbragt L | url=http://flxlexblog.wordpress.com/2013/07/05/de-novo-bacterial-genome-assembly-a-solved-problem/ | title=De novo bacterial genome assembly: a solved problem?| date=2013-07-05 }}
|}

Run times and gigabase (Gb) output per run for single-end sequencing are noted. Run times and outputs approximately double when performing paired-end sequencing.

‡Average read lengths for the Roche 454 and Helicos Biosciences platforms.{{cite journal | vauthors = Voelkerding KV, Dames S, Durtschi JD | title = Next generation sequencing for clinical diagnostics-principles and application to targeted resequencing for hypertrophic cardiomyopathy: a paper from the 2009 William Beaumont Hospital Symposium on Molecular Pathology | journal = The Journal of Molecular Diagnostics | volume = 12 | issue = 5 | pages = 539–551 | date = September 2010 | pmid = 20805560 | pmc = 2928417 | doi = 10.2353/jmoldx.2010.100043 | name-list-style = amp }}
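
The per-run output figures in the table can be turned into an approximate depth of coverage by dividing the total bases produced by the genome size. The following is a minimal sketch of that arithmetic; the genome size of 3.1 Gb (roughly a haploid human genome) is an assumption not stated in this article, the function names are illustrative, and losses from duplicates, unmapped reads and quality filtering are ignored.

```python
# Back-of-the-envelope yield and coverage arithmetic.
# ASSUMPTION: 3.1 Gb genome size (approximate haploid human genome);
# this value does not come from the article.

HUMAN_GENOME_GB = 3.1  # assumed, in gigabases


def yield_gb(n_reads: float, read_length: int, paired_end: bool = False) -> float:
    """Total yield in gigabases for n_reads reads (or read pairs)."""
    bases = n_reads * read_length * (2 if paired_end else 1)
    return bases / 1e9


def mean_coverage(output_gb: float, genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Average number of times each genome position is sequenced."""
    if genome_gb <= 0:
        raise ValueError("genome size must be positive")
    return output_gb / genome_gb


if __name__ == "__main__":
    # 1 million reads of 400 bases each, single-end
    print(yield_gb(1e6, 400))
    # A run producing 1000 Gb, the HiSeq figure listed in the table
    print(round(mean_coverage(1000), 1))
```

Setting `paired_end=True` doubles the yield for the same number of fragments, which matches the note above that outputs approximately double for paired-end runs.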

Template preparation methods for NGS

Two methods are used in preparing templates for NGS reactions: amplified templates originating from single DNA molecules, and single DNA molecule templates.

For imaging systems which cannot detect single fluorescence events, amplification of DNA templates is required. The three most common amplification methods are emulsion PCR (emPCR), rolling circle and solid-phase amplification. The final distribution of templates can be spatially random or on a grid.

=Emulsion PCR=

In emulsion PCR methods, a DNA library is first generated through random fragmentation of genomic DNA. Single-stranded DNA fragments (templates) are attached to the surface of beads with adaptors or linkers, and one bead is attached to a single DNA fragment from the DNA library. The surface of the beads contains oligonucleotide probes with sequences that are complementary to the adaptors binding the DNA fragments. The beads are then compartmentalized into water-oil emulsion droplets. In the aqueous water-oil emulsion, each of the droplets capturing one bead is a PCR microreactor that produces amplified copies of the single DNA template.{{cite encyclopedia | vauthors = Chee-Seng K, Yun LE, Yudi P, Kee-Seng C | title = Next Generation Sequencing Technologies and Their Applications. | encyclopedia = Encyclopedia of Life Sciences (ELS) | publisher = John Wiley & Sons, Ltd | location = Chichester | date = April 2010 }}{{cite journal | vauthors = Metzker ML | title = Sequencing technologies - the next generation | journal = Nature Reviews. Genetics | volume = 11 | issue = 1 | pages = 31–46 | date = January 2010 | pmid = 19997069 | doi = 10.1038/nrg2626 | s2cid = 205484500 }}{{cite journal | vauthors = Dressman D, Yan H, Traverso G, Kinzler KW, Vogelstein B | title = Transforming single DNA molecules into fluorescent magnetic particles for detection and enumeration of genetic variations | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 100 | issue = 15 | pages = 8817–8822 | date = July 2003 | pmid = 12857956 | pmc = 166396 | doi = 10.1073/pnas.1133470100 | doi-access = free | bibcode = 2003PNAS..100.8817D }}
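
Whether a bead ends up carrying copies of exactly one template depends on how dilute the library is relative to the beads. As a rough sketch, if templates are assumed to land on beads according to a Poisson distribution (a common modelling assumption, not something taken from the sources cited here), the fractions of empty, clonal and mixed beads can be estimated as below. The parameter name `mean_templates_per_bead` is illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bead_fractions(mean_templates_per_bead: float) -> dict:
    """Split beads into empty, clonal (one template) and mixed (two or more).

    ASSUMPTION: template capture per bead follows a Poisson distribution.
    """
    if mean_templates_per_bead < 0:
        raise ValueError("mean must be non-negative")
    empty = poisson_pmf(0, mean_templates_per_bead)
    clonal = poisson_pmf(1, mean_templates_per_bead)
    return {"empty": empty, "clonal": clonal, "mixed": 1.0 - empty - clonal}


if __name__ == "__main__":
    for lam in (0.1, 0.5, 1.0):
        fractions = bead_fractions(lam)
        print(lam, {name: round(value, 3) for name, value in fractions.items()})
```

Under this model, lowering the template-to-bead ratio reduces the share of mixed beads at the cost of leaving more beads empty.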

=Gridded rolling circle nanoballs=

Amplification of a population of single DNA molecules by rolling circle amplification in solution is followed by capture on a grid of spots sized to be smaller than the DNAs to be immobilized.{{ cite patent|country=US|number=6485944|status=patent|title=Replica amplification of nucleic acid arrays|pubdate=2002-11-26|assign1=President and Fellows of Harvard College| inventor = Church GM, Mitra R }}{{cite journal | vauthors = Mitra RD, Church GM | title = In situ localized amplification and contact replication of many individual DNA molecules | journal = Nucleic Acids Research | volume = 27 | issue = 24 | pages = 34e–34 | date = December 1999 | pmid = 10572186 | pmc = 148757 | doi = 10.1093/nar/27.24.e34 }}{{cite patent|country=US|number=9624538|title=Nanogrid rolling circle DNA sequencing|pubdate=2017-04-18|assign1=President and Fellows of Harvard College| inventor = Church GM, Porreca GJ, Shendure J, Rosenbaum AM }}{{cite patent|country=US|number=8445194|status=patent|title=Single molecule arrays for genetic and chemical analysis|pubdate=2013-05-21|assign1=Callida Genomics Inc.| inventor = Drmanac R, Callow MJ, Drmanac S, Hauser BK, Yeung G }} Second-generation sequencing technologies like MGI Tech's DNBSEQ or Element Biosciences' AVITI use this approach for the preparation of the sample on the flow cell that is then imaged cycle by cycle.

=DNA colony generation (Bridge amplification)=

Forward and reverse primers are covalently attached at high density to the slide in a flow cell. The ratio of the primers to the template on the support defines the surface density of the amplified clusters. The flow cell is exposed to reagents for polymerase-based extension, and priming occurs as the free/distal end of a ligated fragment "bridges" to a complementary oligo on the surface. Repeated denaturation and extension results in localized amplification of DNA fragments in millions of separate locations across the flow cell surface. Solid-phase amplification produces 100–200 million spatially separated template clusters, providing free ends to which a universal sequencing primer is then hybridized to initiate the sequencing reaction. A patent application for this technology was filed in 1997 by Pascal Mayer, Eric Kawashima, and Laurent Farinelli of Glaxo-Wellcome's Geneva Biomedical Research Institute (GBRI), and the technology was publicly presented for the first time in 1998.{{cite conference | vauthors = Mayer P, Matton G, Adessi C, Turcatti G, Mermod JJ, Kawashima E | conference = Fifth International Automation in Mapping and DNA Sequencing Conference |location = St. Louis, MO, USA|date=October 7–10, 1998|title=A very large scale, high throughput and low cost DNA sequencing method based on a new 2-dimensional DNA auto-patterning process|url=http://www.slideshare.net/pascalmayer/dna-colony-massively-parrallel-sequencing-ams98-presentation | quote = DNA colony massively parallel sequencing ams98 presentation}} In 1994 Chris Adams and Steve Kron filed a patent on a similar, but non-clonal, surface amplification method, named “bridge amplification”,{{cite patent|country=US|number=5641658|pubdate=1997-06-24|title=Method for performing amplification of nucleic acid with two primers bound to a single solid support|assign1=Mosaic Technologies Inc.|assign2=Whitehead Institute for Biomedical Research| inventor = Adams CP, Kron SJ }} which was adapted for clonal amplification in 1997 by Church and Mitra.

=Single-molecule templates=

Protocols requiring DNA amplification are often cumbersome to implement and may introduce sequencing errors. The preparation of single-molecule templates is more straightforward and does not require PCR, which can introduce errors in the amplified templates. AT-rich and GC-rich target sequences often show amplification bias, which results in their underrepresentation in genome alignments and assemblies.

Single-molecule templates are usually immobilized on solid supports using one of at least three different approaches. In the first approach, spatially distributed individual primer molecules are covalently attached to the solid support. The template, which is prepared by randomly fragmenting the starting material into small sizes (for example, ~200–250 bp) and adding common adapters to the fragment ends, is then hybridized to the immobilized primer. In the second approach, spatially distributed single-molecule templates are covalently attached to the solid support by priming and extending single-stranded, single-molecule templates from immobilized primers. A common primer is then hybridized to the template.

In either approach, DNA polymerase can bind to the immobilized primed template configuration to initiate the NGS reaction. Both of the above approaches are used by Helicos BioSciences. In a third approach, spatially distributed single polymerase molecules are attached to the solid support, to which a primed template molecule is bound. This approach is used by Pacific Biosciences. Larger DNA molecules (up to tens of thousands of base pairs) can be used with this technique and, unlike the first two approaches, the third approach can be used with real-time methods, resulting in potentially longer read lengths.

Sequencing approaches

= Sequencing by synthesis =

The objective of sequencing by synthesis (SBS) is to determine the sequence of a DNA sample by detecting the incorporation of nucleotides by a DNA polymerase. An engineered polymerase is used to synthesize a copy of a single strand of DNA, and the incorporation of each nucleotide is monitored. The principle of sequencing by synthesis was first described in 1993, with improvements published some years later. The key parts are highly similar for all embodiments of SBS and include (1) amplification of DNA to enhance the subsequent signal and to attach the DNA to be sequenced to a solid support, (2) generation of single-stranded DNA on the solid support, (3) incorporation of nucleotides using an engineered polymerase and (4) detection of the incorporated nucleotide. Steps 3 and 4 are then repeated, and the sequence is assembled from the signals obtained in step 4. This principle of sequencing by synthesis has been used for almost all massive parallel sequencing instruments, including 454, PacBio, IonTorrent, Illumina and MGI.

= Pyrosequencing =

The principle of pyrosequencing was first described in 1993 by combining a solid support with an engineered DNA polymerase lacking 3′ to 5′ exonuclease (proofreading) activity and real-time luminescence detection using firefly luciferase. All the key concepts of sequencing by synthesis were introduced, including (1) amplification of DNA to enhance the subsequent signal and attach the DNA to be sequenced (template) to a solid support, (2) generation of single-stranded DNA on the solid support, (3) incorporation of nucleotides using an engineered polymerase and (4) detection of the incorporated nucleotide by light detection in real time. In a follow-up article, the concept was further developed, and in 1998 an article{{Cite journal |last1=Ronaghi |first1=Mostafa |last2=Uhlén |first2=Mathias |last3=Nyrén |first3=Pål |date=1998-07-17 |title=A Sequencing Method Based on Real-Time Pyrophosphate |url=https://www.science.org/doi/10.1126/science.281.5375.363 |journal=Science |language=en |volume=281 |issue=5375 |pages=363–365 |doi=10.1126/science.281.5375.363 |pmid=9705713 |s2cid=26331871 |issn=0036-8075|url-access=subscription }} was published in which the authors showed that non-incorporated nucleotides could be removed with a fourth enzyme (apyrase), allowing sequencing by synthesis to be performed without the need for washing away non-incorporated nucleotides.
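
In pyrosequencing, one nucleotide type is supplied at a time, and the light produced is roughly proportional to how many nucleotides are incorporated in that step. The sketch below models an idealized, noise-free version of this as a list of (nucleotide, signal) pairs, often called a flowgram. The flow order `TACG`, the function names and the `max_flows` cap are illustrative assumptions, and the input is taken to be the newly synthesized strand, not the template.

```python
from itertools import cycle


def flowgram(read: str, flow_order: str = "TACG", max_flows: int = 400):
    """Return (nucleotide, signal) per flow for an idealized, noise-free run.

    `read` is the strand being synthesized. The signal is the number of
    identical nucleotides incorporated during that flow, so a homopolymer
    of length n gives a signal of n in a single flow.
    """
    if not set(read) <= set(flow_order):
        raise ValueError("read contains a base that is never flowed")
    position = 0
    flows = []
    for nucleotide in cycle(flow_order):
        if position >= len(read) or len(flows) >= max_flows:
            break
        run_length = 0
        while position < len(read) and read[position] == nucleotide:
            run_length += 1
            position += 1
        flows.append((nucleotide, run_length))
    return flows


def decode(flows) -> str:
    """Rebuild the read by repeating each nucleotide `signal` times."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


if __name__ == "__main__":
    flows = flowgram("TTACGG")
    print(flows)
    print(decode(flows))
```

Because a homopolymer is read from the height of a single signal, real instruments must distinguish, for example, a signal of 5 from a signal of 6, which is harder than telling 1 from 2.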

=Sequencing by reversible terminator chemistry=

This approach uses reversible terminator-bound dNTPs in a cyclic method that comprises nucleotide incorporation, fluorescence imaging and cleavage.

A fluorescently-labeled terminator is imaged as each dNTP is added and then cleaved to allow incorporation of the next base.

These nucleotides are chemically blocked such that each incorporation is a unique event. An imaging step follows each base incorporation step, then the blocking group is chemically removed to prepare each strand for the next incorporation by DNA polymerase. This series of steps continues for a specific number of cycles, as determined by user-defined instrument settings. The 3' blocking groups were originally conceived as being reversed either enzymatically{{cite patent|title=Polynucleotide sequencing|country=US|number=6833246|pubdate=2004-12-21|assign1=Solexa Ltd.| inventor = Balasubramanian S }} or chemically.{{cite patent|title=Massive parallel method for decoding DNA and RNA|country=US|number=7790869|status=patent|pubdate=2010-09-07|assign1=The Trustees of Columbia University in the City of New York| inventor = Ju J, Li Z, Edwards JR, Itagaki Y }}{{cite journal | vauthors = Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, etal | title = Accurate whole human genome sequencing using reversible terminator chemistry | journal = Nature | volume = 456 | issue = 7218 | pages = 53–59 | date = November 2008 | pmid = 18987734 | pmc = 2581791 | doi = 10.1038/nature07517 | bibcode = 2008Natur.456...53B }} The chemical method has been the basis for the Solexa and Illumina machines.

Sequencing by reversible terminator chemistry can use a four-colour cycle, as in Illumina/Solexa instruments, or a one-colour cycle, as used by Helicos BioSciences.

Helicos BioSciences used “virtual Terminators”, which are unblocked terminators with a second nucleoside analogue that acts as an inhibitor. These terminators have the appropriate modifications for terminating or inhibiting groups so that DNA synthesis is terminated after a single base addition.{{cite web |url=http://www.illumina.com/company/assay_technology.ilmn |title=Assay Technology |publisher=Illumina |access-date=2012-08-05 |archive-url=https://web.archive.org/web/20120826030201/http://www.illumina.com/company/assay_technology.ilmn |archive-date=2012-08-26 |url-status=dead }}{{cite web |url=http://www.helicosbio.com/Technology/tabid/64/Default.aspx |title=True Single Molecule Sequencing (tSMS™): Helicos BioSciences |publisher=Helicosbio.com |access-date=2012-08-05 |archive-url=https://web.archive.org/web/20120311112120/http://www.helicosbio.com/Technology/tabid/64/Default.aspx |archive-date=2012-03-11 |url-status=dead }}

=Sequencing-by-ligation mediated by ligase enzymes=

In this approach, the sequence extension reaction is not carried out by polymerases but rather by DNA ligase and either one-base-encoded probes or two-base-encoded probes. In its simplest form, a fluorescently labelled probe hybridizes to its complementary sequence adjacent to the primed template. DNA ligase is then added to join the dye-labelled probe to the primer. Non-ligated probes are washed away, followed by fluorescence imaging to determine the identity of the ligated probe.

The cycle can be repeated either by using cleavable probes to remove the fluorescent dye and regenerate a 5′-PO4 group for subsequent ligation cycles (chained ligation{{cite web|url=http://appliedbiosystems.cnpg.com/Video/flatFiles/699/index.aspx |title=Fundamentals of 2 Base Encoding and Color Space |publisher=Appliedbiosystems.cnpg.com |access-date=2012-08-05}}) or by removing and hybridizing a new primer to the template (unchained ligation).
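For two-base-encoded probes, each fluorescent colour reports on a pair of adjacent bases rather than on a single base. The sketch below assumes the commonly described four-colour scheme in which the colour equals the bitwise XOR of two-bit base codes (A=0, C=1, G=2, T=3); this mapping and the function names are assumptions made for illustration and are not stated in the text above. Decoding requires one known starting base, typically the last base of the primer or adapter.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(sequence):
    """Translate a base sequence into colours, one per adjacent base pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base, colors):
    """Recover the base sequence from a known first base and the colours."""
    bases = [first_base]
    for color in colors:
        # XOR is its own inverse, so previous base + colour gives the next base.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


if __name__ == "__main__":
    colors = encode_colors("ATGGC")
    print(colors)
    print(decode_colors("A", colors))
```

Under this scheme identical neighbouring bases always give colour 0, and a single wrong colour call changes every base decoded after it, which is why colour-space reads are usually compared with a reference before translation to bases.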

=Phospholinked Fluorescent Nucleotides or Real-time sequencing=

Pacific Biosciences is the leading commercial developer of this method.

The method of real-time sequencing involves imaging the continuous incorporation of dye-labelled nucleotides during DNA synthesis: single DNA polymerase molecules are attached to the bottom surface of individual zero-mode waveguide detectors (ZMW detectors) that can obtain sequence information while phospholinked nucleotides are being incorporated into the growing primer strand.

Pacific Biosciences uses a DNA polymerase that incorporates phospholinked nucleotides more efficiently and enables the resequencing of closed circular templates.

While single-read accuracy is 87%, consensus accuracy has been demonstrated at 99.999% with multi-kilobase read lengths.{{cite journal | vauthors = Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, Turner SW, Korlach J | display-authors = 6 | title = Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data | journal = Nature Methods | volume = 10 | issue = 6 | pages = 563–569 | date = June 2013 | pmid = 23644548 | doi = 10.1038/nmeth.2474 | s2cid = 205421576 }}{{cite web | title=PacBio Users Report Progress in Long Reads for Plant Genome Assembly, Tricky Regions of Human Genome |date=March 5, 2013 |author=Monica Heger | url=http://www.genomeweb.com/sequencing/pacbio-users-report-progress-long-reads-plant-genome-assembly-tricky-regions-hum}} In 2015, Pacific Biosciences released a new sequencing instrument called the Sequel System, which increases capacity approximately 6.5-fold.{{Cite web | url=https://www.genomeweb.com/business-news/pacbio-launches-higher-throughput-lower-cost-single-molecule-sequencing-system |title = PacBio Launches Higher-Throughput, Lower-Cost Single-Molecule Sequencing System|date = October 2015}}{{cite web |url=http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx |title=PacBio Announces Sequel Sequencing System - Bio-IT World |website=www.bio-itworld.com |url-status=dead |archive-url=https://web.archive.org/web/20151002033528/http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx |archive-date=2015-10-02}}
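The gap between single-read and consensus accuracy can be illustrated with a simple majority-vote calculation. The model below is a deliberate simplification and is not the method used in the cited studies: it assumes each pass or overlapping read calls a given base correctly with the same independent probability, and it counts a consensus base as correct only when more than half of the calls are correct. Real errors in single-molecule reads are not independent substitutions, so the numbers are indicative only. The function names are illustrative.

```python
from math import comb


def majority_consensus_accuracy(per_read_accuracy, n_reads):
    """Probability that a strict majority of `n_reads` calls is correct."""
    if n_reads < 1 or n_reads % 2 == 0:
        raise ValueError("n_reads must be a positive odd number")
    p = per_read_accuracy
    needed = n_reads // 2 + 1
    # Binomial tail: at least `needed` correct calls out of `n_reads`.
    return sum(
        comb(n_reads, k) * p**k * (1 - p) ** (n_reads - k)
        for k in range(needed, n_reads + 1)
    )


def reads_needed(per_read_accuracy, target, max_reads=999):
    """Smallest odd read count whose majority vote reaches `target`."""
    for n_reads in range(1, max_reads + 1, 2):
        if majority_consensus_accuracy(per_read_accuracy, n_reads) >= target:
            return n_reads
    raise ValueError("target not reached within max_reads")


if __name__ == "__main__":
    for n in (1, 3, 5, 11, 21):
        print(n, majority_consensus_accuracy(0.87, n))
    print(reads_needed(0.87, 0.99999))
```

The calculation shows why repeated observation of the same position, whether by multiple passes around a circular template or by overlapping reads, lets consensus accuracy exceed single-read accuracy by a wide margin.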
