Single-molecule real-time sequencing

{{Short description|Method for sequencing DNA}}

Single-molecule real-time (SMRT) sequencing is a parallelized single molecule DNA sequencing method. Single-molecule real-time sequencing utilizes a zero-mode waveguide (ZMW).{{cite journal|display-authors=3|vauthors=Levene MJ, Korlach J, Turner SW, Foquet M, Craighead HG, Webb WW|date=2003|title=Zero-Mode Waveguides for Single-Molecule Analysis at High Concentrations|journal=Science|volume=299|issue=5607|pages=682–6|bibcode=2003Sci...299..682L|doi=10.1126/science.1079700|pmid=12560545|s2cid=6060239}} A single DNA polymerase enzyme is affixed at the bottom of a ZMW with a single molecule of DNA as a template. The ZMW is a structure that creates an illuminated observation volume that is small enough to observe only a single nucleotide of DNA being incorporated by DNA polymerase. Each of the four DNA bases is attached to one of four different fluorescent dyes. When a nucleotide is incorporated by the DNA polymerase, the fluorescent tag is cleaved off and diffuses out of the observation area of the ZMW where its fluorescence is no longer observable. A detector detects the fluorescent signal of the nucleotide incorporation, and the base call is made according to the corresponding fluorescence of the dye.{{cite journal|display-authors=3|vauthors=Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, Bibillo A, Bjornson K, Chaudhuri B, Christians F, Cicero R, Clark S, Dalal R, Dewinter A, Dixon J, Foquet M, Gaertner A, Hardenbol P, Heiner C, Hester K, Holden D, Kearns G, Kong X, Kuse R, Lacroix Y, Lin S, Lundquist P, Ma C, Marks P, Maxham M, Murphy D, Park I, Pham T, Phillips M, Roy J, Sebra R, Shen G, Sorenson J, Tomaney A, Travers K, Trulson M, Vieceli J, Wegener J, Wu D, Yang A, Zaccarin D, Zhao P, Zhong F, Korlach J, Turner S|date=2009|title=Real-Time DNA Sequencing from Single Polymerase Molecules|journal=Science|volume=323|issue=5910|pages=133–8|bibcode=2009Sci...323..133E|doi=10.1126/science.1162986|pmid=19023044|s2cid=54488479}}

Technology

Sequencing is performed on a chip that contains many ZMWs. Inside each ZMW, a single active DNA polymerase bound to a single molecule of single-stranded DNA template is immobilized at the bottom. Light penetrating the bottom of the ZMW creates a small observation chamber in which the activity of the DNA polymerase can be monitored at the single-molecule level. The signal from each phospholinked nucleotide incorporated by the DNA polymerase is detected as DNA synthesis proceeds, so the sequence is read in real time.

= Template preparation =

To prepare the library, DNA fragments are put into a circular form using hairpin adapter ligations.{{cite book | last=Friedmann | first=Theodore | title=Advances in genetics | publisher=Academic | publication-place=Oxford | year=2012 | isbn=978-0-12-394395-8 | oclc=813987819 | language=nl | page=}}

= Phospholinked nucleotide =

For each of the nucleotide bases, there is a corresponding fluorescent dye molecule that enables the detector to identify the base being incorporated by the DNA polymerase as it performs the DNA synthesis. The fluorescent dye molecule is attached to the phosphate chain of the nucleotide. When the nucleotide is incorporated by the DNA polymerase, the fluorescent dye is cleaved off with the phosphate chain as a part of a natural DNA synthesis process during which a phosphodiester bond is created to elongate the DNA chain. The cleaved fluorescent dye molecule then diffuses out of the detection volume so that the fluorescent signal is no longer detected.{{Cite web|url=https://www.ndsu.edu/pubweb/~mcclean/plsc411/Pacific%20Biosciences-technology_backgrounder.pdf|title=Pacific Biosciences Develops Transformative DNA Sequencing Technology|date=2008|website=Pacific Biosciences Technology Backgrounder}}
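The mapping from dye color to base described above can be illustrated with a toy base caller. This is only a minimal sketch: the channel-to-base assignment, the `Pulse` structure, and the minimum pulse duration used to discard very short signals are all hypothetical and are not taken from any PacBio software.

```python
from dataclasses import dataclass

# Hypothetical dye-channel-to-base assignment (illustration only).
DYE_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}


@dataclass
class Pulse:
    start: float     # time at which the fluorescence pulse begins, in seconds
    duration: float  # how long the dye stays in the observation volume, in seconds
    channel: int     # index of the dye color with the strongest signal


def call_bases(pulses, min_duration=0.01):
    """Translate fluorescence pulses into a base sequence.

    Pulses are read in time order. Pulses shorter than `min_duration`
    (an assumed threshold) are skipped, on the assumption that they come
    from nucleotides that diffused through without being incorporated.
    """
    ordered = sorted(pulses, key=lambda p: p.start)
    return "".join(
        DYE_TO_BASE[p.channel] for p in ordered if p.duration >= min_duration
    )


pulses = [
    Pulse(start=0.30, duration=0.05, channel=2),
    Pulse(start=0.10, duration=0.08, channel=0),
    Pulse(start=0.55, duration=0.002, channel=3),  # too short, ignored
    Pulse(start=0.70, duration=0.06, channel=1),
]
read = call_bases(pulses)
print(read)  # AGC
```

Real base callers work on noisy multi-channel intensity traces rather than on pre-segmented pulses, so this only shows the final color-to-base step.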

= Zero-mode waveguide =

The zero-mode waveguide (ZMW) is a nanophotonic confinement structure that consists of a circular hole in an aluminum cladding film deposited on a clear silica substrate.{{cite journal|display-authors=3|vauthors=Korlach J, Marks PJ, Cicero RL, Gray JJ, Murphy DL, Roitman DB, Pham TT, Otto GA, Foquet M, Turner SW|date=2008|title=Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures|journal=PNAS|volume=105|issue=4|pages=1176–81|bibcode=2008PNAS..105.1176K|doi=10.1073/pnas.0710982105|pmc=2234111|pmid=18216253|doi-access=free}}

The ZMW holes are ~70 nm in diameter and ~100 nm in depth. Due to the behavior of light when it travels through a small aperture, the optical field decays exponentially inside the chamber.{{cite journal|display-authors=3|vauthors=Foquet M, Samiee KT, Kong X, Chauduri BP, Lundquist PM, Turner SW, Freudenthal J, Roitman DB|date=2008|title=Improved fabrication of zero-mode waveguides for single-molecule detection|journal=J. Appl. Phys.|volume=103|issue=3|pages=034301–034301–9|bibcode=2008JAP...103c4301F|doi=10.1063/1.2831366|s2cid=38892226}}{{cite journal | last1=Zhu | first1=Paul | last2=Craighead | first2=Harold G. | title=Zero-Mode Waveguides for Single-Molecule Analysis | journal=Annual Review of Biophysics | publisher=Annual Reviews | volume=41 | issue=1 | date=2012-06-09 | issn=1936-122X | doi=10.1146/annurev-biophys-050511-102338 | pages=269–293| pmid=22577821 }}

The observation volume within an illuminated ZMW is ~20 zeptoliters (20 × 10⁻²¹ liters). Because the observation volume is so small, background fluorescence from the free, unincorporated fluorescent nucleotides in solution is effectively eliminated. Within this volume, the activity of DNA polymerase incorporating a single nucleotide can be readily detected, with each of the four nucleotides labeled in a separate color.{{cite journal | last1=Baibakov | first1=Mikhail | last2=Barulin | first2=Aleksandr | last3=Roy | first3=Prithu | last4=Claude | first4=Jean-Benoît | last5=Patra | first5=Satyajit | last6=Wenger | first6=Jérôme | title=Zero-mode waveguides can be made better: fluorescence enhancement with rectangular aluminum nanoapertures from the visible to the deep ultraviolet | journal=Nanoscale Advances | volume=2 | issue=9 | date=1999-02-22 | doi=10.1039/D0NA00366B | pages=4153–4160 | pmid=36132755 | pmc=9417158 }}
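A short calculation helps show why such a small volume suppresses background. The sketch below compares the ~20 zeptoliter observation volume with the geometric volume of a 70 nm by 100 nm cylindrical hole, and estimates the average number of free labeled nucleotides inside the observation volume. The 1 µM nucleotide concentration is an assumption chosen purely for illustration; it is not a figure from the sources cited here.

```python
import math

AVOGADRO = 6.02214076e23       # molecules per mole
OBSERVATION_VOLUME_L = 20e-21  # ~20 zeptoliters, as stated in the text


def cylinder_volume_liters(diameter_nm, depth_nm):
    """Geometric volume of a cylindrical hole, converted from m^3 to liters."""
    radius_m = diameter_nm * 1e-9 / 2
    depth_m = depth_nm * 1e-9
    return math.pi * radius_m**2 * depth_m * 1000.0


def molecules_in_volume(concentration_molar, volume_liters):
    """Average number of molecules in a volume at a given molar concentration."""
    return concentration_molar * AVOGADRO * volume_liters


hole_volume = cylinder_volume_liters(diameter_nm=70, depth_nm=100)
observed_fraction = OBSERVATION_VOLUME_L / hole_volume

# Assumed concentration of free labeled nucleotides (illustration only).
assumed_concentration = 1e-6  # 1 micromolar
mean_occupancy = molecules_in_volume(assumed_concentration, OBSERVATION_VOLUME_L)

print(f"hole volume: {hole_volume / 1e-21:.0f} zL")
print(f"fraction of hole that is observed: {observed_fraction:.1%}")
print(f"mean free nucleotides in observation volume: {mean_occupancy:.3f}")
```

Under these assumptions the illuminated region is only a few percent of the hole, and on average it holds far fewer than one free nucleotide at any instant, which is consistent with the low background described above.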

Sequencing performance

Sequencing performance can be measured in read length, accuracy, and total throughput per experiment. PacBio sequencing systems using ZMWs have the advantage of long read lengths, although error rates are on the order of 5-15% and sample throughput is lower than Illumina sequencing platforms.{{cite journal |last1=Pollock |first1=Jolinda |last2=Glendinning |first2=Laura |last3=Wisedchanwet |first3=Trong |last4=Watson |first4=Mick |date= 2018|title= The Madness of Microbiome: Attempting To Find Consensus "Best Practice" for 16S Microbiome Studies |journal= Applied and Environmental Microbiology |volume=84 |issue=7 |pages=e02627-17 |doi=10.1128/AEM.02627-17 |pmid=29427429 |pmc=5861821 |bibcode=2018ApEnM..84E2627P |doi-access=free }}

On 19 September 2018, Pacific Biosciences (PacBio) released the Sequel 6.0 chemistry, synchronizing the chemistry version with the software version. Performance is reported separately for large-insert libraries made from high-molecular-weight DNA and for shorter-insert libraries below ~15,000 bases in length. For larger templates, average read lengths are up to 30,000 bases. For shorter-insert libraries, average read lengths are up to 100,000 bases, because the same circular molecule is read several times. These shorter-insert libraries yield up to 50 billion bases from a single SMRT Cell.{{cite web|url=https://twitter.com/PacBio/status/1042417439441645570|title=PacBio Post|date=19 Sep 2018|website=Twitter}}

History

Pacific Biosciences (PacBio) commercialized SMRT sequencing in 2011,{{cite web|url=http://www.genomeweb.com/sequencing/pacbio-ships-first-two-commercial-systems-order-backlog-grows-44|title=PacBio Ships First Two Commercial Systems; Order Backlog Grows to 44|last=Karow J|date=3 May 2011|work=GenomeWeb|url-access=registration}} after releasing a beta version of its RS instrument in late 2010.{{cite web|url=http://www.genomeweb.com/sequencing/pacbio-reveals-beta-system-specs-rs-says-commercial-release-track-first-half-201|title=PacBio Reveals Beta System Specs for RS; Says Commercial Release is on Track for First Half of 2011|last=Karow J|date=7 Dec 2010|work=GenomeWeb|url-access=registration}}

= RS and RS II =

At commercialization, read length had a normal distribution with a mean of about 1100 bases. A new chemistry kit released in early 2012 increased the sequencer's read length; an early customer of the chemistry cited mean read lengths of 2500 to 2900 bases.{{cite web|url=http://www.genomeweb.com/sequencing/after-year-testing-two-early-pacbio-customers-expect-more-routine-use-rs-sequenc|title=After a Year of Testing, Two Early PacBio Customers Expect More Routine Use of RS Sequencer in 2012|last=Karow J|date=10 Jan 2012|work=GenomeWeb|url-access=registration}}

The XL chemistry kit released in late 2012 increased average read length to more than 4300 bases.{{cite web|url=http://www.genomeweb.com/sequencing/pacbios-xl-chemistry-increases-read-lengths-and-throughput-cshl-tests-tech-rice|title=PacBio's XL Chemistry Increases Read Lengths and Throughput; CSHL Tests the Tech on Rice Genome|last=Heger M|date=13 Nov 2012|work=GenomeWeb|url-access=registration}}{{cite web|url=http://www.genomeweb.com/sequencing/pacbio-users-report-progress-long-reads-plant-genome-assembly-tricky-regions-hum|title=PacBio Users Report Progress in Long Reads for Plant Genome Assembly, Tricky Regions of Human Genome|last=Heger M|date=5 Mar 2013|work=GenomeWeb|url-access=registration}}

On August 21, 2013, PacBio released a new DNA polymerase Binding Kit P4. This P4 enzyme has average read lengths of more than 4,300 bases when paired with the C2 sequencing chemistry and more than 5,000 bases when paired with the XL chemistry.{{cite web|url=https://www.pacb.com/uncategorized/new-dna-polymerase-p4-delivers-higher/|title=New DNA Polymerase P4 Delivers Higher-Quality Assemblies Using Fewer SMRT Cells|date=21 Aug 2013|work=PacBio Blog}} The enzyme’s accuracy is similar to C2, reaching QV50 between 30X and 40X coverage. The resulting P4 attributes provided higher-quality assemblies using fewer SMRT Cells and with improved variant calling. Coupling the chemistry with input DNA size selection (using an electrophoresis instrument such as BluePippin) yields an average read length of over 7 kilobases.{{cite web|url=http://flxlexblog.wordpress.com/2013/06/19/longing-for-the-longest-reads-pacbio-and-bluepippin/|title=Longing for the longest reads: PacBio and BluePippin|last=lexnederbragt|date=19 Jun 2013|work=In between lines of code}}

On October 3, 2013, PacBio released a new reagent combination for the PacBio RS II, the P5 DNA polymerase with C3 chemistry (P5-C3). Together, they extend sequencing read lengths to an average of approximately 8,500 bases, with the longest reads exceeding 30,000 bases.{{Cite web|url=https://www.pacb.com/uncategorized/new-chemistry-for-pacbio-rs-ii-provides/|title=New Chemistry for PacBio RS II Provides Average 8.5 kb Read Lengths for Complex Genome Studies|date=3 Oct 2013|website=PacBio Blog}} Throughput per SMRT Cell is around 500 million bases, as demonstrated by sequencing results from the CHM1 cell line.{{cite journal|display-authors=3|vauthors=Chaisson MJ, Huddleston J, Dennis MY, Sudmant PH, Malig M, Hormozdiari F, Antonacci F, Surti U, Sandstrom R, Boitano M, Landolin JM, Stamatoyannopoulos JA, Hunkapiller MW, Korlach J, Eichler EE|date=2014|title=Resolving the complexity of the human genome using single-molecule sequencing|journal=Nature|volume=517|issue=7536|pages=608–11|doi=10.1038/nature13907|pmc=4317254|pmid=25383537|bibcode=2015Natur.517..608C}}

On October 15, 2014, PacBio announced the release of new chemistry P6-C4 for the RS II system, which represents the company's 6th generation of polymerase and 4th generation of chemistry, further extending the average read length to 10,000 - 15,000 bases, with the longest reads exceeding 40,000 bases. The throughput with the new chemistry was estimated at between 500 million and 1 billion bases per SMRT Cell, depending on the sample being sequenced.{{Cite web|url=http://investor.pacificbiosciences.com/news-releases/news-release-details/pacific-biosciences-releases-new-dna-sequencing-chemistry|title=Pacific Biosciences Releases New DNA Sequencing Chemistry to Enhance Read Length and Accuracy for the Study of Human and Other Complex Genomes|date=15 Oct 2014|website=Pacific Biosciences|type=Press Release}}{{cite web|url=https://www.pacb.com/blog/new-chemistry-boosts-average-read/|title=New Chemistry Boosts Average Read Length to 10 kb – 15 kb for PacBio RS II|date=15 Oct 2014|website=PacBio Blog}} This was the final version of chemistry released for the RS instrument.

Throughput per experiment is influenced by both the read length of the DNA molecules sequenced and the total multiplex of a SMRT Cell. The prototype of the SMRT Cell contained about 3000 ZMW holes that allowed parallelized DNA sequencing. At commercialization, the SMRT Cells were each patterned with 150,000 ZMW holes that were read in two sets of 75,000.{{cite web|url=http://www.pacificbiosciences.com/products/consumables/SMRT-cells/|title=SMRT Cells, sequencing reagent kits, and accessories for the PacBio RS II|date=2020|work=Pacific Biosciences|access-date=2012-04-28|archive-date=2013-04-21|archive-url=https://web.archive.org/web/20130421130113/http://www.pacificbiosciences.com/products/consumables/SMRT-cells/|url-status=dead}} In April 2013, the company released a new version of the sequencer called the "PacBio RS II" that uses all 150,000 ZMW holes concurrently, doubling the throughput per experiment.{{cite web|url=http://nextgenseek.com/2013/04/pacbio-launches-pacbio-rs-ii-sequencer/|title=PacBio Launches PacBio RS II Sequencer|date=11 Apr 2013|work=Next Gen Seek|access-date=18 April 2013|archive-date=19 December 2019|archive-url=https://web.archive.org/web/20191219185400/http://nextgenseek.com/2013/04/pacbio-launches-pacbio-rs-ii-sequencer/|url-status=dead}}{{cite web|url=http://www.genomeweb.com/sequencing/new-products-pacbios-rs-ii-cufflinks|title=New Products: PacBio's RS II; Cufflinks|date=16 Apr 2013|work=GenomeWeb|url-access=registration}} The highest-throughput mode in November 2013, which used P5 binding, C3 chemistry, BluePippin size selection, and a PacBio RS II, officially yielded 350 million bases per SMRT Cell, though a human de novo data set released with the chemistry averaged 500 million bases per SMRT Cell. Throughput varies based on the type of sample being sequenced.{{cite web|url=https://twitter.com/DukeSequencing/status/373427511272538112|title=Duke Sequencing Post|date=30 Aug 2013|work=Twitter}} With the introduction of P6-C4 chemistry, typical throughput per SMRT Cell increased to between 500 million and 1 billion bases.
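The relationship between read length, ZMW count, and throughput can be made concrete with a back-of-the-envelope estimate. The sketch below assumes, as a simplification, that throughput equals the number of productive ZMWs multiplied by the average read length, with each productive ZMW contributing exactly one read. The function name is hypothetical; the input figures (500 million bases, 8,500-base average reads, 150,000 ZMWs) are the P5-C3 and RS II values quoted in this article.

```python
TOTAL_ZMWS = 150_000  # ZMW holes per RS II SMRT Cell


def productive_zmws(throughput_bases, mean_read_length):
    """Estimate how many ZMWs produced a read, assuming one read per ZMW."""
    return throughput_bases / mean_read_length


# P5-C3 figures from the text: ~500 million bases, ~8,500-base average reads.
estimated_reads = productive_zmws(500e6, 8500)
productive_fraction = estimated_reads / TOTAL_ZMWS

print(f"estimated productive ZMWs: {estimated_reads:,.0f}")
print(f"fraction of all ZMWs: {productive_fraction:.0%}")
```

Under this simplified model, only a minority of the ZMWs on a cell yield a read, which is one reason throughput depends on the sample and library as well as on the number of holes.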

{| class="wikitable sortable"
|+ RS Performance
!
! C1
! C2
! P4-XL
! P5-C3
! P6-C4
|-
| Average read length (bases)
| 1100
| 2500 - 2900
| 4300 - 5000
| 8500
| 10,000 - 15,000
|-
| Throughput per SMRT Cell (bases)
| 30M - 40M
| 60M - 100M
| 250M - 300M
| 350M - 500M
| 500M - 1B
|}

= Sequel =

In September 2015, the company announced the launch of a new sequencing instrument, the Sequel System, that increased capacity to 1 million ZMW holes.{{cite web|url=http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx|title=PacBio Announces Sequel Sequencing System|date=30 Sep 2015|website=Bio-IT World|access-date=16 November 2015|archive-date=29 July 2020|archive-url=https://web.archive.org/web/20200729220749/http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx|url-status=dead}}{{cite web|url=https://www.genomeweb.com/business-news/pacbio-launches-higher-throughput-lower-cost-single-molecule-sequencing-system|title=PacBio Launches Higher-Throughput, Lower-Cost Single-Molecule Sequencing System|last=Heger M|date=1 Oct 2015|website=GenomeWeb|url-access=registration}}

With the Sequel instrument, initial read lengths were comparable to those of the RS; later chemistry releases increased read length.

On January 23, 2017, the V2 chemistry was released. It increased average read lengths to between 10,000 and 18,000 bases.{{cite web|url=https://www.pacb.com/blog/new-chemistry-software-sequel-system-improve-read-length-lower-project-costs/|title=New Chemistry and Software for Sequel System Improve Read Length, Lower Project Costs|date=9 Jan 2017|website=PacBio Blog}}

On March 8, 2018, the 2.1 chemistry was released. It increased average read length to 20,000 bases, with half of all reads above 30,000 bases in length. Yield per SMRT Cell increased to 10 or 20 billion bases, for large-insert libraries or shorter-insert (e.g. amplicon) libraries respectively.{{cite web|url=https://www.pacb.com/blog/new-software-polymerase-sequel-system-boost-throughput-affordability/|title=New Software, Polymerase for Sequel System Boost Throughput and Affordability|date=7 Mar 2018|website=PacBio Blog}}

On 19 September 2018, the company announced the Sequel 6.0 chemistry with average read lengths increased to 100,000 bases for shorter-insert libraries and 30,000 for longer-insert libraries. SMRT Cell yield increased up to 50 billion bases for shorter-insert libraries.

{| class="wikitable sortable"
|+ Sequel Performance
!
! V2
! 2.1
! 6.0
|-
| Average read length (bases)
| 10,000 - 18,000
| 20,000 - 30,000
| 30,000 - 100,000
|-
| Throughput per SMRT Cell (bases)
| 5B - 8B
| 10B - 20B
| 20B - 50B
|}

= 8M Chip =

In April 2019, the company released a new SMRT Cell with eight million ZMWs,{{Cite web|url=https://www.bio-itworld.com/2019/04/26/pacbio-launches-sequel-ii-system.aspx|title=PacBio Launches Sequel II System|date=26 Apr 2019|website=Bio-IT World}} increasing the expected throughput per SMRT Cell by a factor of eight.{{Cite web |url=http://investor.pacificbiosciences.com/static-files/e53d5ef9-02cd-42ab-9d86-3037ad9deaec |title=Archived copy |access-date=2018-09-24 |archive-date=2018-09-24 |archive-url=https://web.archive.org/web/20180924070931/http://investor.pacificbiosciences.com/static-files/e53d5ef9-02cd-42ab-9d86-3037ad9deaec |url-status=dead }} In March 2019, early-access customers reported throughput across 58 customer-run cells of 250 GB of raw yield per cell with templates about 15 kb in length, and 67.4 GB of yield per cell with higher-molecular-weight templates.{{Cite web|url=https://www.genomeweb.com/sequencing/pacbio-shares-early-access-customer-experiences-new-applications-sequel-ii|title=PacBio Shares Early-Access Customer Experiences, New Applications for Sequel II|last=Heger M|date=7 Mar 2019|website=GenomeWeb|url-access=registration}} System performance is now reported either in high-molecular-weight continuous long reads or in pre-corrected HiFi (also known as Circular Consensus Sequence (CCS)) reads. For high-molecular-weight reads, roughly half of all reads are longer than 50 kb.

{| class="wikitable sortable"
|+ Sequel II High-Molecular-Weight Performance
!
! Early Access
! 1.0
! 2.0
|-
| Throughput per SMRT Cell
| ~67.4 GB
| Up to 160 GB
| Up to 200 GB
|}

HiFi performance counts corrected bases with quality above Phred score Q20, obtained by using repeated passes over the same amplicon for correction. Amplicons of up to 20 kb in length can be used.
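The Phred scale used for the Q20 threshold is logarithmic: a quality score Q corresponds to an error probability of 10^(−Q/10). The following sketch converts between the two using the standard Phred definition; the function names are chosen here for illustration.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call is wrong, given its Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score corresponding to an error probability."""
    return -10 * math.log10(p)


for q in (20, 30, 50):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:g}, accuracy {1 - p:.5%}")
```

On this scale, Q20 corresponds to one expected error per 100 bases (99% accuracy), and the QV50 figure quoted for consensus accuracy earlier in the article corresponds to one expected error per 100,000 bases.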

{| class="wikitable sortable"
|+ Sequel II HiFi Corrected Read Performance
!
! Early Access
! 1.0
! 2.0
|-
| Raw reads per SMRT Cell
| ~250 GB
| Up to 360 GB
| Up to 500 GB
|-
| Corrected reads per SMRT Cell (>Q20)
| ~25 GB
| Up to 36 GB
| Up to 50 GB
|}

Application

Single-molecule real-time sequencing may be applicable for a broad range of genomics research.

For de novo genome sequencing, read lengths from single-molecule real-time sequencing are comparable to or greater than those from the Sanger sequencing method based on dideoxynucleotide chain termination. The longer read length allows de novo genome sequencing and easier genome assemblies.{{cite journal|display-authors=3|vauthors=Rasko DA, Webster DR, Sahl JW, Bashir A, Boisen N, Scheutz F, Paxinos EE, Sebra R, Chin CS, Iliopoulos D, Klammer A, Peluso P, Lee L, Kislyuk AO, Bullard J, Kasarskis A, Wang S, Eid J, Rank D, Redman JC, Steyert SR, Frimodt-Møller J, Struve C, Petersen AM, Krogfelt KA, Nataro JP, Schadt EE, Waldor MK|date=2011|title=Origins of the E. coli Strain Causing an Outbreak of Hemolytic–Uremic Syndrome in Germany|journal=N. Engl. J. Med.|volume=365|issue=8|pages=709–17|doi=10.1056/NEJMoa1106920|pmc=3168948|pmid=21793740}}{{cite journal|display-authors=3|vauthors=Chin CS, Sorenson J, Harris JB, Robins WP, Charles RC, Jean-Charles RR, Bullard J, Webster DR, Kasarskis A, Peluso P, Paxinos EE, Yamaichi Y, Calderwood SB, Mekalanos JJ, Schadt EE, Waldor MK|date=2011|title=The Origin of the Haitian Cholera Outbreak Strain|journal=N. Engl. J. Med.|volume=364|issue=1|pages=33–42|doi=10.1056/NEJMoa1012928|pmc=3030187|pmid=21142692}} Scientists are also using single-molecule real-time sequencing in hybrid assemblies for de novo genomes to combine short-read sequence data with long-read sequence data.{{Cite journal|display-authors=3|vauthors=Gao H, Green SJ, Jafari N, Kiss A, Lyons R, Thomas WK|date=2012|title=Tech Tips: Next-Generation Sequencing|url=https://www.genengnews.com/magazine/180/tech-tips-next-generation-sequencing/4074/|journal=Genetic Engineering & Biotechnology News|volume=32|issue=8}}{{Cite web|url=http://schatzlab.cshl.edu/presentations/2011-09-07.PacBio%20Users%20Meeting.pdf|title=SMRT-assembly approaches|last=Schatz M|date=7 Sep 2011|website=schatzlab.cshl.edu|type=PacBio Users Meeting}} In 2012, several peer-reviewed publications were released demonstrating the automated finishing of bacterial genomes,{{cite journal|display-authors=3|vauthors=Ribeiro FJ, Przybylski D, Yin S, Sharpe T, Gnerre S, Abouelleil A, Berlin AM, Montmayeur A, Shea TP, Walker BJ, Young SK, Russ C, Nusbaum C, MacCallum I, Jaffe DB|date=2012|title=Finished bacterial genomes from shotgun sequence data|journal=Genome Res.|volume=22|issue=11|pages=2270–7|doi=10.1101/gr.141515.112|pmc=3483556|pmid=22829535}}{{cite journal|display-authors=3|vauthors=Bashir A, Klammer A, Robins WP, Chin CS, Webster D, Paxinos E, Hsu D, Ashby M, Wang S, Peluso P, Sebra R, Sorenson J, Bullard J, Yen J, Valdovino M, Mollova E, Luong K, Lin S, LaMay B, Joshi A, Rowe L, Frace M, Tarr CL, Turnsek M, Davis BM, Kasarskis A, Mekalanos JJ, Waldor MK, Schadt EE|date=2012|title=A hybrid approach for the automated finishing of bacterial genomes|journal=Nat. Biotechnol.|volume=30|issue=7|pages=701–7|doi=10.1038/nbt.2288|pmc=3731737|pmid=22750883}} including one paper that updated the Celera Assembler with a pipeline for genome finishing using long SMRT sequencing reads.{{cite journal|display-authors=3|vauthors=Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, Phillippy AM|date=2012|title=Hybrid error correction and de novo assembly of single-molecule sequencing reads|journal=Nat. Biotechnol.|volume=30|issue=7|pages=693–700|doi=10.1038/nbt.2280|pmc=3707490|pmid=22750884}} In 2013, scientists estimated that long-read sequencing could be used to fully assemble and finish the majority of bacterial and archaeal genomes.{{cite journal|display-authors=3|vauthors=Koren S, Harhay GP, Smith TP, Bono JL, Harhay DM, Mcvey SD, Radune D, Bergman NH, Phillippy AM|date=2013|title=Reducing assembly complexity of microbial genomes with single-molecule sequencing|journal=Genome Biol.|volume=14|issue=9|pages=R101|arxiv=1304.3752|bibcode=2013arXiv1304.3752K|doi=10.1186/gb-2013-14-9-r101|pmc=4053942|pmid=24034426 |doi-access=free }}

The same DNA molecule can be resequenced independently by creating the circular DNA template and utilizing a strand displacing enzyme that separates the newly synthesized DNA strand from the template.{{cite journal|display-authors=3|vauthors=Smith CC, Wang Q, Chin CS, Salerno S, Damon LE, Levis MJ, Perl AE, Travers KJ, Wang S, Hunt JP, Zarrinkar PP, Schadt EE, Kasarskis A, Kuriyan J, Shah NP|date=2012|title=Validation of ITD mutations in FLT3 as a therapeutic target in human acute myeloid leukaemia|journal=Nature|volume=485|issue=7397|pages=260–3|bibcode=2012Natur.485..260S|doi=10.1038/nature11016|pmc=3390926|pmid=22504184}} In August 2012, scientists from the Broad Institute published an evaluation of SMRT sequencing for SNP calling.{{cite journal|display-authors=3|vauthors=Carneiro MO, Russ C, Ross MG, Gabriel SB, Nusbaum C, DePristo MA|date=2012|title=Pacific Biosciences Sequencing Technology for Genotyping and Variation Discovery in Human Data|journal=BMC Genom.|volume=13|issue=1|pages=375|doi=10.1186/1471-2164-13-375|pmc=3443046|pmid=22863213 |doi-access=free }}
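The benefit of reading the same circular molecule several times can be illustrated with a toy consensus step. This is a deliberately simplified sketch, not the algorithm used by any PacBio software: it assumes the passes are already aligned, have equal length, and contain only substitution errors, and it takes a plain majority vote at each position. The function name and the example reads are made up for illustration.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-position majority base across aligned, equal-length passes."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length in this toy model")
    consensus = []
    for column in zip(*passes):
        # most_common(1) returns the most frequent base in this column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical passes over the same template, each with at most one error.
passes = ["ACGTTAGC", "ACGTAAGC", "ACCTTAGC", "ACGTTAGC", "ACGTTTGC"]
print(majority_consensus(passes))  # ACGTTAGC
```

Because errors in different passes tend to fall at different positions, the vote recovers the template even though three of the five individual passes contain a mistake.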

The dynamics of polymerase can indicate whether a base is methylated.{{cite journal|display-authors=3|vauthors=Flusberg BA, Webster DR, Lee JH, Travers KJ, Olivares EC, Clark TA, Korlach J, Turner SW|date=2010|title=Direct detection of DNA methylation during single-molecule, real-time sequencing|journal=Nat. Methods|volume=7|issue=6|pages=461–5|doi=10.1038/nmeth.1459|pmc=2879396|pmid=20453866}} Scientists demonstrated the use of single-molecule real-time sequencing for detecting methylation and other base modifications.{{cite journal|display-authors=3|vauthors=Clark TA, Murray IA, Morgan RD, Kislyuk AO, Spittle KE, Boitano M, Fomenkov A, Roberts RJ, Korlach J|date=2012|title=Characterization of DNA Methyltransferase Specificities Using Single-Molecule, Real-Time DNA Sequencing|journal=Nucleic Acids Res.|volume=40|issue=4|pages=e29|doi=10.1093/nar/gkr1146|pmc=3287169|pmid=22156058}}{{cite journal|display-authors=3|vauthors=Song CX, Clark TA, Lu XY, Kislyuk A, Dai Q, Turner SW, He C, Korlach J|date=2011|title=Sensitive and Specific Single-Molecule Sequencing of 5-hydroxymethylcytosine|journal=Nat Methods|volume=9|issue=1|pages=75–7|doi=10.1038/nmeth.1779|pmc=3646335|pmid=22101853}}{{cite journal|display-authors=3|vauthors=Clark TA, Spittle KE, Turner SW, Korlach J|date=2011|title=Direct Detection and Sequencing of Damaged DNA Bases|journal=Genome Integr.|volume=2|issue=1|pages=10|doi=10.1186/2041-9414-2-10|pmc=3264494|pmid=22185597 |doi-access=free }} In 2012 a team of scientists used SMRT sequencing to generate the full methylomes of six bacteria.{{cite journal|display-authors=3|vauthors=Murray IA, Clark TA, Morgan RD, Boitano M, Anton BP, Luong K, Fomenkov A, Turner SW, Korlach J, Roberts RJ|date=2012|title=The Methylomes of Six Bacteria|journal=Nucleic Acids Res.|volume=40|issue=22|pages=11450–62|doi=10.1093/nar/gks891|pmc=3526280|pmid=23034806}} In November 2012, scientists published a report on genome-wide methylation of an outbreak strain of E. coli.{{cite journal|display-authors=3|vauthors=Fang G, Munera D, Friedman DI, Mandlik A, Chao MC, Banerjee O, Feng Z, Losic B, Mahajan MC, Jabado OJ, Deikus G, Clark TA, Luong K, Murray IA, Davis BM, Keren-Paz A, Chess A, Roberts RJ, Korlach J, Turner SW, Kumar V, Waldor MK, Schadt EE|date=2012|title=Genome-wide Mapping of Methylated Adenine Residues in Pathogenic Escherichia Coli Using Single-Molecule Real-Time Sequencing|journal=Nat. Biotechnol.|volume=30|issue=12|pages=1232–9|doi=10.1038/nbt.2432|pmc=3879109|pmid=23138224}}
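One way to picture how polymerase dynamics can reveal a modified base is to compare the time between successive fluorescence pulses (the interpulse duration, IPD) at each template position in a native sample and in an unmodified control. The sketch below is a toy illustration only: the IPD values are invented, the function names are hypothetical, and the fixed ratio threshold of 2.0 is an assumption rather than a published cutoff. Real analyses use statistical tests over many reads.

```python
from statistics import mean


def ipd_ratios(native_ipds, control_ipds):
    """Per-position ratio of mean IPD in the native sample to the control."""
    return [mean(n) / mean(c) for n, c in zip(native_ipds, control_ipds)]


def flag_positions(ratios, threshold=2.0):
    """Indices where the polymerase paused markedly longer than in the control."""
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Invented IPD measurements (seconds) for three template positions,
# three reads each. Position 1 is slowed in the native sample.
native = [[0.20, 0.25, 0.15], [0.90, 1.10, 1.00], [0.30, 0.30, 0.30]]
control = [[0.20, 0.20, 0.20], [0.25, 0.25, 0.25], [0.30, 0.30, 0.30]]

ratios = ipd_ratios(native, control)
flagged = flag_positions(ratios)
print(flagged)  # [1]
```

In this made-up data set the polymerase takes about four times longer at position 1 in the native sample, so that position is flagged as a candidate modified base.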

Long reads make it possible to sequence full gene isoforms, including the 5' and 3' ends. This type of sequencing is useful to capture isoforms and splice variants.{{cite journal|display-authors=3|vauthors=Sharon D, Tilgner H, Grubert F, Snyder M|date=2013|title=A Single-Molecule Long-Read Survey of the Human Transcriptome|journal=Nat. Biotechnol.|volume=31|issue=11|pages=1009–14|doi=10.1038/nbt.2705|pmc=4075632|pmid=24108091}}{{cite journal|display-authors=3|vauthors=Au KF, Sebastiano V, Afshar PT, Durruthy JD, Lee L, Williams BA, van Bakel H, Schadt EE, Reijo-Pera RA, Underwood JG, Wong WH|date=2013|title=Characterization of the human ESC transcriptome by hybrid sequencing|journal=PNAS|volume=110|issue=50|pages=E4821–30|bibcode=2013PNAS..110E4821A|doi=10.1073/pnas.1320101110|pmc=3864310|pmid=24282307|doi-access=free}}

SMRT sequencing has several applications in reproductive medical genetics research when investigating families with suspected parental gonadal mosaicism. Long reads enable haplotype phasing in patients to investigate parent-of-origin of mutations. Deep sequencing enables determination of allele frequencies in sperm cells, of relevance for estimation of recurrence risk for future affected offspring.{{cite journal|display-authors=3|vauthors=Ardui S, Ameur A, Vermeesch JR, Hestand MS|date=2018|title=Single Molecule Real-Time (SMRT) Sequencing Comes of Age: Applications and Utilities for Medical Diagnostics|journal=Nucleic Acids Res.|volume=46|issue=5|pages=2159–68|doi=10.1093/nar/gky066|pmc=5861413|pmid=29401301}}{{cite journal|display-authors=3|vauthors=Wilbe M, Gudmundsson S, Johansson J, Ameur A, Stattin EL, Annerén G, Malmgren H, Frykholm C, Bondeson ML|date=2017|title=A Novel Approach Using Long-Read Sequencing and ddPCR to Investigate Gonadal Mosaicism and Estimate Recurrence Risk in Two Families With Developmental Disorders|journal=Prenatal Diagnosis|volume=37|issue=11|pages=1146–54|doi=10.1002/pd.5156|pmc=5725701|pmid=28921562}}

References

{{reflist|colwidth=30em}}