Global Growth of Long Read Sequencing Reveals the Complete Picture

Author : Ashwini cmi | Published On : 22 Apr 2024

Emergence of Third Generation Sequencing Technologies

Third generation sequencing platforms allowed scientists to sequence DNA strands hundreds to tens of thousands of bases in length, far surpassing the read lengths of conventional short read second generation technologies. Early platforms took two different routes to observing single DNA molecules. Pacific Biosciences (PacBio) Single Molecule Real Time (SMRT) sequencing watches a polymerase incorporate fluorescently labeled nucleotides in real time, while Oxford Nanopore sequencing measures changes in electrical current as a DNA strand passes through a protein nanopore. Both approaches read single molecules of DNA in one continuous pass without needing to clone or amplify the DNA.

Adoption of Long Read Technologies Gains Momentum

Initially hampered by relatively high error rates and lower throughput compared to short read technologies, long read sequencing saw early adoption among plant and microbial genomics communities, where longer continuous reads were invaluable for assembling complex genomes. As error rates dropped and throughput increased, long reads saw more widespread use. Applications included resolving complex regions of mammalian genomes, phasing genetic variants, detecting epigenetic modifications and identifying isoforms from full length transcripts. By the late 2010s, both PacBio and Oxford Nanopore had delivered refreshed platforms with improved throughput, read lengths and accuracy. This ignited more mainstream adoption across human, cancer and disease genomics.

Complete Genome Assembly with Long Reads

One area that has greatly benefited is complete de novo genome assembly. Early bacterial and small eukaryotic genomes could be readily assembled from short reads alone. However, assembling complex mammalian and plant genomes with many repeats and structural variations posed major challenges, because a read shorter than a repeat cannot show which unique sequences flank it. With long reads, sequence contigs could span the repetitive regions that formerly fragmented assemblies. Early applications included completing reference genome sequences such as that of the giant panda and improving existing human references such as the Chinese genomes.
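To make the repeat-spanning argument concrete, here is a minimal Python sketch. It is an illustration only, not how any particular assembler works: the `Interval` type, the `spans_repeat` function, the 500-base anchor requirement and all coordinates are assumptions chosen for the example. A read is treated as resolving a repeat only if it covers the whole repeat plus some unique sequence on both sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def spans_repeat(read: Interval, repeat: Interval, min_anchor: int = 500) -> bool:
    """Return True if the read covers the entire repeat plus at least
    min_anchor unique bases on each side (an assumed threshold)."""
    return (read.start <= repeat.start - min_anchor
            and read.end >= repeat.end + min_anchor)


# Hypothetical coordinates for illustration only.
repeat = Interval(10_000, 16_000)      # a 6 kb repeat
short_read = Interval(12_000, 12_150)  # a 150 bp read inside the repeat
long_read = Interval(8_000, 23_000)    # a 15 kb read crossing the repeat

short_ok = spans_repeat(short_read, repeat)
long_ok = spans_repeat(long_read, repeat)
print(short_ok, long_ok)
```

In this toy case the short read falls entirely inside the repeat and cannot link the flanking sequences, while the long read anchors in unique sequence on both sides.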

Genome assemblies have now been completed for many medically and economically relevant species using long reads alone or in combination with other data types. Examples include cassava, papaya, loblolly pine, Atlantic salmon and other species that are economically important for food and bioproducts. In human genomics, improved reference-grade assemblies enable more precise genetic analysis and shed light on structural variants linked to disease. They also aid phylogenetic studies by producing chromosome-level assemblies that help researchers better understand hominin evolution.

Clinical Applications of Long Reads Emerge

Long reads are now finding clinical applications, particularly for resolving small but complex regions of the genome. Human disease genomes can be resolved more completely to identify pathogenic structural variants missed by short reads, which supports the diagnosis of genetic disorders and rare diseases. Long reads are also enabling clinical whole genome sequencing through variant phasing. When a single long read covers two variants, it shows directly whether they lie on the same copy of a gene or on opposite copies, which is how compound heterozygous alleles causing recessive diseases are identified.
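The read-based phasing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a clinical method or the algorithm of any specific phasing tool: the function name `phase_two_variants`, the `min_reads` threshold, the simple majority vote and the example reads are all hypothetical. Each read that spans both variant sites is reduced to the pair of alleles it carries.

```python
from collections import Counter


def phase_two_variants(reads, min_reads=3):
    """Classify two heterozygous variants as 'cis', 'trans' or 'undetermined'.

    reads: list of (allele_at_site_1, allele_at_site_2) tuples, each allele
    being 'ref' or 'alt', taken only from reads that span both sites.
    Uses a simple majority vote, which is an assumption of this sketch.
    """
    counts = Counter(reads)
    # Cis: both alt alleles share one haplotype, so reads show alt/alt or ref/ref.
    cis = counts[("alt", "alt")] + counts[("ref", "ref")]
    # Trans: the alt alleles sit on opposite haplotypes.
    trans = counts[("alt", "ref")] + counts[("ref", "alt")]
    if cis + trans < min_reads or cis == trans:
        return "undetermined"
    return "trans" if trans > cis else "cis"


# Hypothetical reads: nine support trans, one (likely an error) supports cis.
reads = [("alt", "ref")] * 4 + [("ref", "alt")] * 5 + [("alt", "alt")]
result = phase_two_variants(reads)
print(result)
```

A "trans" call for two pathogenic variants in the same gene is the pattern consistent with compound heterozygosity. Real pipelines also weigh base quality, mapping quality and sequencing error models, which this sketch omits.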

Long reads are used in cancer genomics to detect fusion genes and structural changes that drive oncogenesis. They can resolve complex cancer genomes to better understand clonal evolution, metastasis and therapeutic resistance. Emerging applications include analyzing circulating tumor DNA (ctDNA) in liquid biopsies for non-invasive monitoring of disease burden and relapse. Looking ahead, clinical applications are likely to expand as costs decline and regulatory frameworks are established to realize the benefits of complete genomic information for precision medicine.

Global Long Read Sequencing Initiatives

Many national and international projects utilize long read technologies to generate reference-quality genomes. Examples include population genome programs such as GenomeAsia 100K, Genome Canada, the UK10K initiative and the National Institutes of Health (NIH) All of Us Research Program. These large scale efforts aim to improve human reference frameworks, map genetic diversity and discover disease-linked variants.

International consortia focused on agriculturally and environmentally important species are producing high quality genome references for plants, livestock and marine organisms. These include cassava and banana from global food security programs like CGIAR, Atlantic salmon from industry-government partnerships and loblolly pine as a model plant. Microbial reference genomes play a key role in infectious disease surveillance. Long reads enable the assembly of pathogen genomes to track pandemics and antimicrobial resistance (AMR).

Future Outlook for Long Read Technologies

Technological innovations continue to improve long read platforms. New platforms promise single-molecule read lengths of 100 kb and beyond, and these gains, coupled with cost reductions, will make third generation sequencing increasingly mainstream. Analytical methods are advancing to fully leverage multi-omics long read data for phasing, epigenomics and isoform decoding. Widespread deployment in clinical and biomedical research promises to advance precision medicine by resolving complete disease genomes at scale. Looking ahead, third generation sequencing will yield new insights by providing a far more complete view of genomes across all domains of life.

 
