Introduction to next-generation sequencing technology

 

Next-generation sequencing (NGS) refers to DNA and RNA sequencing technologies that offer higher throughput than earlier methods such as Sanger sequencing. NGS enables researchers to sequence many nucleic acid fragments in parallel, often to identify genetic variation associated with diseases or other biological phenomena.

 

NGS is often also described as high-throughput, massively parallel, or deep sequencing because it is used to determine the nucleotide sequences of millions of DNA or RNA fragments simultaneously to provide detailed insights into genetic information.

 

Relative to previous methods for DNA sequencing, such as Sanger sequencing, NGS has brought improved speed, throughput, and accuracy of sequencing to enable groundbreaking discoveries in biological research. The cost of performing NGS has decreased significantly over the past decade, and the use of automation to simplify the most laborious parts of the workflow has made NGS much easier to perform, making this important technology accessible to more laboratories worldwide for use in many applications. 

More recently, NGS testing has become widely adopted in many settings, and its use and utility are still increasing. NGS tests have helped change expectations around the insights that can be gained from the ability to detect specific DNA sequences and genetic variants, and further innovations and developments are likely to continue the adoption of NGS in additional areas of molecular biology.

 

This article provides an overview of the main steps in the NGS workflow, some of the primary application areas of NGS, the benefits of NGS testing relative to other methods, and concludes with some resources describing how to get started with performing NGS testing.


Overview of the NGS workflow steps

Step 1: Isolate and purify nucleic acid

Nucleic acid purification is critical for NGS workflows. The process generally begins with a lysis step to release DNA and RNA from cells and nuclei. The sample can then be combined with organic solvents to separate nucleic acids from proteins and other debris. Nucleic acids are then precipitated and resuspended in a buffer. Following extraction, the concentration of DNA and purity of the sample can then be measured with various methods. Nucleic acid purification can be performed manually or with automated liquid handling. Automated methods reduce human error and are well-suited for high-volume labs, while manual methods offer flexibility for smaller-scale or specialized applications.

Some considerations for isolating nucleic acids:

 

Manual vs automated purification: Manual purification involves a technician using kits and reagents to extract DNA and RNA, which requires wet-lab training and expertise. Automated systems can be used to streamline purification workflows, increasing throughput and reproducibility, making this approach beneficial to many labs.

 

Measure DNA/RNA quality: Quality assessment of DNA/RNA is crucial for sequencing accuracy. Spectrophotometry measures purity and concentration, while fluorometry provides more sensitive quantification. High-quality nucleic acids support reliable downstream applications.
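
As a rough illustration of the spectrophotometric check described above, the following Python sketch estimates concentration and purity from absorbance readings. It assumes the commonly used conversion factors (an A260 of 1.0 corresponds to about 50 ng/µL of double-stranded DNA or about 40 ng/µL of RNA at a 1 cm path length) and the rule of thumb that an A260/A280 ratio near 1.8 for DNA or 2.0 for RNA suggests low protein contamination. The function name, the tolerance, and the example readings are hypothetical and not taken from any specific instrument.

```python
# Conversion factors: ng/µL per A260 unit at a 1 cm path length (assumed).
CONVERSION_NG_PER_UL = {"dsDNA": 50.0, "RNA": 40.0}
# Rule-of-thumb A260/A280 ratios for samples with little protein carryover.
EXPECTED_RATIO = {"dsDNA": 1.8, "RNA": 2.0}


def assess_sample(a260, a280, kind="dsDNA", dilution_factor=1.0, tolerance=0.2):
    """Return (concentration in ng/µL, A260/A280 ratio, passes_purity)."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    concentration = a260 * CONVERSION_NG_PER_UL[kind] * dilution_factor
    ratio = a260 / a280
    # The tolerance is an illustrative threshold, not a validated cutoff.
    passes = abs(ratio - EXPECTED_RATIO[kind]) <= tolerance
    return concentration, ratio, passes


if __name__ == "__main__":
    conc, ratio, ok = assess_sample(a260=0.50, a280=0.27)
    print(f"{conc:.1f} ng/uL, A260/A280 = {ratio:.2f}, purity OK: {ok}")
```

Fluorometric quantification relies on a standard curve specific to the dye and instrument, so it is not modeled here.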

Step 2: Preparing an NGS library

Library preparation converts a nucleic acid sample into a form that can be sequenced. This process is often considered one of the most laborious and challenging parts of the NGS workflow. NGS library preparation can include steps such as amplifying target sequences, fragmentation, attaching adapters, selecting for specific fragment lengths, quantification, and pooling. Not every workflow needs all of these steps, and some need additional ones. For example, RNA-seq requires converting RNA into cDNA by reverse transcription. Regardless of the specific approach, proper library preparation is crucial for accurate and efficient sequencing, making it a key aspect of NGS workflows.

Some important steps involved in performing NGS library preparation include:

 

Amplify targets: For some types of NGS, like amplicon-based targeted sequencing, amplification is used to enrich for specific DNA/RNA sequences, helping ensure sufficient material for sequencing. Techniques like polymerase chain reaction (PCR) are commonly used to selectively amplify regions of interest.

 

Fragmentation: Many NGS workflows, such as whole genome sequencing, use a fragmentation step to break nucleic acids into smaller lengths for sequencing. This can be achieved through mechanical shearing (for example, sonication) or enzymatic digestion, creating fragments of a size suited to library preparation.

 

Adapter ligation: Adapter ligation attaches short DNA sequences (adapters) to the ends of fragmented DNA/RNA. These adapters are essential for binding to sequencing platforms and allow for the identification and amplification of fragments during sequencing.

 

Size selection: Size selection retains DNA/RNA fragments within a specific size range, helping improve sequencing efficiency and accuracy. Techniques like gel electrophoresis or bead-based methods are used to select appropriately sized fragments.

 

Library quantification: Library quantification measures the concentration of DNA/RNA fragments in the prepared library. Accurate quantification is crucial for optimal sequencing performance, often done using qPCR or fluorometric methods.

 

Library pooling: Library pooling combines multiple samples into a single sequencing run, improving throughput and cost-efficiency. Careful quantification and balancing of each library are necessary to obtain the desired amount of data for each sample in the sequencing output.

 

Manual vs automated library prep: Manual library preparation involves hands-on steps that can be time-consuming and prone to variability. Automated systems standardize the process, increasing efficiency, reproducibility, and throughput, making them suitable for high-volume sequencing projects.
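
The quantification and pooling steps above come down to simple arithmetic. The sketch below is a minimal Python example, assuming double-stranded DNA libraries, an average mass of about 660 g/mol per base pair, and a mean fragment length measured separately. The function names, sample names, and input values are hypothetical.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration (ng/µL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def equimolar_pool_volumes(libraries, fmol_per_library):
    """Return the volume (µL) of each library that adds the same number of
    femtomoles to the pool. Uses the identity 1 nM == 1 fmol/µL."""
    volumes = {}
    for name, (conc_ng_per_ul, mean_fragment_bp) in libraries.items():
        molarity = library_molarity_nm(conc_ng_per_ul, mean_fragment_bp)
        volumes[name] = fmol_per_library / molarity
    return volumes


if __name__ == "__main__":
    # Hypothetical libraries: (concentration in ng/µL, mean fragment length in bp)
    libraries = {"sample_A": (2.0, 400), "sample_B": (4.0, 400)}
    for name, volume in equimolar_pool_volumes(libraries, 15.0).items():
        print(f"{name}: {volume:.2f} uL")
```

Because sample_B is twice as concentrated as sample_A at the same fragment length, half the volume contributes the same number of molecules. Real protocols typically also dilute libraries to a working concentration first, which this sketch leaves out.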

Step 3: Sequencing

NGS instruments transform nanograms of DNA or RNA into gigabytes of data. The process often begins with run planning, which prepares the instrument for sequencing and sets parameters such as fragment length and sample multiplexing. The DNA library is then attached to a solid substrate such as a bead or flow cell and loaded onto the instrument along with the necessary reagents and solutions. The instrument then reads the nucleotide sequence, often by incorporating labeled nucleotides in a process known as sequencing by synthesis. Base calling converts the raw signal into sequences known as reads, and a file is created that contains all reads along with quality metrics that reflect sequencing confidence. Sequencing statistics are also provided to summarize the run performance. The output file can then be used for downstream analysis.

Some key steps in the sequencing process include:

 

Run planning: Run planning involves setting up sequencing parameters, including sample loading, sequencing chemistry, and run duration. Proper planning helps support optimal data quality and efficient use of sequencing resources.

 

Clonal amplification: Clonal amplification generates multiple copies of each DNA fragment, creating clusters or colonies. This step is essential for signal detection during sequencing and is typically achieved through emulsion PCR or bridge amplification.

 

Sequencing by synthesis: Sequencing by synthesis reads the nucleotide sequence of DNA fragments by incorporating labeled nucleotides into an anchored template. Each incorporation event is detected, allowing the determination of the DNA sequence. Other approaches can also be used for NGS, such as nanopore sequencing, single-molecule real-time sequencing (SMRT), and sequencing by ligation.

 

Base calling and data output: Base calling converts raw sequencing data into nucleotide sequences, generating FASTQ files that include quality metrics. This step is crucial for assessing the accuracy and reliability of the sequencing run, providing essential information for downstream analysis.
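
To make the FASTQ output more concrete, here is a minimal Python sketch that reads four-line FASTQ records and decodes the per-base quality string. It assumes the common Phred+33 encoding, in which each quality character's ASCII code minus 33 gives a score Q, and Q relates to the estimated error probability as P = 10^(-Q/10). The function names and the example record are hypothetical, and multi-line or compressed FASTQ files are not handled.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) ends parsing
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if len(sequence) != len(quality):
            raise ValueError(f"length mismatch in record {header!r}")
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    example = io.StringIO("@read_1\nACGT\n+\nII5!\n")
    for read_id, seq, qual in read_fastq(example):
        print(read_id, seq, phred_scores(qual))
```

A score of 20 corresponds to an estimated 1 in 100 chance of an incorrect base call, and a score of 30 to 1 in 1,000.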

Step 4: Analyzing NGS data

NGS data analysis, often broadly referred to as bioinformatics, comprises a series of essential computational steps that are designed to generate accurate and meaningful results from DNA sequencing data. Some of these steps include demultiplexing, quality filtering, alignment to a reference genome, and variant calling. Although the details vary widely across applications, NGS analysis can be integrated into a single, end-to-end pipeline that can be executed in an automated manner. Downstream analytical and graphical tools enable data interpretation. Reporting components then compile results, metrics, and visualizations into comprehensive outputs that can be compared across several sequencing runs.

Some important steps in NGS data analysis include:

 

Demultiplexing: Demultiplexing separates sequences based on unique barcode identifiers. This step processes the raw FASTQ data that includes all pooled libraries, assigns each read to its sample, and writes a new set of files, one per sample, enabling accurate downstream analysis.

 

Quality filtering: Quality filtering removes low-quality reads and sequences with errors, helping ensure that only high-quality data are used for analysis. This step improves the reliability and accuracy of the final results.

 

Alignment to reference: Alignment maps sequencing reads to a reference genome, producing sequence alignment/map (SAM) and binary alignment/map (BAM) files. This step determines the genomic location of sequences, which makes it possible to quantify read depth across regions of interest and to reliably detect genetic variation.

 

Variant calling: Variant calling identifies genetic variants such as single nucleotide polymorphisms (SNPs), insertions, deletions, and copy number variations (CNVs) from aligned sequences, generating variant call format (VCF) files. This step is crucial for understanding genetic differences and their potential implications.

 

Results visualization: Results visualization uses various tools to graphically represent sequencing data, making it easier to interpret and analyze. Visualization aids in identifying patterns, anomalies, and insights within the data.

 

Analysis and reporting tools: Analysis and reporting tools compile sequencing data into comprehensive reports, providing detailed insights and interpretations. These tools are essential for communicating findings and making informed decisions based on the sequencing results.
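
The first two analysis steps listed above, demultiplexing and quality filtering, can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular pipeline works: it assumes an inline barcode at the start of each read, exact barcode matching, Phred+33 quality strings, and a mean-quality threshold. Production tools often read barcodes from separate index reads and tolerate mismatches. All names, barcodes, and thresholds here are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group (read_id, sequence, quality) tuples by the barcode found at the
    start of each read, trimming the barcode from sequence and quality."""
    by_sample = defaultdict(list)
    for read_id, sequence, quality in reads:
        barcode = sequence[:barcode_length]
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(
            (read_id, sequence[barcode_length:], quality[barcode_length:])
        )
    return dict(by_sample)


def mean_quality(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def quality_filter(reads, min_mean_quality=20.0):
    """Keep non-empty reads whose mean Phred score meets the threshold."""
    return [r for r in reads if r[2] and mean_quality(r[2]) >= min_mean_quality]


if __name__ == "__main__":
    reads = [
        ("r1", "ACGTTTGA", "IIIIIIII"),
        ("r2", "TGCAGGCC", "IIII!!!!"),
        ("r3", "GGGGAAAA", "IIIIIIII"),
    ]
    samples = demultiplex(reads, {"ACGT": "sample_1", "TGCA": "sample_2"}, 4)
    for name, sample_reads in samples.items():
        kept = quality_filter(sample_reads)
        print(name, len(sample_reads), "reads,", len(kept), "pass filter")
```

Alignment and variant calling are normally delegated to dedicated tools, so they are not sketched here.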


Next-generation sequencing testing applications

Next-generation sequencing has revolutionized various scientific fields with its ability to rapidly and accurately sequence DNA and RNA. Some key application areas that have been accelerated through NGS include:

 

Cancer research

NGS helps with identification of mutations, copy number variations, gene fusions, and other genetic abnormalities important for oncology research from solid tumors, liquid biopsies, and other biological material.

Reproductive health research

Next-generation sequencing supports reproductive health research by enabling the analysis of genetic variation relevant to fertility and inherited conditions across diverse populations.

Infectious disease and microbiology research 

NGS can be used to determine the species-specific identity of viruses, bacteria, and fungi from various sources, as well as to investigate their evolution in hosts or other environments.

Inherited disease research

NGS-based research helps researchers understand links between genomic variants and hereditary disorders or familial syndromes.

Other NGS testing applications:

  • Pharmacogenomics: studies using NGS help scientists understand how genetic variation influences individual drug response and metabolism
  • Agricultural research: NGS can be used for trait identification and selective breeding research to enhance outcomes such as yield, pathogen resistance, and drought tolerance
  • Environmental research: NGS is accelerating the analysis of genetic diversity within ecosystems, as well as of the roles and interactions of microorganisms
  • Forensic research: NGS-based genetic analysis is being used in research studies related to identity testing, criminal investigations, and ancestry determination

Targeted next-generation sequencing approach

Multiple NGS approaches have been developed for different purposes, including whole genome sequencing (WGS) and targeted next-generation sequencing. Targeted sequencing is a focused and cost-effective NGS approach that is used to analyze specific regions of the genome. Targeted NGS offers several advantages over WGS, and in many cases is the desired NGS approach.

 

Key benefits include:

  • Improved sequencing coverage: enhances accuracy by concentrating sequencing depth on selected genomic regions
  • Cost-effective: often reduces overall sequencing costs compared to whole genome sequencing
  • Simplified analysis: eases interpretation and data management
  • Specific applications: well-suited for studying genes of interest or chromosomal regions that are associated with diseases
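
The coverage benefit listed above can be illustrated with a back-of-the-envelope calculation: average depth is roughly the number of usable sequenced bases divided by the size of the region being sequenced. The Python sketch below assumes a human genome of about 3.1 billion bases, a hypothetical 1 Mb targeted panel, and a hypothetical on-target fraction of 70%. The function name and every input value are illustrative only.

```python
def mean_depth(num_reads, read_length_bp, target_size_bp, on_target_fraction=1.0):
    """Approximate average depth: usable sequenced bases / region size."""
    if target_size_bp <= 0:
        raise ValueError("target_size_bp must be positive")
    usable_bases = num_reads * read_length_bp * on_target_fraction
    return usable_bases / target_size_bp


if __name__ == "__main__":
    reads, read_length = 10_000_000, 150  # same sequencing output in both cases
    wgs = mean_depth(reads, read_length, 3_100_000_000)  # approx. human genome
    panel = mean_depth(reads, read_length, 1_000_000, on_target_fraction=0.7)
    print(f"Whole genome: {wgs:.2f}x, targeted panel: {panel:.0f}x")
```

Under these assumptions, the same 10 million reads give well under 1x average depth across a whole human genome but on the order of 1,000x across the panel. This simple average ignores uneven coverage and duplicate reads.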

What are the benefits of NGS testing?

NGS has enabled new applications in DNA and RNA sequencing by offering high throughput, speed, and accuracy to accelerate research timelines. This technology can offer valuable insights for understanding complex biological mechanisms. Its ability to process large numbers of samples in a single run enables cost-efficient studies that explore genetic diversity at scale.

 

High-resolution sequencing and advanced error correction help to reduce false results and strengthen data reliability. Together, these advantages make NGS a powerful and accessible technology that drives discovery and advances genomic research across a wide range of scientific disciplines.


How to implement next-generation sequencing in your laboratory

Now that you’ve read about what NGS is and some of the main application areas being advanced by NGS, you might be wondering about the steps involved in getting up and running with NGS in your laboratory. Watch this video to learn how easy it is to set up your own NGS lab at your institution.

 

In this video, you will learn:

  • The requirements for laboratory space and environmental conditions to support optimal NGS testing
  • The importance of creating a physically separate space for pre- and post-amplification processes in the lab to prevent contamination, as well as the need for cleanliness
  • The laboratory equipment and instruments needed to support the NGS workflow
  • The laboratory personnel requirements to run NGS testing in-house
  • The need for a quality management system (QMS) to govern lab operations, including standard operating procedures (SOPs), calibration and maintenance of equipment, personnel training, and instrument performance qualification