Run diverse Sanger sequencing and fragment analysis applications with the SeqStudio Genetic Analyzer
Designed for use by research assistants and scientists, the SeqStudio Genetic Analyzer is a low-throughput, easy-to-use, and convenient benchtop system. The features of the SeqStudio Genetic Analyzer make running capillary electrophoresis (CE) experiments easier with minimal hands-on time due to an all-in-one cartridge, facilitate collaboration through Thermo Fisher Cloud–based sharing and applications, and introduce new opportunities to run both sequencing and fragment analysis samples at one time.
Highlights of several powerful Sanger sequencing and fragment analysis applications:
Genome editing technologies, including CRISPR-Cas9–mediated editing events, are rapidly becoming accessible to a majority of biological science researchers, and are poised to revolutionize all fields of biology and health care. Thermo Fisher Scientific offers all the tools necessary for a genome editing project. As an integral part of such a project, the features of the SeqStudio Genetic Analyzer facilitate Sanger sequencing analyses and fit well within a genome editing workflow. In particular, the data generated are compatible with Tracking of Indels by Decomposition (TIDE) software, a widely available tool for analyzing the efficiency of genome editing events.
The utility of the SeqStudio Genetic Analyzer in a genome editing project was shown by obtaining whole-cell lysates from HEK293 cells that were edited to introduce random deletions around a targeted site in the HPRT or the relA locus. To confirm the position of the edit, the Sanger sequencing traces were uploaded to the cloud and analyzed using the Sanger Variant Analysis module (Figure 1). Note that the position of the edit is clearly indicated and can be visualized by the abundant mixed-base peaks downstream of the break. The efficiency of the edits in this mixed primary culture was determined by analyzing these trace files using the TIDE software. In each case, the spectrum and frequencies of deletions at each locus were nearly identical using the data generated in the forward and reverse directions (Figure 2). These frequencies agree with results obtained by Invitrogen TOPO cloning followed by Sanger sequencing of the same edited cell populations.
Figure 2. Analysis of two different genome editing events at the HPRT and relA loci using TIDE software and mixed population sequencing traces generated by the SeqStudio instrument. The bars show the proportion of the population having the indicated number of nucleotides deleted or inserted. For (A) HPRT, the overall efficiency of the edit was around 80%, whereas the overall efficiency at the (B) relA locus was around 20%.
The presence or absence of methylation at carbon position 5 in the cytosine base (5-meC) of a CpG dinucleotide motif is a key epigenetic regulator of cellular and developmental processes in higher eukaryotes. As a result, there is great interest in analyzing changes in DNA methylation patterns associated with cancer and other pathologies. This has led to the discovery of many genomic regions and specific loci where the presence or absence of CpG methylation influences cell and tissue physiology. However, retrospective studies using archived formalin-fixed, paraffin-embedded (FFPE) samples are hampered by the degraded state of nucleic acids extracted from these samples. A DNA methylation analysis method that is compatible with FFPE tissues could open up avenues of research using this rich resource of retrospective samples.
Sanger sequencing has been the gold standard for 5-meC detection for over 25 years and allows the detection of 5-meC at multiple sites along a short genomic DNA (gDNA) region of up to 400 nucleotides. The most widely used method to detect 5-meC in DNA is bisulfite treatment of gDNA. The Methyl-Seq Direct workflow is a novel method for bisulfite sequencing that uses the Applied Biosystems BigDye Direct (BDD) Cycle Sequencing Kit and the BigDye XTerminator for sequencing and purification. The entire workflow, including bisulfite conversion, PCR amplification, and DNA sequencing, takes about 7 hours, including about an hour of hands-on time. The workflow is compatible with FFPE samples, enabling the use of archived samples for retrospective studies in oncology or other pathologies.
To demonstrate the feasibility of the protocol with DNA extracted from FFPE tissue, two pairs of matched normal and tumor tissue samples were processed. Figure 1 shows that discrete changes are apparent in the methylation pattern between normal and tumor tissue.
In this application note we demonstrate the feasibility of a direct PCR sequencing workflow for methylation analysis of bisulfite-converted DNA from cell lines and FFPE samples using reagents, instrumentation, and software available from Thermo Fisher Scientific. The SeqStudio Genetic Analyzer, an affordable instrument capable of both Sanger sequencing and fragment analysis, enables rapid results for follow-up methylation research studies. This opens up the possibility of using such samples for retrospective studies.
The study of development of human diseases relies heavily on the analysis of dissociated human cell lines grown in culture. However, an increasingly acknowledged problem is that cells grown in vitro can be misidentified or contaminated with other unrelated cell lines. The identity of cell lines can be verified by analysis of a highly specific genetic “fingerprint” of highly variable short tandem repeats (STRs). The SeqStudio platform integrates well with the Thermo Fisher Scientific cell line authentication solution. The Applied Biosystems Identifiler Plus and Identifiler Direct kits can be used on purified and crude DNA preparations, respectively, for analyzing 16 highly variable human STR loci commonly used for verifying cell line authenticity. The Applied Biosystems GeneMapper Software, used for analyzing alleles identified by Identifiler kits, is compatible with data produced by the SeqStudio instrument, and the results can be used to query ATCC or other STR databases to verify authenticity.
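As a rough illustration of how an STR profile might be compared against a reference database entry, the sketch below computes a shared-allele score (twice the number of shared alleles divided by the total alleles in both profiles, a formula often called the Tanabe score). The function name `str_match_percent` and the allele values are hypothetical and made up for the example; this is not a description of how GeneMapper Software or any STR database scores a match.

```python
def str_match_percent(query, reference):
    """Shared-allele score: 200 * shared / (alleles in query + alleles in reference).

    Both arguments map a locus name to a set of allele calls.
    Only loci present in both profiles are compared.
    """
    shared = total_query = total_reference = 0
    for locus in set(query) & set(reference):
        q_alleles, r_alleles = set(query[locus]), set(reference[locus])
        shared += len(q_alleles & r_alleles)
        total_query += len(q_alleles)
        total_reference += len(r_alleles)
    if total_query + total_reference == 0:
        raise ValueError("profiles have no loci in common")
    return 200.0 * shared / (total_query + total_reference)


# Made-up allele calls for illustration only (not a real cell line profile).
query = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "12"}}
reference = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "11"}}
score = str_match_percent(query, reference)
print(f"Match: {score:.0f}%")
```

Any threshold for calling two profiles "the same line" would have to come from the authentication guideline being followed; none is assumed here.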
Case Study: Matching identities of iPSCs and donors using CLA Identifiler STR profiling kits
Application Note: Authenticating human cell lines using the Identifiler kits and capillary electrophoresis platforms
Webinar: A Complete Workflow for Human Cell Line Authentication
To demonstrate the utility of the SeqStudio instrument in a cell line authentication workflow, allelic information on STRs was obtained from five different, commonly used human cell lines. The identity of the cell lines was confirmed even with as little as 300 pg of gDNA. To show the ability to detect contaminating cells, a population of M4A4GFP cells was spiked with varying amounts of HeLa cells and analyzed using the Identifiler Direct kit. HeLa cell–specific alleles could be detected even when HeLa cells made up only 10% of the population (Figure 1). Therefore, when coupled with the Identifiler kits, the SeqStudio instrument can be a central component for a cell line authentication solution.
Figure 1. Analysis of cell line contamination on the SeqStudio instrument. HeLa cells and M4A4GFP cell suspensions were diluted to 5 × 10⁵ cells/mL, mixed in the indicated proportions, and spotted onto a NUCLEIC-CARD Sample Collection Device. Contaminating HeLa cells can be detected with high confidence on the SeqStudio instrument if they make up approximately 20% of a population; however, some alleles unique to HeLa can be detected if they make up as little as 10% of a population.
The SeqStudio Genetic Analyzer can be used by clinical researchers to maintain the gold-standard quality for detecting and verifying the presence of mutant alleles in tumor tissue. The SeqStudio system integrates with the following tools to simplify Sanger sequencing workflows:
- The SeqStudio Genetic Analyzer comes preloaded with running modules optimized for fragmented DNA extracted from formalin-fixed, paraffin-embedded tissue.
- The cloud-based NGC module allows investigators to rapidly verify variants identified in next-generation sequencing (NGS) .vcf files using Sanger sequencing traces.
- Allelic variants at frequencies down to 5% can be detected using the Applied Biosystems Minor Variant Finder (MVF) Software and Sanger traces generated by the SeqStudio instrument.
- Applied Biosystems BigDye Direct and BigDye XTerminator chemistries simplify the Sanger sequencing workflow by providing one-tube sequencing and clean-up.
The performance of the SeqStudio Genetic Analyzer for detecting mutant alleles in tumor samples was determined by analyzing genomic DNA extracted from 10 different FFPE tumor samples, and determining variant frequencies at 4 different hotspot regions. The frequency of mutant alleles was determined by NGS using the Ion Torrent Oncomine Oncology Focus Panel, and by Sanger sequencing using BigDye Direct/BigDye XTerminator chemistries and MVF Software. The frequencies measured by the SeqStudio Genetic Analyzer correlated closely with those measured by NGS at allele frequencies from about 9% to about 70% (Figure 1).
The ability of the SeqStudio Genetic Analyzer to analyze variant frequencies was also determined using a 96-well plate containing Sanger sequencing primers that query the most common tumorigenic mutations in KRAS and NRAS. The minor allele frequency analysis of SeqStudio instrument traces accurately measured the allele frequencies in 1 ng of diluted FFPE-extracted DNA (Figure 2A). Therefore, researchers needing to detect rare alleles can be confident that the SeqStudio Genetic Analyzer will produce accurate results on FFPE tissues.
Finally, the cloud-based NGC application simplifies the confirmation of variants identified by NGS by organizing Sanger sequencing traces by amplicons and specimens, and aligning them in the proper orientation to the candidate variant sequences in a .vcf file. To show the utility of the NGC app in an oncology workflow, we confirmed the presence of an NRAS mutation identified using the Oncomine Oncology Focus panel by Sanger sequencing (Figure 2B). The SeqStudio results verified that the mutation in NRAS (p.Ala59Thr) was present. Therefore, focused and rapid examination of the most meaningful portions of sequencing traces by the NGC app facilitates NGS variant confirmation.
Figure 2. Analysis and confirmation of variants by SeqStudio Genetic Analyzer and the NGC application, respectively. (A) Eight different FFPE samples with mutations at known RAS hotspots were diluted to 5% allele frequency, then analyzed using a 96-well plate containing Sanger sequencing primers that query the most common tumorigenic mutations in KRAS and NRAS, and using the SeqStudio Genetic Analyzer. Each of the allele queries accurately measured the allele frequencies; deviations from 5% reflected slight inconsistencies in starting concentration of the samples. Yellow line is 5% frequency. Similar results were seen with 10% and 50% dilutions. (B) Confirmation of variants identified by NGS. From a .vcf file generated using Ion Reporter Software, Sanger sequencing primers targeting loci of interest were ordered from Primer Designer, samples were sequenced on the SeqStudio instrument, and variants common to the .vcf file and the Sanger sequencing traces were highlighted using the NGC cloud app.
The ready availability of genomic data opens the opportunity to identify species in an unknown sample by sequencing DNA of “fingerprint” loci. The Applied Biosystems family of kits, for example, the MicroSEQ kit, has simplified the identification of prokaryotes and fungi by Sanger sequencing ribosomal DNA (rDNA) sequences. Similarly, eukaryotic organisms can be identified using the mitochondrial-specific loci as the identifying locus. This strategy has been exploited in the Barcode of Life project (barcodeoflife.org), providing a means for rapidly establishing the identity of unknown eukaryotic samples.
To illustrate the performance of the SeqStudio Genetic Analyzer for microbial identification, we obtained genomic DNA samples from ATCC for a variety of microorganisms, and sequenced them using the Applied Biosystems MicroSEQ 500 PCR kit and the SeqStudio instrument. The resulting sequences were queried against the BLAST database. For each sequencing reaction, the correct organism was identified with the highest BLAST confidence. Similarly, using primers for fish mitochondrial sequences (CO1 gene) and fish samples, the fish species was correctly identified as the top BLAST hit. The accurate identification of the species queried with BLAST illustrates how well the SeqStudio platform can be used for species identification.
|Number of organisms||Number of queries||Percent correct|
Table 1. Analysis of species ID using the SeqStudio Genetic Analyzer. Samples of microorganism DNA or genomic DNA extracted from fish were sequenced using primers for 16S rDNA and the MicroSEQ kit (BigDye Terminator v1.1 chemistry), or using primers for fish mitochondrial CO1 sequences and BigDye Terminator v3.1 chemistry.
Genomic instability is a hallmark of many cancer types and is often caused by dysregulation of DNA mismatch repair (MMR) enzymes. As there are at least 11 different loci involved in MMR, looking for an inactivating event in all of them can be complicated, time consuming, and expensive. Furthermore, MMR analysis is done by immunohistochemistry (IHC), which is semiquantitative at best and relies on subjective interpretation that can vary from person to person. It is therefore more advantageous to look for the functional outcome of dysregulated MMR, namely microsatellite instability (MSI). This can be done by molecular MSI assays, which use PCR and fragment analysis.
We developed the TrueMark MSI Assay which includes an expanded 13-microsatellite marker panel, as well as 2 highly variable short tandem repeat (STR) sequences that can be used to track sample identity. The analysis of 13 loci across 5 fluorescence channels allows for an increase in the diversity of sequences that can be used to determine MSI status. The assay has been optimized for use with the SeqStudio and 3500 Series Genetic Analyzers. In addition, we developed the TrueMark MSI Analysis Software for analyzing the assay that does not require side-by-side analysis of normal, non-tumor tissue and offers automated calling.
To test the performance of the TrueMark MSI Assay on diverse tumor types, we obtained 317 colorectal (CRC), gastric, endometrial, and other FFPE tumor research samples from commercial sources. gDNA was extracted and analyzed using the protocol supplied with the TrueMark MSI Assay on the SeqStudio and 3500xL instruments. The TrueMark MSI Assay was able to classify the tumors into either stable (MSS), low (MSI-L), or high (MSI-H) levels of MSI (Table 1). The number of MSI-H samples seen in the CRC, gastric, and endometrial tumors was similar to published results. In addition, our ability to detect MSI in other tumor types, including breast, abdominal, and lung tumors, indicates the usefulness of the TrueMark MSI Assay beyond the most generally recognized MSI events.
|Tumor type||Number of samples||MSS||MSI-L||MSI-H|
|Colon||73||47 (64%)||0||26 (36%)|
|Gastric||50||41 (82%)||2 (4%)||7 (14%)|
|Endometrial||173||136 (79%)||14 (8%)||23 (13%)|
|Other||21||17 (81%)||1 (5%)||3 (14%)|
Table 1. Analysis of MSI in various tumor types. DNA from FFPE slices mounted on slides from the indicated tumor samples and adjacent normal tissue was extracted and analyzed using the TrueMark MSI Assay and TrueMark MSI Analysis Software. The TrueMark MSI Analysis Software classified the tumors as MSS if there were no loci that showed instability, MSI-L if there were 1–4 loci showing instability, or MSI-H if there were more than 4 loci showing instability.
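The classification rule stated in the Table 1 caption can be sketched as a small function. This is only an illustration of the thresholds given above (0 unstable loci, 1–4, more than 4), assuming the 13-marker panel; the function name `classify_msi` is hypothetical and this is not the TrueMark MSI Analysis Software's implementation.

```python
def classify_msi(unstable_loci: int, total_loci: int = 13) -> str:
    """Classify a tumor sample from the number of unstable microsatellite loci.

    Thresholds follow the Table 1 caption: MSS for 0 unstable loci,
    MSI-L for 1-4, MSI-H for more than 4. total_loci defaults to the
    13-marker panel described in the text.
    """
    if not 0 <= unstable_loci <= total_loci:
        raise ValueError("unstable_loci must be between 0 and total_loci")
    if unstable_loci == 0:
        return "MSS"
    if unstable_loci <= 4:
        return "MSI-L"
    return "MSI-H"


for n in (0, 2, 4, 5, 13):
    print(n, classify_msi(n))
```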
One widely used method for studying inherited human diseases arising from variations in copy number of a locus is multiplex ligation–dependent probe amplification (MLPA). This method, developed and commercialized by MRC Holland, can analyze up to 50 multiplexed pairs of adjacently located probes hybridizing to the loci of interest. The high dynamic range, sizing precision, and peak-height fidelity necessary for analyzing MLPA probe amplicons make the SeqStudio system an ideal platform for performing MLPA analyses. Results obtained on the SeqStudio instrument are compatible with MRC Holland’s Coffalyser.Net software for analyzing MLPA data.
MLPA on the SeqStudio instrument was used to analyze a DNA sample from a proband who is known to carry a duplication of exons 2–30 in the Duchenne muscular dystrophy (DMD) gene and a normal sample using the P034 DMD assay set from MRC Holland. The peak heights and relative sizes of these samples can readily be translated into an accurate detection of the region containing the duplication (Figure 1). Similar results were obtained using probes for large and small deletions. Therefore, the SeqStudio instrument can be an integrated tool for MLPA investigations of regions containing copy number variations (CNVs).
One of the most common applications of Sanger sequencing is the analysis of inserts subcloned into plasmids. Applied Biosystems BigDye chemistries are widely used for Sanger sequencing and an integral part of plasmid sequencing workflows. Several of the new features on the SeqStudio platform offer benefits to researchers performing basic plasmid sequencing methods. The instrument is preloaded with sequencing modules optimized for short (<300 bp), medium (500 bp), and long (>600 bp) read lengths, and can also be customized on the instrument to meet specific needs. The swappable cartridges can be associated with individual projects and users. The cloud-based Sanger Quality Check application provides an intuitive set of tools to analyze sequencing traces. Finally, the cloud connectivity for remote monitoring, accessing, and sharing sequencing information can help collaborators rapidly analyze the same data sets.
The performance of the SeqStudio instrument for plasmid sequencing was determined by sequencing the pGEM7zf+ plasmid with M13 primers and Applied Biosystems BigDye Terminator v3.1 chemistry. Results were obtained by analyzing the sequencing traces using the Sanger Quality Check module on the Thermo Fisher Cloud (Figure 1). In the example shown, the same plasmid was sequenced in 16 wells and analyzed on the SeqStudio Genetic Analyzer in 4 different injections. Note that the trace score, peak under peak (PUP) values, contiguous read length (CRL), and QV20+ (length with quality values >20) are similar for each sample. Similar results were obtained in traces on the other strand, and in other experiments by using Applied Biosystems BigDye Terminator v1.1 chemistry. These data demonstrate that the SeqStudio platform can generate plasmid sequencing results of very high quality.
Figure 1. Analysis of sequencing quality using the Sanger Quality Check Cloud app. (A) Once a run is completed, the SeqStudio instrument displays the resulting sequence file as well as the quality scores for each base. (B) Sixteen separate pGEM7zf+ sequencing reactions were run on the SeqStudio instrument and the .ab1 files were uploaded to the cloud and analyzed. Note that the sequencing metrics were very similar in the sixteen different reactions. CRL = contiguous read length, QV20+ = number of nucleotides with a quality value >20.
Repeat DNA expansion is the term given to a DNA mutation comprising any number of multi-nucleotide repeats. Often, these regions are challenging to amplify—and thus, characterize—because of high GC content. Short tandem repeats (STRs) are one category of repetitive DNA. They contain bursts of one to six nucleotides repeated over long stretches of genomic DNA and are crucial in understanding more than 30 debilitating genetic diseases.
Fragment sizing with CE is well suited to revealing repeat sections of DNA associated with disease. Because these GC-rich regions are particularly challenging to amplify and analyze, the SeqStudio Genetic Analyzer was evaluated to establish and verify recommended injection and run settings for use with Asuragen AmplideX™ PCR/CE FMR1 and C9orf72 reagents. FMR1 is the gene whose repeat region, when expanded, causes fragile X syndrome. Similarly, the repeat region in C9orf72, when expanded, can cause frontotemporal dementia, amyotrophic lateral sclerosis (ALS), or both.
AmplideX PCR/CE reagents leverage repeat-primed PCR and CE to report both full-length amplicons that encompass the STR region and repeat-primed products that offer confirmatory peak patterns of pathogenic alleles. This approach has been well established on Applied Biosystems 3130, 3500, and 3730 CE platforms and described in over 75 publications since 2010. However, the shorter array and POP-1 gel polymer used by the SeqStudio CE instrument require optimized run settings to match the performance of AmplideX PCR reagents on previous instruments. To this end, a series of experiments were conducted to determine and verify specific run settings for the SeqStudio instrument to align with the performance requirements for the AmplideX PCR/CE reagents. Priority experiments were designed to determine run parameters for long-fragment assays, such as FMR1 and C9orf72 from Asuragen, that require sizing resolution exceeding 800 base pairs. The easy-to-use SeqStudio system, together with the repeat-proficient AmplideX PCR reagents, allows genotyping of the elusive GC- and AT-rich repeat expansions.
The FragAnalysis or LongFragAnalysis modules on the SeqStudio system had a comparable impact on signal intensity. However, while higher signal intensities were obtained using the FragAnalysis module (approximately 50% greater in signal intensity), a split-peak profile was observed for signal-saturated gene-specific peaks. The commonality of these peaks for both FMR1 and C9orf72 warranted use of LongFragAnalysis as the base module, using 2-sec injections, 6 kV injection and run voltages, and a run time of 3,300 sec (55 min). These conditions yielded a profile similar to that of the 3500xL instrument and were used in the subsequent analysis of precision and sensitivity (Figure 1).
Multiplexed qPCR solutions test for small numbers of pathogens and their relatively small capacity can limit throughput when large numbers of targets or pathogens need to be detected. Fragment analysis by capillary electrophoresis (CE) can be used to test multiple pathogens associated with different syndromes in a single sample.
We describe a simple workflow in which multiple amplicons from a pathogenic virus can be analyzed using fragment analysis (Figure 1). To illustrate this approach, we analyze targets derived from SARS-CoV-2 and show how positive-control synthetic DNA sequences can be used to define and resolve the fragment sizes of the target amplicons. We illustrate the sensitivity of the method using known quantities of the SARS-CoV-2 RNA genome.
Application note: Multiplexed target fragment analysis for detection of viral pathogens, including SARS-CoV-2
Protocol: Detection of RNA from SARS-CoV-2 using fragment analysis
Learn more about Sanger sequencing and fragment analysis solutions for SARS-CoV-2 research
Briefly, for this application, after identifying suitable target sequences, amplicons should be designed to be 150–500 nucleotides long. Each amplicon should have a unique length, differing from other amplicons by at least 5 nucleotides. The forward primer should be labeled at the 5´ end with a fluorescent dye, and the reverse primer should be unlabeled. Reverse transcription polymerase chain reactions (RT-PCRs) are then set up using a system that allows for cDNA synthesis followed by endpoint PCR in a single tube (e.g., TaqMan Fast Virus 1-Step Master Mix). The nucleic acid samples are subjected to RT-PCR and the resulting fragments are separated by CE. The presence or absence of pathogen-specific fragments of the expected sizes determines the presence or absence of the pathogen in the sample (Table 1). The height of the peaks can be a semi-quantitative indicator of the abundance of the targets in a sample.
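The amplicon design constraints above (150–500 nucleotides, each length differing from every other by at least 5 nucleotides) lend themselves to a quick sanity check before ordering primers. The sketch below is a minimal example; the function name `check_amplicon_panel`, the target names, and the lengths are hypothetical.

```python
from itertools import combinations

MIN_LEN, MAX_LEN = 150, 500   # allowed amplicon length range, in nucleotides
MIN_GAP = 5                   # minimum length difference between any two amplicons


def check_amplicon_panel(lengths):
    """Return a list of design problems; an empty list means the panel passes.

    `lengths` maps an amplicon name to its length in nucleotides.
    """
    problems = []
    for name, n in lengths.items():
        if not MIN_LEN <= n <= MAX_LEN:
            problems.append(f"{name}: length {n} nt is outside {MIN_LEN}-{MAX_LEN} nt")
    for (a, len_a), (b, len_b) in combinations(lengths.items(), 2):
        gap = abs(len_a - len_b)
        if gap < MIN_GAP:
            problems.append(f"{a}/{b}: lengths differ by only {gap} nt (need >= {MIN_GAP})")
    return problems


# Hypothetical panel: target_B is too close to target_A, target_C is too long.
panel = {"target_A": 180, "target_B": 183, "target_C": 520}
issues = check_amplicon_panel(panel)
for line in issues:
    print(line)
```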
Table 1. Suggestion for interpreting peaks arising from analysis of SARS-CoV-2 sequences by fragment analysis. Similar guidelines have been used to define detection in a qPCR-based SARS-CoV-2 test. Parameters for other detection assays may be defined according to the specific testing needs and characteristics of the pathogens of interest.
|Xeno RNA control peak||SARS-CoV-2 RNA peaks||Interpretation|
|Present||None||Negative: no SARS-CoV-2 RNA detected|
|Present||3 peaks||Positive: SARS-CoV-2 RNA detected|
|Present||2 peaks||Positive: SARS-CoV-2 RNA detected|
|Present||1 peak||Indeterminate: retest the same purified sample|
|No peak||No peak||Invalid: no SARS-CoV-2 RNA detected; retest the same purified sample|
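The interpretation table can be expressed as a simple lookup. The sketch below assumes a panel with three SARS-CoV-2 targets, as implied by the table; the function name `interpret_run` is hypothetical. The table does not say what to do when the control peak is absent but target peaks are present, so the sketch reports that case as not covered rather than guessing.

```python
def interpret_run(control_peak_present: bool, target_peaks: int) -> str:
    """Map one sample's peak calls to the interpretations listed in Table 1."""
    if not 0 <= target_peaks <= 3:
        raise ValueError("expected 0 to 3 SARS-CoV-2 target peaks")
    if not control_peak_present:
        if target_peaks == 0:
            return "Invalid: retest the same purified sample"
        # Table 1 does not list this combination, so no call is made here.
        return "Not covered by Table 1"
    if target_peaks >= 2:
        return "Positive: SARS-CoV-2 RNA detected"
    if target_peaks == 1:
        return "Indeterminate: retest the same purified sample"
    return "Negative: no SARS-CoV-2 RNA detected"


for control, peaks in [(True, 0), (True, 3), (True, 2), (True, 1), (False, 0)]:
    print(control, peaks, "->", interpret_run(control, peaks))
```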
The ability to detect single-nucleotide polymorphisms (SNPs) plays a critical role in understanding how the genome influences biological phenotypes. To analyze SNP variants, the Applied Biosystems SNaPshot Multiplex System was developed. Customizable, color-coded fragments of differing sizes, corresponding to specific alleles, are analyzed by fragment analysis. The SeqStudio system includes new features that facilitate SNaPshot analysis, including built-in reporting of fragment analysis results of size and peak area. Additionally, the ability to mix fragment analysis and sequencing reactions on one plate enables investigators to perform SNP profiling and Sanger sequencing on a single run.
To illustrate the functional utility of the SeqStudio instrument in SNaPshot workflows, genomic DNA from FFPE-preserved tumor slices was collected and analyzed using probes targeting KRAS G12X and G13X alleles using the SNaPshot multiplex reagent kit. The SeqStudio instrument produced results that clearly showed the presence and accurate calls of the different alleles at this position (Figure 1). Note that although the detection of the alleles was accurate on the SeqStudio instrument, the absolute migration of all peaks will differ slightly from that on other platforms because the polymers differ chemically. Therefore, to associate a peak with an allele without ambiguity, a calibration with known alleles should be performed before undertaking a large-scale analysis.
- Introduction to SeqStudio applications
- Extended RAS Research Assay on the SeqStudio (Oncology Research)
- Enabling neurological disease research via DNA fragment analysis on the SeqStudio
- Genome editing workflow facilitated by the Thermo Fisher Scientific portfolio solution
- A Complete Workflow for Human Cell Line Authentication
SeqStudio customer spotlights
- Matching identities of iPSCs and donors using CLA Identifiler STR profiling kits
- SeqStudio Speed and Accuracy for Inherited Disease Research
- SeqStudio for Translational Research at the University Hospital of Basel
- Shedding Light on Missing Heritability: SeqStudio Fragment Analysis for Neurological Disease Research
For Research Use Only. Not for use in diagnostic procedures.