DNA Sequencing Frequently Asked Questions
1. What is Sanger Dideoxy Sequencing and how is it different from Applied Biosystems Fluorescent Sequencing?
With Sanger sequencing, DNA polymerases copy single-stranded DNA templates by adding nucleotides to a growing chain (the extension product). Chain elongation occurs at the 3' end of a primer, an oligonucleotide that anneals to the template. The deoxynucleotide added to the extension product is selected by base-pair matching to the template.
The extension product grows by the formation of a phosphodiester bridge between the 3'-hydroxyl group at the growing end of the primer and the 5'-phosphate group of the incoming deoxynucleotide (Watson et al., 1987). The growth is in the 5' -> 3' direction (see Figure 1).
DNA polymerases can also incorporate analogues of nucleotide bases. The dideoxy method of DNA sequencing developed by Sanger et al. (1977) takes advantage of this ability by using 2',3'-dideoxynucleotides as substrates. When a dideoxynucleotide is incorporated at the 3' end of the growing chain, chain elongation is terminated selectively at A, C, G, or T because the chain lacks a 3'-hydroxyl group (see Figure 2).
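The chain-termination idea can be illustrated with a toy model. The sketch below is not a simulation of real chemistry: it assumes exactly one terminated fragment per position, and the template sequence and function names are hypothetical. It shows why sorting the terminated fragments by length and reading the terminating base of each one recovers the sequence of the extension product.

```python
# Toy model of dideoxy chain termination (illustrative only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def extension_product(template_3_to_5):
    """Full-length product, written 5'->3', for a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def terminated_fragments(product):
    """Map fragment length -> terminating base, one fragment per position."""
    return {length: product[length - 1] for length in range(1, len(product) + 1)}

def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the end bases."""
    return "".join(fragments[length] for length in sorted(fragments))

template = "TACGGTCA"  # hypothetical template, written 3'->5'
product = extension_product(template)
fragments = terminated_fragments(product)
print(product)                   # ATGCCAGT
print(read_sequence(fragments))  # ATGCCAGT
```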
2. What is Cycle Sequencing?
Cycle sequencing is a simple method in which successive rounds of denaturation, annealing, and extension in a thermal cycler result in linear amplification of extension products (see Figure 3). The products are then loaded onto a gel or injected into a capillary. All current BigDye sequencing kits use cycle sequencing protocols.
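As a rough illustration of what "linear amplification" means, the sketch below compares the idealized yield of a single-primer reaction with that of a two-primer PCR. It assumes perfect efficiency in every cycle, which real reactions do not reach, and the function names and numbers are hypothetical.

```python
def linear_yield(template_copies, cycles):
    """One primer: each cycle adds at most one extension product per template."""
    return template_copies * cycles

def exponential_yield(template_copies, cycles):
    """Two primers (PCR), ideal case: the number of copies doubles every cycle."""
    return template_copies * 2 ** cycles

print(linear_yield(1000, 25))       # 25000
print(exponential_yield(1000, 25))  # 33554432000
```

With one primer, the extension products are not themselves templates for further rounds, so the product amount grows in proportion to the number of cycles.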
3. What are the advantages of cycle sequencing?
The advantages are as follows:
- Protocols are robust and easy to perform
- Cycle sequencing requires much less template DNA than single-temperature extension methods
- Cycle sequencing is more convenient than traditional single-temperature labeling methods that require a chemical denaturation step for double-stranded templates
- High temperatures reduce secondary structure, allowing for more complete extension
- High temperatures reduce secondary primer-to-template annealing
- The same protocol is used for double- and single-stranded DNA
- The protocols work well for direct sequencing of PCR products
- Difficult templates, such as bacterial artificial chromosomes (BACs), can be sequenced
4. What is Dye Terminator Cycle Sequencing?
With dye terminator labeling, each of the four dideoxy terminators (ddNTPs) is tagged with a different fluorescent dye. The growing chain is simultaneously terminated and labeled with the dye that corresponds to that base (see Figure 4). An unlabeled primer can be used. Dye terminator reactions are performed in a single tube. They require fewer pipetting steps than dye primer reactions. Four-color dye labeled reactions are loaded in a single gel lane or capillary injection. False stops, i.e., fragments that are not terminated by a dideoxynucleotide, go undetected because no dye is attached.
5. What is Dye Primer Cycle Sequencing?
With dye primer labeling, primers are tagged with four different fluorescent dyes. Labeled products are generated in four separate base-specific reactions. The products from these four reactions are then combined and loaded into a single gel lane or capillary injection (see Figure 5). Dye primer chemistries generally produce more even signal intensities than dye terminator chemistries.
Labeled primers are available for common priming sites, and custom primers can also be labeled.
6. What are Matrix Standards?
The precise spectral overlap between the four dyes is measured by running DNA fragments labeled with each of the dyes in a special calibration run on the Applied Biosystems Genetic Analyzer. These dye-labeled DNA fragments are called matrix standards.
The Data Utility software then analyzes the data from the matrix standard samples and creates a matrix file. The values in the matrix file are normalized fluorescence intensities; together they form a mathematical description of the spectral overlap observed between the dyes.
The matrix files in an instrument file are used for specific types of chemistry, and provide information to the Sequencing Analysis software to allow it to correct for spectral overlap.
7. What is De Novo Sequencing?
The initial generation of the primary genetic sequence of a particular organism is called de novo sequencing. A detailed genetic analysis of any organism is possible only after de novo sequencing has been performed. De novo sequencing is typically accomplished by assembling individual sequence reads into longer contiguous sequences (contigs) or correctly ordered contigs (scaffolds) in the absence of a reference sequence.
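To make the idea of assembling reads into a contig concrete, here is a minimal greedy overlap-merge sketch. It is a simplification, not how production assemblers work: it assumes error-free reads from the same strand, and the reads, function names, and minimum overlap of 3 bases are hypothetical.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best = (0, -1, -1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]

reads = ["ATGCGTAC", "GTACCTGA", "CTGATTGC"]  # hypothetical overlapping reads
print(greedy_assemble(reads))  # ['ATGCGTACCTGATTGC']
```

Reads that share no sufficient overlap are returned as separate contigs.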
8. What is Resequencing?
Resequencing specific genomic regions is commonly performed to identify mutations and other changes in genes. Resequencing techniques can be focused on known mutations (genotyping) or used to search for any mutation in the target DNA region (variant analysis).
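Variant analysis amounts to comparing a resequenced region against a reference. The sketch below reports base substitutions only and assumes the two sequences are already aligned and of equal length; the sequences and the function name are hypothetical.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this sketch assumes pre-aligned, equal-length sequences")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]

reference = "ATGCCAGTTGAC"  # hypothetical reference sequence
sample = "ATGCTAGTTGGC"     # hypothetical resequenced sample
print(find_substitutions(reference, sample))  # [(5, 'C', 'T'), (11, 'A', 'G')]
```

Genotyping, by contrast, would inspect only a list of positions known in advance.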
9. What is SNP analysis?
A single nucleotide polymorphism (SNP) is the substitution of one base for another. SNPs are common DNA variants present across the human genome and have been shown to be responsible for differences in genetic traits, susceptibility to disease, and response to drug therapies. Genotyping of SNPs has become extremely important to researchers working to understand and treat disease. SNPs occur approximately once every 100 to 300 bases and can be detected by techniques such as sequencing and SNaPshot.
10. What is Heterozygote Detection?
A diploid organism is heterozygous at a gene locus when its cells contain two different alleles of a gene. Heterozygotes are detected mainly by sequencing (for SNPs and small deletions or insertions) or by gene copy number analysis (for large deletions or insertions).
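In sequence data, a heterozygous SNP shows up as a position where two bases are present at once. One common way to record such positions is with IUPAC ambiguity codes (for example, Y for C or T). The sketch below is an illustration under assumptions: it takes two hypothetical, already aligned allele sequences rather than real trace data, and the function names are invented for this example.

```python
# IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def diploid_call(allele_1, allele_2):
    """Combine two aligned alleles into one sequence with ambiguity codes."""
    if len(allele_1) != len(allele_2):
        raise ValueError("this sketch assumes aligned, equal-length alleles")
    return "".join(IUPAC[frozenset((a, b))] for a, b in zip(allele_1, allele_2))

def heterozygous_positions(call):
    """1-based positions whose code is not a plain A, C, G, or T."""
    return [pos for pos, code in enumerate(call, start=1) if code not in "ACGT"]

call = diploid_call("ATGCCAGT", "ATGCTAGT")  # hypothetical alleles
print(call)                          # ATGCYAGT
print(heterozygous_positions(call))  # [5]
```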
11. What is BAC End-Sequencing?
Bacterial artificial chromosomes (BACs) are large segments (100-200 kb) of DNA cloned into bacteria from another species. Multiple copies can be made after cloning. Sequences from the BAC ends provide highly specific markers. These sequences can then be queried against BAC libraries for confirmation.
12. What does "Checking Clone Constructs" mean?
This refers to verifying that the DNA of interest has been properly cloned into the vector by sequencing.
13. What is Multicomponent Analysis?
Multicomponent analysis is the process that separates the four different fluorescent dye colors into distinct spectral components. Although each of these dyes emits its maximum fluorescence at a different wavelength, there is some overlap in the emission spectra between the four dyes (see Figure 6). The goal of multicomponent analysis is to isolate the signal from each dye so that there is as little noise in the data as possible.
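Mathematically, separating overlapping dye signals can be treated as solving a small system of linear equations: the signal seen in each detection channel is a weighted sum of the four dye signals. The sketch below assumes this linear model and uses a made-up 4x4 overlap matrix; it is not the algorithm or the matrix values used by any particular instrument software.

```python
def solve_linear(matrix, observed):
    """Solve matrix * x = observed by Gauss-Jordan elimination with pivoting."""
    n = len(observed)
    rows = [list(map(float, matrix[i])) + [float(observed[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("overlap matrix is singular")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def mix(matrix, dye_signal):
    """Forward model: channel readings produced by the given dye signals."""
    return [sum(m * d for m, d in zip(row, dye_signal)) for row in matrix]

# Hypothetical overlap matrix: row = detection channel, column = dye.
MATRIX = [
    [1.00, 0.20, 0.02, 0.00],
    [0.15, 1.00, 0.25, 0.03],
    [0.01, 0.18, 1.00, 0.30],
    [0.00, 0.02, 0.22, 1.00],
]

true_dyes = [100.0, 0.0, 50.0, 0.0]      # only dyes 1 and 3 are present
observed = mix(MATRIX, true_dyes)        # every channel shows some signal
recovered = solve_linear(MATRIX, observed)
print([round(v, 3) for v in observed])
print([round(v, 3) for v in recovered])
```

In this example the raw channel readings show signal in all four channels, while the recovered values attribute it to the two dyes that are actually present.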
14. What is Multi-Locus Sequence Typing (MLST)?
MLST is an unambiguous procedure for characterizing bacterial isolates using the sequences of internal fragments from seven housekeeping genes. The procedure uses internal fragments (~450-500 bp) of each gene because they can be accurately sequenced on both strands with an automated Genetic Analyzer. For each housekeeping gene, the different sequences present within a bacterial species are assigned as distinct alleles, and for each isolate the alleles at each of the seven loci define the allelic profile, or sequence type (ST).
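The mapping from sequences to alleles to a sequence type can be sketched as two lookup tables. This is an illustration only: the locus names and fragment sequences are hypothetical, and numbering new alleles and types locally in order of first appearance is an assumption made for the example, whereas real schemes rely on curated databases.

```python
class MlstScheme:
    """Toy MLST bookkeeping: sequence -> allele number, profile -> sequence type."""

    def __init__(self, loci):
        self.loci = list(loci)
        self.alleles = {locus: {} for locus in self.loci}  # sequence -> allele number
        self.types = {}                                    # profile -> sequence type

    def allelic_profile(self, fragments):
        """Return the tuple of allele numbers for one isolate."""
        profile = []
        for locus in self.loci:
            table = self.alleles[locus]
            # Assumption: an unseen sequence gets the next free allele number.
            profile.append(table.setdefault(fragments[locus], len(table) + 1))
        return tuple(profile)

    def sequence_type(self, fragments):
        """Return the sequence type (ST) for one isolate."""
        profile = self.allelic_profile(fragments)
        return self.types.setdefault(profile, len(self.types) + 1)

LOCI = ["gene1", "gene2", "gene3", "gene4", "gene5", "gene6", "gene7"]  # hypothetical
scheme = MlstScheme(LOCI)

isolate_a = {locus: "ACGT" for locus in LOCI}  # placeholder fragments
isolate_b = dict(isolate_a)                    # identical at all seven loci
isolate_c = dict(isolate_a, gene3="ACGA")      # differs at one locus

print(scheme.sequence_type(isolate_a))    # 1
print(scheme.sequence_type(isolate_b))    # 1
print(scheme.sequence_type(isolate_c))    # 2
print(scheme.allelic_profile(isolate_c))  # (1, 1, 2, 1, 1, 1, 1)
```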
15. What is HLA Typing?
The human leukocyte antigen (HLA) test detects antigens (genetic markers) in white blood cells. The four types of human leukocyte antigens are HLA-A, HLA-B, HLA-C, and HLA-D. The HLA test is used to check tissue compatibility and to type recipient and donor tissue. It is also used in genetic counseling and paternity testing. (Research only.)
16. What is Methylation Detection?
DNA methylation occurs at CpG sites, which are DNA sequences in which cytosine lies next to guanine. Methylation is mediated by an enzyme (DNA methyltransferase). CpG sites are rare in a eukaryotic genome, except in regions near the promoter of a gene. These regions are known as CpG islands, and the state of methylation at these CpG sites is critical for gene activity and expression.
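Locating CpG sites in a sequence is a simple string search. The sketch below lists CpG positions and also computes an observed/expected CpG ratio, a widely used heuristic for judging whether a region is CpG-rich. The ratio is not part of this FAQ and is included as an assumption; the sequence and function names are hypothetical.

```python
def cpg_positions(sequence):
    """1-based positions of the C in every CG dinucleotide."""
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count)."""
    seq = sequence.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return len(cpg_positions(seq)) * len(seq) / (c_count * g_count)

region = "CGCGTACGATCGCG"  # hypothetical CpG-rich stretch
print(cpg_positions(region))          # [1, 3, 7, 11, 13]
print(cpg_observed_expected(region))
```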
17. What is mtDNA Sequencing?
The common abbreviation for mitochondrial DNA is mtDNA. Mitochondrial DNA molecules are present in hundreds to thousands of copies per cell, as opposed to the nuclear DNA, which is present in just two copies per cell. The abundance of mtDNA allows discrimination among individuals or biological samples, particularly if nuclear DNA is degraded or unavailable.
18. What is Comparative Genomic Resequencing?
Comparative Genomic Resequencing is the comparison of genomes and individuals within a genome. Comparative genomics makes possible the application of information gained from a sample genome to a more complex genome. It is the basis for the understanding of genetic variation in a population.
19. What is the SAGE™ Method?
SAGE™ is a method for quantitative, genome-wide gene expression pattern analysis. A short sequence tag (10-25 bp) contains sufficient information to identify a transcript.