To investigate its functions, RNA is routinely converted to the more stable complementary DNA (cDNA) by reverse transcription (RT). cDNA can then be studied with DNA-based techniques such as cloning, PCR, and sequencing, so reverse transcription is a crucial step in many RNA-based experimental workflows.

Reverse transcription polymerase chain reaction (RT-PCR)

In RT-PCR, an RNA population is converted to cDNA by reverse transcription (RT), and then the cDNA is amplified by the polymerase chain reaction (PCR) (Figure 1). The cDNA amplification step provides opportunities to further study the original RNA species, even when they are limited in amount or expressed in low abundance. Common applications of RT-PCR include detection of expressed genes, examination of transcript variants, and generation of cDNA templates for cloning and sequencing.

Figure 1. Reverse transcription polymerase chain reaction (RT-PCR). RT = reverse transcription, RTase = reverse transcriptase.

Since reverse transcription provides cDNA templates for PCR amplification and downstream experiments, it is one of the most critical steps for experimental success. The reverse transcriptase selected should offer the highest efficiency even with challenging RNA samples, such as those that are degraded, have carryover inhibitors, or possess a high degree of secondary structure.

In performing RT-PCR, one-step and two-step methods are the two common approaches, each with its own advantages and disadvantages (Figure 2). As the name implies, one-step RT-PCR combines first-strand cDNA synthesis (RT) and subsequent PCR in a single reaction tube. This reaction setup simplifies workflow, reduces variation, and minimizes possible contamination. One-step RT-PCR allows easier processing of large numbers of samples, making it amenable to high-throughput applications. However, one-step RT-PCR uses gene-specific primers for amplification, limiting the analysis to a few genes per RNA sample. Since the reaction is a compromise between reverse transcription and amplification conditions, one-step RT-PCR could be less sensitive and less efficient in some scenarios. Nevertheless, use of a gene-specific primer in RT-PCR can help maximize the yield of the target cDNA and minimize background amplification.

Two-step RT-PCR entails two separate reactions, beginning with first-strand cDNA synthesis (RT), followed by amplification of a portion of the resulting cDNA by PCR in a separate tube. Because only a portion of the cDNA is consumed in each PCR, two-step RT-PCR is useful for detecting multiple genes in a single RNA sample. The separation of RT and PCR reactions allows for optimization of reaction conditions for each step, as well as flexibility with reverse transcription priming (oligo(dT) primers, random hexamers, or gene-specific primers) and PCR setup (e.g., DNA polymerase choice and PCR components). Compared to one-step RT-PCR, the disadvantages of two-step RT-PCR include a longer, multistep workflow, additional sample handling and processing, and an increased chance of contamination and variation in results.

Table 1. Comparison of one-step and two-step RT-PCR

| | One-step RT-PCR | Two-step RT-PCR |
| --- | --- | --- |
| Setup | Combined reaction under conditions that support both reverse transcription and PCR | Separate optimized reactions for reverse transcription and PCR |
| Primers | Gene-specific primers | Choice of oligo(dT), random hexamers, or gene-specific primers |
| Ideal use | Analysis of one or two genes; high-throughput platforms | Analysis of multiple genes |
| Advantage | Convenient, high-throughput | Flexible |

Quantitative RT-PCR (RT-qPCR)

One of the most common applications of quantitative RT-PCR (RT-qPCR) is quantitative analysis of mRNA levels over time, across cells and tissues, or after an event (e.g., drug treatment). Because it is more sensitive than RT-PCR, RT-qPCR is also widely used to detect RNA viruses, such as retroviruses, in research samples. Similar to the RT-PCR workflow, RNA is first converted to cDNA, which is then amplified by PCR. The main difference, however, is that levels of amplified cDNA are measured by fluorescence in real time during the exponential phase of amplification. The amplification level is used as a basis to quantitate the original targets within the RNA population. (Learn more about quantitative PCR)

The accuracy of the quantitation of gene expression by RT-qPCR depends heavily upon the quality and quantity of cDNA templates. Thus, the reverse transcription step is critical for success in RT-qPCR. The reverse transcription step should generate cDNA products that are representative of the original RNA population. The reverse transcriptase selected should therefore be able to synthesize cDNA efficiently, even with low-abundance genes and suboptimal and/or challenging RNA samples (e.g., high GC%, inhibitor presence, degradation). (Learn more about reverse transcriptase attributes)

In addition to a highly efficient reverse transcriptase, there are a number of considerations in choosing reagents for the RT reaction. First, the dynamic range or linear amplification of cDNA over a broad range of input RNA is critical. The ability to obtain cDNA yields proportional to the amounts of input RNA ensures accurate quantitation of gene expression (Figure 3).

Figure 3. Linearity of qPCR results subsequent to using RT master mixes across a range of total RNA input, for detection of (A) high-abundance and (B) low-abundance RNA targets. RNA input, ranging from 10 pg to 1 μg, was reverse-transcribed and subsequently amplified by PCR. Both master mixes generated cDNA proportional to the input RNA, but a higher yield was obtained from Master Mix 1 as indicated by lower (i.e., earlier) Ct values, especially with the low-abundance gene target.
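The linearity described above is commonly checked by fitting Ct against the logarithm of RNA input. The following is a minimal sketch under stated assumptions: the Ct values are invented for illustration (they are not the data behind Figure 3), the function name `fit_standard_curve` is hypothetical, and the efficiency formula E = 10^(-1/slope) - 1 is the widely used standard-curve convention rather than something stated on this page.

```python
import math

def fit_standard_curve(inputs_ng, ct_values):
    """Least-squares fit of Ct against log10(RNA input in ng).

    Returns (slope, intercept, r_squared, efficiency).
    """
    xs = [math.log10(v) for v in inputs_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    # Conventional amplification efficiency derived from the slope
    # (a slope near -3.32 corresponds to roughly 100% efficiency).
    efficiency = 10 ** (-1 / slope) - 1
    return slope, intercept, r_squared, efficiency

# Hypothetical 10-fold dilution series from 1 ug (1000 ng) down to 10 pg (0.01 ng).
inputs_ng = [1000, 100, 10, 1, 0.1, 0.01]
ct_values = [15.1, 18.5, 21.8, 25.2, 28.6, 31.9]

slope, intercept, r_squared, efficiency = fit_standard_curve(inputs_ng, ct_values)
print(f"slope={slope:.2f}, R^2={r_squared:.4f}, efficiency={efficiency:.1%}")
```

An R² close to 1 across the whole input range indicates that cDNA yield stayed proportional to RNA input. At the same input, an earlier Ct (a lower intercept) indicates higher cDNA yield.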

Furthermore, the reagents selected should produce abundant and consistent cDNA yields among replicates in order to obtain gene expression results with high sensitivity and little variability (Figure 4). A single-tube master mix containing all necessary components for reverse transcription helps minimize experimental variation, cross-contamination, and pipetting errors. (Learn more about reverse transcriptase for optimal RT-qPCR)

Figure 4. Sensitivity and variability of qPCR results subsequent to using different RT master mixes, to detect (A) high-abundance and (B) low-abundance RNA targets. Among the reagents, Master Mix 1 produces qPCR results with the lowest average Ct and standard deviation from 30 experimental replicates, demonstrating the importance of reverse transcription reagent choice for reliable gene expression analysis.
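Sensitivity and variability of this kind can be summarized per reagent as the mean Ct and its standard deviation across replicates. The sketch below is illustrative only: the reagent labels `mix_a` and `mix_b` and all Ct values are hypothetical, and only five replicates are shown for brevity rather than the 30 used for Figure 4.

```python
import statistics

def summarize_replicates(ct_values):
    """Return (mean Ct, sample standard deviation, coefficient of variation in %)."""
    mean_ct = statistics.mean(ct_values)
    sd_ct = statistics.stdev(ct_values)
    cv_percent = 100 * sd_ct / mean_ct
    return mean_ct, sd_ct, cv_percent

# Hypothetical replicate Ct values for two RT reagents on the same RNA target.
replicate_cts = {
    "mix_a": [24.1, 24.0, 24.2, 24.1, 23.9],
    "mix_b": [26.0, 26.8, 25.4, 27.1, 26.2],
}

summary = {name: summarize_replicates(cts) for name, cts in replicate_cts.items()}
for name, (mean_ct, sd_ct, cv_percent) in summary.items():
    print(f"{name}: mean Ct={mean_ct:.2f}, SD={sd_ct:.2f}, CV={cv_percent:.1f}%")
```

In this made-up data set, `mix_a` has both the lower mean Ct (higher apparent cDNA yield) and the smaller spread across replicates.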

One special procedure of RT-qPCR is direct reverse transcription from crude cell lysates without RNA isolation [1]. In experiments focusing on rare cells or events, using scarce samples, or selecting specific cells within populations, direct RT-qPCR may be considered to prevent potential sample loss and low RNA recovery. In the direct procedure, it is critical to inhibit endogenous RNases that would degrade RNA and to remove cellular genomic DNA during cell lysis. With optimized kits, sample preparation can be completed in as little as 7 minutes while still yielding detectable signal from as little as a single cell. Highly processive reverse transcriptases are especially suited for reverse transcription of unpurified RNA extracts, because of their resistance to inhibitors and high sensitivity.

cDNA cloning and library construction

One of the first applications of reverse transcriptase in molecular biology was the construction of cDNA libraries [2-4]. A cDNA library consists of cDNA clones that represent the transcribed sequences within a specific sample. Therefore, a library provides information about the temporal and spatial expression of genes for a given cell type, organ, or developmental stage, for example. The cDNA library clones are used in the characterization of novel RNA transcripts, determination of gene sequences, and expression of recombinant proteins.

Essential in constructing cDNA libraries is the proper representation of RNAs in their full length and/or their relative abundance, making the selection of a reverse transcriptase extremely important. Highly processive reverse transcriptases are capable of synthesizing long cDNAs as well as capturing low-abundance RNAs. Similarly, reverse transcriptases with increased thermostability are recommended for reverse-transcribing RNA with a high degree of secondary structure. (Learn more about reverse transcriptase attributes)

After reverse transcription, a number of approaches may be used to insert cDNA into a vector for cloning. The double-stranded cDNAs after second-strand synthesis often have blunt ends and can be cloned into blunt-ended vectors (Figure 5A). Although this approach involves fewer steps, blunt-end cloning may result in less efficient ligation and loss of directionality after insertion. (Learn more about cloning workflow)

Alternatively, cDNA ends may be modified to include additional nucleotides of known sequences. For example, to modify the 5′ end of cDNA, oligo(dT) primers with additional 5′ nucleotides can be used to initiate reverse transcription; to modify the 3′ end, short DNA oligos called linkers or adapters with desired sequences may be ligated (Figure 5B). In this manner, sites for directional insertion (e.g., restriction and homologous recombination), promoter binding (e.g., T3 and T7 sequences), and affinity purification (e.g., biotin and His tags) can be readily incorporated into the cDNA sequence. (Learn more about DNA library construction)

Figure 5. Common methods to clone cDNA. (A) Double-stranded cDNA with blunt ends may be cloned directly into a blunt-end cloning vector. (B) For directional cloning, cDNA ends can be modified with unique sequences compatible with the vector. (C) Ligation-independent cloning may be performed with complementary terminal sequences to enhance the efficiency of insertion. (D) Gene-specific cloning via PCR may be considered when the insert’s sequence is known.

In another popular strategy, the 3′ ends of cDNA inserts and vectors are enzymatically extended with complementary homopolymeric tails. Using terminal deoxynucleotidyl transferase (TdT) and a single dNTP, a string of 20–30 nucleotides can be added to an insert, and a similar string of complementary nucleotides added to a vector (e.g., Cs on the insert and Gs on the vector), enabling the vector and insert tails to anneal to each other (Figure 5C). Ligation is not required because the gaps are repaired inside the bacteria after transformation.

When the target sequence is known, the insert may be generated by RT-PCR for cloning of a specific region of a cDNA (Figure 5D). (Learn more about PCR cloning)

Rapid amplification of cDNA ends (RACE)

Rapid amplification of cDNA ends (RACE) is a PCR-based method for determining unknown sequences at the 5′ and 3′ ends of cDNA [5]. These methods are commonly known as 5′ RACE and 3′ RACE, respectively. The experimental goals of RACE include identification of 5′ and 3′ untranslated regions, investigation of heterogeneous transcriptional start sites, characterization of promoter regions, determination of complete cDNA sequences, and sequencing of complete open reading frames (ORFs) for protein expression.

PCR with single-sided specificity (also known as one-sided or anchored PCR [6,7]) is employed to amplify the unknown regions of cDNA as RACE products. 5′ RACE relies on extension of the 5′ end with an oligonucleotide for PCR primer binding, while 3′ RACE takes advantage of the poly(A) tail of mRNA as a generic priming site for PCR (Figure 6).

In 5′ RACE (Figure 6A), mRNA of a specific sequence or related family is reverse-transcribed into the first-strand cDNA using a gene-specific primer. The 3′ end of the cDNA is then extended with a homopolymeric tail (usually a string of Cs) by terminal deoxynucleotidyl transferase (TdT), or is ligated to an oligonucleotide adapter. Thereafter, two rounds of semi-nested PCR are performed to amplify the region with the 5′ unknown sequence. PCR also allows end-extension of amplicons via primers for downstream applications, such as restriction site introduction for directional cloning and universal sequencing primer binding sites for sequencing.
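The sequence logic of the first 5′ RACE steps can be followed in silico. This is a toy sketch, not a primer-design tool: the mRNA, the gene-specific primer, the 15-nucleotide tail length, and all function names are hypothetical, and annealing is modeled as an exact string match.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

def reverse_complement_to_dna(seq):
    """Return the DNA strand complementary to an RNA or DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def first_strand_cdna(mrna, gene_specific_primer):
    """Model primer extension from a gene-specific primer to the mRNA 5' end.

    The primer (DNA, 5'->3') anneals to the mRNA site that is its reverse
    complement. The cDNA is the reverse complement of the mRNA from its
    5' end through that site, so the cDNA begins with the primer sequence.
    """
    site = reverse_complement_to_dna(gene_specific_primer).replace("T", "U")
    start = mrna.find(site)
    if start == -1:
        raise ValueError("primer does not anneal to this mRNA")
    template = mrna[: start + len(site)]
    return reverse_complement_to_dna(template)

def add_homopolymer_tail(cdna, base="C", length=15):
    """Model TdT tailing: append a homopolymer to the 3' end of the cDNA."""
    return cdna + base * length

# Hypothetical mRNA (5'->3') whose 5' end is treated as unknown.
mrna = "GGAUCCAUGGCUAAGCUUGACCGUAGCUAGGCAUCG"
gene_specific_primer = "GCTACGGTC"  # anneals to GACCGUAGC in the mRNA

cdna = first_strand_cdna(mrna, gene_specific_primer)
tailed = add_homopolymer_tail(cdna)

# A poly(G) anchor primer can anneal to the C tail; the strand it primes
# reads the G run followed by the 5' end of the original transcript as DNA.
second_strand = reverse_complement_to_dna(tailed)
print(cdna)
print(tailed)
print(second_strand)
```

The printed second strand begins with the anchor sequence and continues into the transcript's 5′ end, which is the region recovered by the subsequent semi-nested PCR.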

In 3′ RACE (Figure 6B), mRNA is reverse-transcribed to cDNA using an oligo(dT) primer with an adapter sequence. Two rounds of semi-nested PCR are then performed using primers specific to known upstream exon sequences and the adapter sequences introduced through the oligo(dT) primer. In this manner, unknown 3′ mRNA sequences between the exons and the poly(A) tail are amplified for further analysis.

The quality of input RNA and setup of the reverse transcription reaction are critical for successful RACE experiments. In 5′ RACE, first-strand cDNAs of any length (i.e., even those not reaching the 5′ end of the mRNA) will possess the added sequence (i.e., homopolymeric tail or adapter) and subsequently be amplified in PCR. To maximize full-length cDNA synthesis, reverse transcriptases with minimal RNase H activity, high processivity, and high thermostability should be selected. (Learn more about reverse transcriptase attributes)

Alternatively, designing the gene-specific primers to bind close to the 5′ end of the mRNA can facilitate the capture of the unknown 5′ end sequence, due to the shorter distance to reverse-transcribe during the cDNA synthesis step. Similarly, a procedural modification can be considered to select RNA with a 5′ 7-methylguanosine (7mG) cap that represents mature, full-length eukaryotic mRNA for reverse transcription (Figure 7) [8,9].

Figure 7. Modified 5′ RACE to help capture sequences from 7mG-capped mRNA. First, uncapped RNA is eliminated from the RACE pool by using alkaline phosphatase (AP), to remove the free 5′ phosphate group and prevent ligation. Then, the remaining capped RNA is treated with tobacco acid pyrophosphatase (TAP) to remove the cap and expose the 5′ monophosphate. An RNA adapter is then ligated to the exposed 5′ phosphate group and serves as a binding site for the forward PCR primers of 5′ RACE.

With 3′ RACE, full-length cDNA is not critical because sequences that are upstream of the PCR priming site are not amplified. Nevertheless, a reverse transcriptase that can generate long cDNA is preferred, since cDNA falling short of reaching the PCR primer’s binding site will not be represented in RACE analyses.

Gene expression microarrays

Development of DNA microarrays during the 1990s opened up large-scale profiling of gene expression without bias or prior hypothesis. Microarrays consist of thousands of chambers, known as “features” or “spots”, on glass or silicon wafers. Each feature contains, immobilized on its surface, identical copies of a single-stranded DNA sequence called a “probe”, which represents one gene. The probes hybridize to fluorescently labeled cDNA targets that are applied to the microarray, allowing simultaneous comparison of gene expression between two samples (Figures 8 and 9) [10-12].

Figure 8. Gene expression microarray chip.

Microarray probes are generated from known sequences of the genome or cDNA of an organism. For example, PCR can be used to make copies of every known gene, products of which are then denatured to single-stranded DNA and spotted onto a chip as immobilized probes. Alternatively, oligonucleotides of 20–60 nt can be synthesized directly on a chip as microarray probes [13].

Figure 9 gives an overview of how microarrays are used for differential gene expression analysis. First, total RNA or mRNA is isolated from two samples—experimental (also called “test” or “treated”) and control (also called “reference” or “normal”). The purified RNA samples are then converted to cDNA and labeled with different fluorescent dyes. Next, the labeled cDNA targets of both samples are mixed and allowed to hybridize to the probes on one microarray chip. After unbound targets are washed away, the microarray is scanned to detect the labeled fluorophores. The ratios of the two fluorescent signals are then analyzed to quantify expression of genes affected by the experimental conditions.
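The ratio analysis in the last step is often expressed as a log2 ratio per feature, so that equal fold increases and decreases are symmetric around zero. The sketch below assumes background-corrected intensities are already available; the gene names, intensity values, function name, and pseudocount are hypothetical, and real analyses also normalize between dye channels, which is omitted here.

```python
import math

def log2_ratios(test_signal, control_signal, pseudocount=1.0):
    """Return log2(test / control) for each gene.

    A small pseudocount is added to both channels to avoid division by zero.
    """
    ratios = {}
    for gene, test_value in test_signal.items():
        control_value = control_signal[gene]
        ratios[gene] = math.log2(
            (test_value + pseudocount) / (control_value + pseudocount)
        )
    return ratios

# Hypothetical fluorescence intensities for the two labeled samples.
test_signal = {"geneA": 7999.0, "geneB": 999.0, "geneC": 499.0}
control_signal = {"geneA": 1999.0, "geneB": 999.0, "geneC": 1999.0}

ratios = log2_ratios(test_signal, control_signal)
for gene, value in ratios.items():
    print(f"{gene}: log2 ratio = {value:+.2f}")
```

A positive value indicates higher signal in the experimental sample, a negative value indicates higher signal in the control, and a value near zero indicates no change.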

cDNA targets can be labeled either during or after reverse transcription (Figure 10). In direct labeling, fluorescently labeled nucleotides are incorporated during cDNA synthesis. Alternatively, in indirect labeling, nucleotides modified to enable conjugation may be used in reverse transcription, and then the cDNAs are subsequently labeled with fluorophores. Although the indirect method involves a longer workflow, the fluorescence labeling tends to be more efficient [14].

When lower amounts of input RNA are available (e.g., 10–100 ng), RNA may be reverse-transcribed to double-stranded cDNA using oligo(dT) primers that carry a T7 promoter sequence. The resulting cDNAs then serve as templates for amplification by in vitro transcription (Figure 11). During in vitro transcription, RNA may be labeled directly or indirectly using modified ribonucleotides. Alternatively, amplified RNA may be reverse-transcribed and labeled to generate the cDNA targets [15].

Figure 11. Amplification of RNA by conversion to cDNA followed by in vitro transcription from an added promoter sequence.

In selecting a reverse transcriptase for preparation of cDNA targets for microarray experiments, the ability to obtain full-length cDNAs in high yields, even when RNA sequences have high GC content or secondary structure, is critical for good coverage of the RNA populations. Equally important, the reverse transcriptase must be able to incorporate modified nucleotides efficiently in order to ensure high signal-to-background ratios that enable accurate and unbiased detection of the input RNA populations. (Learn more about reverse transcriptase attributes)

RNA sequencing (RNA-Seq)

RNA sequencing, or RNA-Seq, is commonly performed to gain insight into RNAs transcribed from the genome and their regulation. With the advent of next-generation sequencing (NGS), RNA-Seq has become a high-throughput approach for analysis of the whole transcriptome (i.e., coding and long noncoding RNA species that have been transcribed), determination of gene expression, discovery of splice variants and fusion transcripts, and detection of low-abundance genes [16,17]. Advantages of RNA-Seq over microarrays include greater dynamic range, higher sensitivity, and the ability to characterize RNA sequences without prior genomic information.

Reverse transcription is involved in the preparation of templates for RNA-Seq, since most sequencing platforms are designed for DNA. It is desirable that the resulting cDNA population represent the original RNA population, including the low-abundance transcripts, with minimum bias. Full-length cDNA synthesis is also important, to capture all RNA sequences in the sample. The error rate of reverse transcription may be critical, depending on the sequencing library size and data quality. Therefore, the reverse transcriptase should be selected with careful consideration. (Learn more about reverse transcriptase attributes)

Research goals and sequencing technologies will dictate the order and method of RNA-Seq template preparation [18,19]. Nevertheless, a typical workflow for generating a library for sequencing includes enrichment of the RNA of interest, fragmentation of RNA or cDNA, reverse transcription, addition of sequencing adapters (and indices or barcodes, if multiplexing), and optional PCR amplification of the library (Figure 12).

Figure 12. Traditional workflow of RNA sequencing.

To enrich a sample for mRNA, ribosomal RNA (rRNA), which makes up about 80% of total RNA, is routinely depleted from the sample to improve the sequencing data for the transcriptome. Poly(A) tails are often present in eukaryotic mRNA and long noncoding RNA, so magnetic beads with covalently bound oligo(dT) offer an alternative strategy for effective enrichment of these RNAs. In contrast, rRNA depletion is a preferred method to enrich prokaryotic mRNAs, since they do not have poly(A) tails that can be exploited to isolate them. For small RNAs (<200 nt), a size selection or specialized isolation method may be performed on the sample instead.

Fragmentation is performed before or after reverse transcription (i.e., on RNA or double-stranded cDNA), depending on the experimental goals and sequencing platforms (Figures 12, 13). Fragments of 200–500 nt are prepared for compatibility with NGS technologies to ensure high-quality reads. Fragmentation methods may be mechanical (e.g., sonication, nebulization), chemical (e.g., hydrolysis), or enzymatic (e.g., RNase III, DNase I).

To examine the orientation or the sense/antisense property of the transcripts (called “strandedness”), RNA fragments may be manipulated prior to reverse transcription, such as by end-tagging with differentiating adapters. Alternatively, dUTP may be used in second-strand cDNA synthesis to specifically mark the complementary strand of the first-strand cDNA (Figure 13) [20].

Figure 13. Strand-specific RNA sequencing.

The adapter sequences at the fragment ends serve as attachments to the sequencing platform. These adapters are added directly or via specially designed primers during cDNA synthesis or amplification. In addition to the adapters, barcode or index sequences may be incorporated by PCR for sequencing of multiple samples at the same time (multiplexing). When the available starting amounts of RNA are low, PCR helps to generate adequate amounts of input cDNA for sequencing. (Learn more about PCR in sequencing)
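Multiplexing relies on sorting reads back to their samples by barcode after sequencing. The following is a simplified sketch that assumes the barcode sits at the start of each read and must match exactly; the barcodes, sample names, reads, and function name are hypothetical. On many platforms the index is instead reported in a separate index read, and production tools usually tolerate mismatches.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_length):
    """Assign each read to a sample by an exact match on its leading barcode.

    Returns a dict mapping sample name to the list of reads with the
    barcode trimmed off. Unrecognized barcodes go to "undetermined".
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)

# Hypothetical 6-nt barcodes and pooled reads.
barcode_to_sample = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
reads = [
    "ACGTACTTGACCA",
    "TGCATGGGCATTA",
    "ACGTACAAACCGT",
    "NNNNNNGATTACA",
]

result = demultiplex(reads, barcode_to_sample, barcode_length=6)
for sample, inserts in result.items():
    print(sample, inserts)
```

Reads whose barcode is not recognized are kept in a separate bin rather than discarded, which makes it easier to spot barcode errors or sample mix-ups.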

For sequencing analysis, the transcriptome data may be assembled using either a genome-guided or de novo strategy, depending upon availability of the reference genome. The genome-guided approach maps the sequencing results to the known genome sequence, whereas the de novo strategy derives results by contig assembly, which requires extensive computing power [21]. (Learn more about shotgun sequencing)

As described in this section, reverse transcription is integral to the workflows of cDNA-based applications. It is important to choose a reverse transcription method and enzyme that are most applicable to your research objectives.

References

  1. Svec D, Andersson D, Pekny M et al. (2013) Direct cell lysis for single-cell gene expression profiling. Front Oncol 3:274.
  2. Okayama H, Berg P (1982) High-efficiency cloning of full-length cDNA. Mol Cell Biol 2(2):161–170.
  3. Gubler U, Hoffman BJ (1983) A simple and very efficient method for generating cDNA libraries. Gene 25(2-3):263–269.
  4. Harbers M (2008) The current status of cDNA cloning. Genomics 91(3):232–242.
  5. Frohman MA, Dush MK, Martin GR (1988) Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc Natl Acad Sci U S A 85(23):8998–9002.
  6. Ohara O, Dorit RL, Gilbert W (1989) One-sided polymerase chain reaction: the amplification of cDNA. Proc Natl Acad Sci U S A 86(15):5673–5677.
  7. Loh EY, Elliott JF, Cwirla S et al. (1989) Polymerase chain reaction with single-sided specificity: analysis of T cell receptor delta chain. Science 243(4888):217–220.
  8. Maruyama K, Sugano S (1994) Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene 138(1-2):171–174.
  9. Schaefer BC (1995) Revolutions in rapid amplification of cDNA ends: new strategies for polymerase chain reaction cloning of full-length cDNA ends. Anal Biochem 227(2):255–273.
  10. Schena M, Shalon D, Davis RW et al. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270(5235):467–470.
  11. Duggan DJ, Bittner M, Chen Y et al. (1999) Expression profiling using cDNA microarrays. Nat Genet 21(1 Suppl):10–14.
  12. Capaldi AP (2010) Analysis of gene function using DNA microarrays. Methods Enzymol 470:3–17.
  13. McGall GH, Christians FC (2002) High-density genechip oligonucleotide probe arrays. Adv Biochem Eng Biotechnol 77:21–42.
  14. Invitrogen Corp. (2003) Microarray target labeling you can trust. (Brochure)
  15. Invitrogen Corp. (2004) Comprehensive solutions for microarray analysis. (Brochure)
  16. Mortazavi A, Williams BA, McCue K et al. (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):621–628.
  17. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10(1):57–63.
  18. van Dijk EL, Jaszczyszyn Y, Thermes C (2014) Library preparation methods for next-generation sequencing: tone down the bias. Exp Cell Res 322(1):12–20.
  19. Hrdlickova R, Toloue M, Tian B (2016) RNA-Seq methods for transcriptome analysis. Wiley Interdiscip Rev RNA. doi: 10.1002/wrna.1364. [Epub ahead of print]
  20. Levin JZ, Yassour M, Adiconis X et al. (2010) Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Methods 7(9):709–715.
  21. Kukurba KR, Montgomery SB (2015) RNA Sequencing and Analysis. Cold Spring Harb Protoc 2015(11):951–969.