High-throughput Digital Gene Expression Analysis

  • SOLiD® SAGE™ Kit with Barcoding Adaptor Module
  • Less data required for analysis by focusing on the 3’ ends of transcripts
  • Greater mappability using the included SOLiD® SAGE™ software package
  • Compatible with SOLiD® RNA Barcodes for multiplexing capabilities

Digital profiling using the SOLiD® SAGE™ System

The SOLiD® SAGE™ Kit with Barcoding Adaptor Module provides a complete solution for digital gene expression profiling on the SOLiD® System. Instead of sequencing an entire transcriptome, the SOLiD® SAGE™ Kit generates 3’ expression tags, averaging 27 base pairs in length, that enable digital enumeration of gene expression profiles on the SOLiD® 4 System. Capturing only the 3’ end of transcripts allows cost-effective, highly sensitive detection of lowly expressed transcripts, because SAGE™ tags represent a majority of RefSeq genes (Figure 1). For an even more cost-effective sequencing solution, libraries from the SOLiD® SAGE™ Kit with Barcoding Adaptor Module can also be molecularly barcoded using the included barcoding adaptor module and a SOLiD® RNA Barcoding Module (sold separately).

Figure 1. The majority of RefSeq genes in a sample are represented with 2–5 million mapped SOLiD® SAGE™ tags. While the number of RefSeq genes covered continues to increase with the number of mapped reads, the incremental increase in genes covered diminishes.
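
The saturation behavior described in Figure 1 can be explored on any set of mapped tags by subsampling. The following is a minimal sketch, not part of the SOLiD® SAGE™ software; the function name `genes_detected` and the toy input are hypothetical, and the input is assumed to be a list with one gene identifier per mapped tag.

```python
import random


def genes_detected(tag_genes, depths, seed=0):
    """Count distinct genes observed at each subsampling depth.

    tag_genes: one gene identifier per mapped tag (assumed input format).
    depths:    numbers of mapped tags to subsample.
    """
    rng = random.Random(seed)
    shuffled = list(tag_genes)
    rng.shuffle(shuffled)  # random order stands in for random subsampling
    return [len(set(shuffled[:depth])) for depth in depths]


# Toy data (invented for illustration): a few abundant genes and some rare ones.
toy_tags = ["geneA"] * 50 + ["geneB"] * 30 + ["geneC"] * 15 + ["geneD"] * 5
curve = genes_detected(toy_tags, depths=[10, 50, 100])
print(curve)
```

Because each depth reuses a prefix of the same shuffled list, the curve never decreases, and rare genes tend to appear only at higher depths.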

SOLiD® SAGE™ 27 bp tag library generation

The SOLiD® SAGE™ Kit has been optimized to construct libraries with SAGE™ tags and SOLiD®-specific adapters that can go directly into pre-amplification and/or emulsion PCR. The kit includes reagents that generate longer tags than previous SAGE™ methods (27 bp vs. 21 bp). In addition, incorporating the recommended protocol improvements allows for expression profiling with or without large-scale PCR amplification (Figure 2).

After sequencing, the tags are mapped to a reference sequence database. Counting the tags that map to each unique transcript gives a digital enumeration of that transcript in the original sample, and differential expression is determined by comparing these counts between samples.
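
As an illustration of this counting step, the sketch below tallies tags per transcript, scales the counts to tags per million, and compares two samples as log2 ratios. It is a simplified example under stated assumptions and not the method used by the SOLiD® SAGE™ software: the function names, the tags-per-million normalization, and the pseudocount are all choices made here for illustration.

```python
import math
from collections import Counter


def tags_per_million(mapped_transcripts):
    """Count tags per transcript and scale to tags per million mapped tags.

    mapped_transcripts: one transcript identifier per mapped tag (assumed format).
    """
    counts = Counter(mapped_transcripts)
    total = sum(counts.values())
    return {t: c * 1_000_000 / total for t, c in counts.items()}


def log2_ratio(tpm_a, tpm_b, pseudocount=1.0):
    """log2(sample A / sample B) per transcript.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a transcript has no tags in one sample.
    """
    transcripts = set(tpm_a) | set(tpm_b)
    return {
        t: math.log2((tpm_a.get(t, 0.0) + pseudocount)
                     / (tpm_b.get(t, 0.0) + pseudocount))
        for t in transcripts
    }


# Toy input, invented for illustration only.
sample_a = ["tx1"] * 3 + ["tx2"]
sample_b = ["tx1"] + ["tx2"] * 3
ratios = log2_ratio(tags_per_million(sample_a), tags_per_million(sample_b))
print(ratios)
```

A positive value means the transcript has relatively more tags in sample A, a negative value relatively more in sample B.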

Figure 2. The SOLiD® SAGE™ library construction workflow. Bind purified total RNA to beads, which capture poly(A) RNA tails. Synthesize double-stranded cDNA in one tube using SuperScript® III Reverse Transcriptase and E. coli DNA polymerase. Digest the double-stranded DNA using a sequence-specific endonuclease and ligate Adapter A. If multiplexing is desired, ligate barcoded Adapter A instead, using the included barcoding adaptor module and SOLiD® RNA barcodes. After ligation, digest the DNA with a restriction endonuclease that serves as the tagging enzyme, then ligate Adapter B. Purify and PCR amplify if necessary, then proceed to emulsion PCR and SOLiD® sequencing, followed by analysis using SOLiD® SAGE™ software.
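
To see which tag a given transcript would be expected to yield, the workflow can be mimicked in silico. The sketch below makes several assumptions that the text above does not state: that the sequence-specific endonuclease recognizes CATG (as NlaIII does in classic SAGE protocols), that the 3’-most site is the one retained on the bead, and that the tag is the 27 bases immediately downstream of that site. The function name `extract_sage_tag` is hypothetical.

```python
def extract_sage_tag(transcript, site="CATG", tag_len=27):
    """Return the expected tag for a transcript, or None if there is none.

    Assumptions (not taken from the kit documentation):
      - site is the recognition sequence of the first endonuclease;
      - the 3'-most occurrence of the site defines the tag;
      - the tag is the tag_len bases directly downstream of the site.
    """
    seq = transcript.upper()
    pos = seq.rfind(site)  # 3'-most occurrence of the site
    if pos == -1:
        return None  # no recognition site, so no tag
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    # Too little sequence downstream of the site means no full-length tag.
    return tag if len(tag) == tag_len else None


# Toy transcript, invented for illustration only.
toy_transcript = "AAACATGCCC" + "CATG" + "ACGT" * 7 + "TTTT"
print(extract_sage_tag(toy_transcript))
```

Applying a function like this to every reference transcript is one way to build a virtual tag table against which sequenced tags can be compared.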

Sensitive gene expression using the SOLiD® SAGE™ Kit

Expression differences between the reference RNAs HBRR and UHRR were evaluated on a subset of 25 diagnostic genes using real-time RT-PCR, bead microarrays, and the SOLiD® SAGE™ Kit. Higher correlation was observed between real-time RT-PCR and the SOLiD® SAGE™ Kit (R = 0.956) than between real-time RT-PCR and the microarray platform (R = 0.916) (Figure 3A, 3B), validating the accuracy of expression profiles from the SOLiD® SAGE™ Kit. Furthermore, the SOLiD® SAGE™ Kit demonstrated a higher Pearson correlation at low expression levels.
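
For readers who want to run the same kind of comparison on their own data, the Pearson correlation between two sets of expression ratios can be computed as sketched below. The numbers are toy values invented for illustration and are not the data behind Figure 3; comparing ratios on a log2 scale is also an assumption of this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy log2(HBRR/UHRR) ratios per gene (hypothetical values, not measured data).
rtpcr_log2 = [-2.0, -0.5, 0.0, 1.0, 3.0]
sage_log2 = [-1.8, -0.7, 0.2, 1.1, 2.7]
print(pearson_r(rtpcr_log2, sage_log2))
```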

Figure 3. Differential expression results obtained using the SOLiD® SAGE™ System correlate with results from real-time RT-PCR analysis significantly better than results from bead array analysis. Sixteen independent barcoded SOLiD® SAGE™ libraries were generated: eight from Ambion® FirstChoice® Human Brain Reference RNA (HBRR) using SOLiD® RNA Barcodes 1–8, and eight from Universal Human Reference RNA (UHRR) (Stratagene) using SOLiD® RNA Barcodes 9–16. The 16 libraries were then pooled prior to emulsion PCR and run on a single slide on the SOLiD® 4 System. Two of these libraries were analyzed for a set of 25 genes to compare differential expression between the two samples [HBRR/UHRR]. (A) The HBRR sample was barcoded with barcode #6 and the UHRR sample with barcode #16. The differential expression ratios of the 25-gene set between the two samples were compared to the ratios obtained by real-time RT-PCR. (B) Similarly, because the HBRR and UHRR RNAs were used in the MicroArray Quality Control (MAQC) consortium study, the differential expression ratios for the gene set found in that study using the micro-bead array (Illumina) were compared to the real-time RT-PCR results.

Sensitive Mappability

Massively parallel sequencing using the SOLiD® 4 System yields 2–4 million RefSeq-mapped tags per SOLiD® SAGE™ Kit reaction on 1/8 of a sequencing slide. The system generates data that are highly reproducible and correlate very well with real-time RT-PCR results from the same sample. The percentage of tags mapped to RefSeq using the new SAGE™ Software v1.10 (with genomic mapping capability) on the SOLiD® 4 System is on average four-fold greater than for samples analyzed with the previous v1.06 SAGE™ mapping method (Figure 4).

Figure 4. Comparison of library mapping to RefSeq using the new SAGE™ Software v1.10 (with genomic mapping capability) on the SOLiD® 4 System versus the previous v1.06 SAGE™ mapping method. Eight replicate libraries were generated and barcoded [BC1–BC8] from Ambion® brain HBRR total RNA. Eight replicate libraries were also generated and barcoded [BC9–BC16] from Universal total RNA (Stratagene). The sixteen samples were then pooled prior to emulsion PCR and run on a single slide on the SOLiD® 4 System. The samples were mapped to RefSeq using the old v1.06 software method, which uses the 27_0 parameter, and using the new v1.10 software method with the 22_1 mapping parameter.
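
The mapping parameters are not explained further here. One plausible reading, assumed in the sketch below, is that they follow a length_mismatches convention: 27_0 would require all 27 bases to match exactly, while 22_1 would compare only the first 22 bases and tolerate one mismatch. Under that assumption, the toy example shows why a more tolerant setting can map tags that an exact full-length match rejects. The function names are hypothetical, and the comparison is done on plain base sequences for simplicity; this is not the SAGE™ Software implementation.

```python
def parse_schema(param):
    """Split a 'length_mismatches' string such as '22_1' into two integers.

    The meaning of the two fields is an assumption of this sketch.
    """
    length, mismatches = param.split("_")
    return int(length), int(mismatches)


def tag_matches(tag, reference_tag, param):
    """Check whether a tag matches a reference tag under the given schema."""
    length, max_mismatches = parse_schema(param)
    if len(tag) < length or len(reference_tag) < length:
        return False
    # Count mismatching positions within the first `length` bases only.
    mismatches = sum(a != b for a, b in zip(tag[:length], reference_tag[:length]))
    return mismatches <= max_mismatches


# Toy tags, invented for illustration: one error near the 3' end of the read.
reference = "ACGT" * 6 + "ACG"                 # 27 bases
read = reference[:24] + "T" + reference[25:]   # mismatch at position 25

print(tag_matches(read, reference, "27_0"))  # exact 27-base match required
print(tag_matches(read, reference, "22_1"))  # first 22 bases, 1 mismatch allowed
```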