Since it was first sequenced, the human genome has led researchers deeper and deeper into the genetics behind the pathogenesis of many diseases. With faster and more accurate instruments, it is now possible to establish risk factors and disease severity for individual patients. The data also allow onward translation into personalized or precision medicine, with drugs selected for maximum success and minimal side effects. However, around 5% of the genome is described as “dark” and remains mostly unknown in terms of function, sequence and importance.
Most of this “dark” DNA comprises repetitive sequences; this should be unsurprising, since repetitive sequences make up around 50% of the total human genome. At first, researchers dismissed these as filler or meaningless DNA, but now research shows that this genomic babble might exert strong influences on gene transcription and disease development.
Short tandem repeats (STRs) are one category of repetitive DNA. They consist of motifs of one to six nucleotides repeated over long stretches of genomic DNA. STRs are highly abundant throughout the genome and often occur in the regulatory elements controlling genes. Although STRs have been overlooked in the past, their importance for clinical research is starting to emerge. Clinical researchers are now finding that STRs could hold clues not only to the development of disease, but also to its severity. Furthermore, there is tissue-specific variation in STR content, with high abundance recorded in the central nervous system.
Research has shown that variation in STR repeat length is multiallelic and a greater source of genetic variation than single nucleotide polymorphisms (SNPs). This suggests that STR repeat lengths might control functionality and influence gene expression, and therefore also potentially have a role in disease. Research has already found that variation in the length of DNA over which the STRs occur influences RNA splicing and epigenetic modifications such as methylation. The latter is now associated with Fragile X syndrome, where variation in the number of CGG repeats in the FMR1 gene is directly associated with the emergence of cognitive and neurological signs. Moreover, the number of CGG repeats also matters in premutation analysis, which indicates carrier status.
Establishing STR repeat number and penetrance is therefore emerging as a valuable clinical research tool for investigating heritable neurological disease.
Developing Research Assays for Dark DNA: Not as Simple as You Think
In principle, a researcher should be able to run sequencing assays that enumerate the STRs of interest. In reality, this approach is problematic. Because of both the length of the STRs and the importance of establishing exact copy numbers, amplifying this dark DNA is not as straightforward as it sounds. To be reliable biomarkers of inherited disease, STRs must be sequenced and amplified accurately, since copy number may influence future clinical decision making. Maintaining fragment analysis integrity and obtaining high-resolution data readouts require a system that reliably amplifies these portions of the human genome.
This is where Asuragen’s AmplideX technology comes in, and where the SeqStudio Genetic Analyzer takes researchers a step further into the secrets held by dark DNA. Asuragen is a biotechnology company based in Austin, Texas that is driving the evolution of precision medicine in inherited disorders. They have developed research and diagnostic assays to characterize dark DNA, raising its potential for clinical research and helping to translate these advances into diagnostic technology.
SeqStudio Sheds Light on Dark DNA
Asuragen’s AmplideX assays provide custom digestion enzymes and PCR primers that amplify the dark DNA regions of interest. Submitting the products to capillary electrophoresis (CE), such as on the SeqStudio Genetic Analyzer, removes the need for Southern blotting and delivers data to researchers within hours rather than days.
However, for this to happen, the digestion and amplification steps must be optimized. Since this research relies on quantitative statements, it’s not enough for the primers to merely identify relevant sequences; amplification must faithfully capture the entire run of STR repetitions to be relevant for research. For this, Latham notes that the key is to pair PCR with high-performance analysis, using CE to generate definitive results. Moreover, fine-tuning assay run conditions strengthens true signal while reducing false results, maximizing the information derived from each sample: no repeat left behind.
SeqStudio offered Latham’s team the opportunity to set custom conditions within experimental runs so they could reliably amplify longer products and then discriminate differences highlighted by the data with confidence.
Since SeqStudio is a high-resolution platform, Latham found the subtle differences in signal intensities enabled discrimination within the data. Data readouts from color channels and size peaks meant that the research team could determine accurate repeat numbers and pick up mutations. Running the optimization assays on the SeqStudio platform allows researchers to specify run conditions that teams can apply on other platforms for wider research use, in keeping with Asuragen’s mission.
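As a rough illustration of how a size peak can translate into a repeat number, the sketch below subtracts the non-repeat flanking sequence from the measured fragment length and divides by the motif length. This is a minimal sketch under stated assumptions, not Asuragen's method: the function name, the 200 bp flank and the example peak sizes are all hypothetical, and the calculation ignores any mobility correction or calibration a real assay might need.

```python
def repeats_from_fragment_size(fragment_bp: float, flank_bp: int, motif_bp: int = 3) -> int:
    """Estimate an STR repeat count from a CE peak size.

    fragment_bp: peak size reported by fragment analysis, in base pairs.
    flank_bp:    combined length of non-repeat sequence in the amplicon
                 (hypothetical value; depends entirely on primer design).
    motif_bp:    length of the repeated motif (3 for a CGG repeat).
    """
    if motif_bp <= 0:
        raise ValueError("motif_bp must be positive")
    repeat_region_bp = fragment_bp - flank_bp
    if repeat_region_bp < 0:
        raise ValueError("fragment is shorter than the flanking sequence")
    # Round to the nearest whole repeat, since measured sizes are rarely exact.
    return round(repeat_region_bp / motif_bp)


# Hypothetical example: a 290 bp peak from an amplicon with 200 bp of flank.
example_repeats = repeats_from_fragment_size(290, flank_bp=200)
```

With these illustrative numbers, 90 bp of repeat region divided by a 3 bp motif gives 30 repeats.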
Data from SeqStudio runs have been excellent; Latham describes the performance as “consistent generation of high-definition signals and well behaved across multiple channels.” This means the team is able to differentiate single-base polymorphisms, resolve heterozygous from homozygous alleles, and gain valuable insight into methylation patterns.
For example, studies on the CGG repeats in Fragile X syndrome, often starting from as little as 40 ng of genomic DNA, yielded repeat numbers indicative of carrier status (between 50 and 200 repeats). The data also showed pathognomonic repeat levels (more than 200).
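The repeat ranges quoted above can be expressed as a simple lookup. The following is a minimal sketch that uses only the thresholds stated in this article (50 to 200 repeats for carrier status, more than 200 as pathognomonic); the function name, the category labels and the choice to treat 50 and 200 as inclusive are assumptions made for illustration, and the code is not a diagnostic rule.

```python
def classify_cgg_repeats(repeat_count: int) -> str:
    """Place a CGG repeat count into the ranges quoted in the article.

    Thresholds come from the text; boundary handling (50 and 200 inclusive
    in the carrier range) is an assumption of this sketch.
    """
    if repeat_count < 0:
        raise ValueError("repeat_count cannot be negative")
    if repeat_count > 200:
        return "pathognomonic range"
    if repeat_count >= 50:
        return "carrier range"
    return "below carrier range"


# Illustrative repeat counts, not real sample data.
labels = [classify_cgg_repeats(n) for n in (30, 90, 250)]
```

For research use, a lookup like this would sit downstream of whatever sizing and quality checks the assay itself requires.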
In other studies, Asuragen researchers used SeqStudio CE to examine allele-specific methylation and to analyze expansions of repetitive elements, including poly-T repeats, reliably quantifying them into the 1,000 bp range.
A Bright Future
These hidden elements within the human genome are increasingly discoverable via technology such as the SeqStudio Genetic Analyzer, raising their potential for clinical research and eventual translation into diagnostic technology.
Latham sees a bright future for dark DNA and the SeqStudio Genetic Analyzer. With its tunable run resolution and sensitivity, clinical researchers can optimize assay conditions for each new STR analyte. From here, investigating the heritability of neurological disease becomes faster and more efficient; Latham notes a one-hour run time to results, with the ability to increase efficiency through multiplexing on the instrument. In the near future, the versatility of SeqStudio CE will help clinical researchers generate a wider range of genomic biomarkers for heritable neurological disease research.
Learn how SeqStudio Sanger sequencing contributes to Asuragen research into dark DNA and missing heritability in this webinar, “Enabling Neurological Disease Research via DNA Fragment Analysis on the SeqStudio Genetic Analyzer,” or download the white paper.
For research use only. Not for use in diagnostic procedures.