A recent study by researchers at The Centre for Applied Genomics and The Hospital for Sick Children in Toronto, Ontario, describes the use of exome sequencing on archived samples to identify a genetic defect common to a group of unrelated subjects with a progressive form of combined immunodeficiency.
Genetic disease results from variation in the nuclear or mitochondrial genome. Changes that may occur include single-nucleotide variants (SNVs), small insertions or deletions (indels), and structural variants, which alter the number and/or structure of chromosomes and include copy number variants (CNVs), in which large regions (>1 kilobase (kb)) of the genome are duplicated (gains) or deleted (losses). These variants are either inherited through the germline as preexisting variants or generated de novo prior to conception, during formation of egg or sperm cells. Post-fertilization de novo mutation events may also occur during the early stages of embryogenesis and lead to genetic mosaicism in tissues and organs. Evidence is emerging, primarily from exome sequencing studies, that de novo mutations might explain the heritability of complex genetic disease.
Samples from a previously diagnosed group of five subjects were selected for exome sequencing. These subjects displayed a common pattern of symptoms: a collection of autoimmune features, susceptibility to infections, and a progressive loss of T cell and B cell function, indicating combined immunodeficiency.
Exome target capture was performed using the Ion TargetSeq™ Exome Kit, a solution-phase DNA probe capture technology for highly specific enrichment of exons and other targeted regions within the human genome. Sequencing was performed using the Ion Proton™ System with the Ion PI™ Chip. An important consideration for variant discovery is coverage, i.e., the average number of reads that align to each position in the reference. Coverage typically translates into confidence in variant calling, with greater coverage increasing the confidence in putative variants. Targeting efficiency was assessed by determining the coverage for all targeted bases in one exome library, which gave an average read depth of 121x, with 93.8% of the target bases covered at ≥20x.
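The two coverage figures quoted above (average read depth and the percentage of target bases at ≥20x) can be illustrated with a minimal Python sketch. It assumes per-base depths are already available as a list of integers; the function name `coverage_summary` and the example depths are hypothetical and are not taken from the study or from any Ion Torrent software.

```python
from statistics import mean


def coverage_summary(depths, min_depth=20):
    """Return (mean depth, fraction of bases with depth >= min_depth).

    `depths` holds one read-depth value per targeted base.
    """
    if not depths:
        raise ValueError("depths must contain at least one targeted base")
    covered = sum(1 for d in depths if d >= min_depth)
    return mean(depths), covered / len(depths)


# Hypothetical per-base depths for a tiny 8-base target (illustrative only).
example_depths = [5, 18, 20, 35, 120, 150, 200, 260]
avg_depth, frac_20x = coverage_summary(example_depths)
print(f"mean depth: {avg_depth:.1f}x, bases at >=20x: {frac_20x:.1%}")
```

In practice the per-base depths would come from the aligned reads restricted to the capture targets; the summary logic stays the same regardless of how they are obtained.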
For the above library, 27,166 SNVs and 10,175 indels were identified. It is critical that variants of interest are identified and prioritized, while common variants and non-deleterious variants (those predicted to have no effect on protein function or expression) are reliably filtered from exome sequencing experiments. To remove potential false positives, variants supported by <10 reads were removed, and filters were applied to remove synonymous and noncoding variants. The remaining variants were then prioritized by predicted functional impact, retaining variants with […] 0.5 and SIFT of […], which identified a […] A substitution (p.T358M) in the STAT1 gene.
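The read-support and consequence filters described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, the consequence labels, the placeholder gene names, and all read depths (including the one shown for STAT1) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    read_depth: int   # number of reads supporting the call (hypothetical field)
    consequence: str  # hypothetical labels: "missense", "synonymous", "noncoding"


# Consequence classes the text says were filtered out.
EXCLUDED_CONSEQUENCES = {"synonymous", "noncoding"}


def filter_variants(variants, min_reads=10):
    """Keep variants with at least `min_reads` reads and a retained consequence."""
    return [
        v for v in variants
        if v.read_depth >= min_reads and v.consequence not in EXCLUDED_CONSEQUENCES
    ]


# Illustrative calls only; depths and placeholder genes are invented.
calls = [
    Variant("STAT1", "p.T358M", 87, "missense"),
    Variant("GENE_A", "variant_1", 6, "missense"),      # dropped: <10 reads
    Variant("GENE_B", "variant_2", 140, "synonymous"),  # dropped: synonymous
    Variant("GENE_C", "variant_3", 52, "noncoding"),    # dropped: noncoding
]
kept = filter_variants(calls)
print([f"{v.gene} {v.change}" for v in kept])
```

Applying the depth filter before any functional scoring keeps poorly supported calls from consuming downstream review effort.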
The p.T358M variant occurs in the DNA binding domain of STAT1 and was identified in three of the five samples. A single sample possessed a p.I294T variant in STAT1 that also occurs in the DNA binding domain, and the remaining sample had a p.I294T variant in STAT1 located in the C-terminal end of the adjacent coiled-coil domain. All of the STAT1 variants identified were de novo heterozygous variants, present in the affected proband but not in either parent. The protein encoded by the STAT1 gene is a member of the STAT (Signal Transducer and Activator of Transcription) family, a group of transcription factors that play a key role in the cytokine and growth factor signaling required for a normal response to viral, bacterial, and fungal infections.
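The de novo heterozygous criterion described above (one variant allele in the proband, none in either parent) can be expressed as a simple trio check. The sketch below assumes VCF-style genotype strings such as "0/1", where "0" is the reference allele and "1" the variant allele; the function name is hypothetical, and a real analysis would also consider read depth and genotype quality in the parents.

```python
def is_de_novo_heterozygous(proband_gt, mother_gt, father_gt):
    """Return True if the proband is heterozygous and both parents are homozygous reference.

    Genotypes are VCF-style strings ("0/1", "0|1", "0/0"); missing calls
    such as "./." are treated conservatively and yield False.
    """
    def alleles(gt):
        # Accept both unphased ("/") and phased ("|") separators.
        return gt.replace("|", "/").split("/")

    proband_is_het = sorted(alleles(proband_gt)) == ["0", "1"]
    parents_are_ref = all(a == "0" for a in alleles(mother_gt) + alleles(father_gt))
    return proband_is_het and parents_are_ref


# Hypothetical trio genotypes at one site.
print(is_de_novo_heterozygous("0/1", "0/0", "0/0"))  # proband-only variant
print(is_de_novo_heterozygous("0/1", "0/1", "0/0"))  # inherited from the mother
```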
For further details, watch Christian Marshall describe the use of exome sequencing at The Hospital for Sick Children (Toronto, Ontario, Canada), or read the case study.