Estimates suggest that between 92% and 94% of all human genes undergo alternative splicing, a post-transcriptional process that changes how the genetic code is read and turned into functional proteins. Because alternative splicing increases proteome diversity, it makes sense to include these events in proteomic analyses. Such splicing events occur in both healthy and disease states, and products arising from disruption of the spliceosome often provide valuable biomarker evidence for disease diagnosis and progression.
Each alternative splicing event can give rise to a different protein isoform. The key to picking these up during proteome analysis is correctly assigning each identified peptide to the relevant splice product and site. Zhu and colleagues (2014) have developed SpliceVista, a tool that visualizes peptide identifications in association with splicing events by drawing on mass spectrometry (MS) proteomics data (1). The authors describe the process as a “splice variant-centered interrogation of shotgun proteomics data” and have made their tool publicly available.
In building SpliceVista, the team drew on its earlier work on another algorithm, PQPQ (protein quantification and peptide quality control), which detects protein variants from shotgun proteomics MS data. PQPQ does this by clustering peptide spectrum matches (PSMs) according to their quantitative patterns: peptides from differentially expressed protein isoforms form distinct clusters, which can then be analyzed separately.
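To make the clustering idea concrete, here is a minimal sketch in Python 3 of grouping the peptides of one protein by the similarity of their quantitative profiles. It is not PQPQ's actual implementation: the function names, the greedy grouping strategy, the 0.8 correlation cut-off and the toy profiles are all assumptions chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no pattern information
    return cov / (sx * sy)

def group_by_pattern(profiles, min_corr=0.8):
    """Greedily group peptides whose profile correlates with a group's first member.

    `profiles` maps a peptide label to its quantitative values across samples.
    The threshold is an illustrative assumption, not a PQPQ default.
    """
    groups = []
    for peptide, profile in profiles.items():
        for group in groups:
            seed_profile = profiles[group[0]]
            if pearson(profile, seed_profile) >= min_corr:
                group.append(peptide)
                break
        else:
            groups.append([peptide])  # no similar group found, start a new one
    return groups

# Toy data: relative abundance of three peptides across four samples.
profiles = {
    "pep_A": [1.0, 1.2, 1.9, 2.4],
    "pep_B": [0.9, 1.1, 2.0, 2.6],
    "pep_C": [1.0, 0.8, 0.5, 0.3],
}
groups = group_by_pattern(profiles)
print(groups)
```

In this toy example, `pep_A` and `pep_B` rise together while `pep_C` falls, so they end up in separate groups; a pattern like that is the kind of signal that could point to two isoforms of the same gene behaving differently.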
The SpliceVista tool combines information from existing alternative splicing databases (EVDB and ECgene) to map the peptides identified during MS-based proteomics experiments to splice variants. The algorithm, written in Python 2.7.2, identifies splice variant-specific peptides (SVSPs), that is, peptides unique to a single variant. These can then be quantified to reveal changes in alternative splicing between experimental conditions, for example in disease or in response to treatment.
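The notion of a splice variant-specific peptide can be illustrated with a short Python 3 sketch. This is not SpliceVista's code: the function name `find_svsps`, the variant identifiers and the sequences are hypothetical, and the sketch simply treats a peptide as variant-specific when it occurs in exactly one of the supplied variant sequences.

```python
def find_svsps(peptides, variants):
    """Return {peptide: variant_id} for peptides found in exactly one variant.

    `variants` maps a variant identifier to its protein sequence.
    """
    svsps = {}
    for peptide in peptides:
        hits = [vid for vid, seq in variants.items() if peptide in seq]
        if len(hits) == 1:
            svsps[peptide] = hits[0]
    return svsps

# Hypothetical gene with two variants; v2 skips the segment "QISFVK".
variants = {
    "GENE1-v1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "GENE1-v2": "MKTAYIAKQRSHFSRQLEERLGLIEVQ",
}

# "QRSHFSR" spans the junction that exists only in v2.
peptides = ["QISFVK", "SHFSR", "QLEER", "QRSHFSR"]

svsps = find_svsps(peptides, variants)
print(svsps)
```

Peptides shared by both variants (`SHFSR`, `QLEER`) are dropped, while the peptide from the skipped segment and the junction-spanning peptide each point to a single variant.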
With the tool built, Zhu et al. used it to interrogate two data sets. They generated the first themselves by treating the human epidermoid carcinoma cell line A431 with the EGFR (epidermal growth factor receptor) inhibitor gefitinib. Cells were lysed at 2, 6 and 24 hours post-treatment, and the proteins were digested with trypsin, labeled with 8-plex iTRAQ reagents and analyzed by liquid chromatography–tandem mass spectrometry (LC-MS/MS) on an LTQ Orbitrap Velos hybrid ion trap-Orbitrap mass spectrometer (Thermo Scientific).
The second data set came from a comparison of induced pluripotent stem cells with their parent fibroblasts and embryonic stem cells; it was downloaded from an online repository.
The SpliceVista workflow comprises four stages:
1. data pre-processing to map PSMs to corresponding genes and to group the peptides
2. downloading all known splice variants for each gene
3. mapping PSMs to their transcriptional positions within the gene
4. visualizing the data, allowing cluster analysis by PQPQ for quantitation
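As an illustration of the mapping stage, the following Python 3 sketch locates a peptide within a variant's protein sequence, converts that location to coding-sequence nucleotide coordinates, and reports which exons it overlaps. The coordinate convention (0-based, end-exclusive), the function names and the exon boundaries are assumptions made for this example and are not taken from SpliceVista.

```python
def peptide_cds_span(peptide, protein):
    """Return the 0-based, end-exclusive CDS nucleotide span of a peptide, or None."""
    start_aa = protein.find(peptide)
    if start_aa == -1:
        return None
    # Each amino acid corresponds to three nucleotides of coding sequence.
    return start_aa * 3, (start_aa + len(peptide)) * 3

def exons_covered(span, exon_bounds):
    """List the exons overlapping a CDS span.

    `exon_bounds` maps an exon name to its (start, end) in CDS coordinates.
    """
    start, end = span
    return [name for name, (e_start, e_end) in exon_bounds.items()
            if start < e_end and e_start < end]

# Hypothetical 21-residue variant encoded by three exons (63 nucleotides).
protein = "MKTAYIAKQRQISFVKSHFSR"
exon_bounds = {"exon1": (0, 30), "exon2": (30, 48), "exon3": (48, 63)}

span = peptide_cds_span("QISFVK", protein)
covered = exons_covered(span, exon_bounds)
junction = exons_covered(peptide_cds_span("QRQISFVK", protein), exon_bounds)
print(span, covered, junction)
```

A peptide that overlaps two exons, such as the second one here, crosses an exon-exon junction and is therefore particularly informative about which splice variant produced it.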
Working first in silico with all known human protein isoforms, Zhu et al. mapped theoretical tryptic and Lys-C digests to their splice variants. Then, using the experimental data sets from the cell culture studies, the researchers were able to verify that SpliceVista correctly quantified alternative splice products arising from the different treatments or conditions.
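An in silico digest of this kind can be sketched in a few lines of Python 3. The version below uses simplified, commonly quoted cleavage rules (trypsin cuts after K or R unless the next residue is P; Lys-C cuts after K) and ignores missed cleavages. The function names and the example sequence are hypothetical, and the authors' own digestion settings may differ.

```python
def cleavage_sites(protein, enzyme):
    """Return the indices at which the protein is cut (simplified rules)."""
    sites = []
    for i in range(len(protein) - 1):
        residue, next_residue = protein[i], protein[i + 1]
        if enzyme == "trypsin":
            # Cut after K or R, but not when followed by proline.
            if residue in "KR" and next_residue != "P":
                sites.append(i + 1)
        elif enzyme == "lys-c":
            # Cut after every K.
            if residue == "K":
                sites.append(i + 1)
    return sites

def digest(protein, enzyme="trypsin"):
    """Split a protein sequence into fully cleaved peptides."""
    if enzyme not in ("trypsin", "lys-c"):
        raise ValueError("unsupported enzyme: " + enzyme)
    bounds = [0] + cleavage_sites(protein, enzyme) + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]

protein = "MKTAYIAKQRQISFVKSHFSR"  # hypothetical sequence
tryptic = digest(protein, "trypsin")
lysc = digest(protein, "lys-c")
print(tryptic)
print(lysc)
```

Because Lys-C leaves arginine sites uncut, it yields longer peptides than trypsin from the same sequence, so the two digests can cover different stretches of a variant, including different splice junctions.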
SpliceVista reported 607 splice variants and 1,680 SVSPs in the A431 experiment. Further examination and comparison with RNA-seq data showed that SpliceVista correctly identified differential expression of certain splice variants. Importantly, SpliceVista also uncovered novel splice variants (157 novel peptides in the A431 study), which the team notes will need further validation to confirm their identities.
The researchers also demonstrated that SpliceVista works well on existing data by downloading the stem-cell data set from ProteomeXchange and identifying differentially regulated splice variants.
In conclusion, Zhu et al. believe that SpliceVista will be useful in detecting low-abundance splicing events, enabling investigation of how alternative splicing is regulated. Furthermore, they expect it to be a valuable tool in disease-specific biomarker discovery.
Reference
1. Zhu, Y., et al. (2014, June) “SpliceVista, a tool for splice variant identification and visualization in shotgun proteomics data,” Molecular and Cellular Proteomics, 13 (pp. 1552–62), doi: 10.1074/mcp.M113.031203.
Post Author: Amanda Maxwell. Mixed media artist; blogger and social media communicator; clinical scientist and writer.
A digital space explorer, engaging readers by translating complex theories and subjects creatively into everyday language.