With rapid advances in technology, computational power is vital for interpreting the mass spectrometry (MS) data arising from proteomics research. This subject was recently discussed in a Biochimica et Biophysica Acta special edition, “Computational Proteomics in the Post-Identification Era.” In their expert review, Perez-Riverol and co-authors (2014) discuss the importance of optimal bioinformatics software selection and availability from the developer’s perspective (1).
The review focuses on existing software libraries, open source frameworks and free applications available for integration into the high-throughput MS workflow. Researchers identify peptides and proteins using sequence database searches, de novo sequencing strategies and spectral library searches, each requiring dedicated software and algorithms. The authors comment that because identification is a common step in the workflow, the number of available software tools has grown, with Mascot, Andromeda and Sequest among the examples.
After describing the typical tandem MS (MS/MS) workflow and the variations in approach, the authors comment that proteomics approaches frequently require quantification as well as peptide and protein identification. Researchers can also discriminate among different states of a protein, measuring the abundance of protein isoforms with varying post-translational modifications and other anomalies. This ability is especially important in biomarker discovery, particularly for cancer and for monitoring its progression.
The review’s authors comment on the most widely used open source libraries and frameworks, including OpenMS, the Trans-Proteomic Pipeline and CompOmics, among others, describing the strengths of each and where it is best applied.
By delineating the analytical processes common to almost every MS analysis, Perez-Riverol and co-authors present the highlights of open source packages available at each step, commenting on programs available for:
- In silico database interrogation for protein and peptide sequences
- Data file conversion
- MS pre-processing
- Peptide and protein identification, including post-processing analysis
- Data storage, including public repositories
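To make the first and fourth steps in the list above more concrete, here is a minimal Python sketch of in silico digestion followed by a precursor mass match. It is a toy illustration only, not the method used by any of the packages named in the review. The function names (`tryptic_digest`, `peptide_mass`, `match_precursor`), the 10 ppm tolerance, the minimum peptide length and the example sequence are all assumptions chosen for illustration. Real search engines also handle missed cleavages, modifications, charge states and fragment ion scoring.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the peptide termini


def tryptic_digest(sequence, min_length=6):
    """Cleave after K or R unless followed by P (no missed cleavages).

    Hypothetical helper: peptides shorter than min_length are dropped.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_precursor(observed_mass, peptides, tol_ppm=10.0):
    """Return candidate peptides whose mass lies within tol_ppm of observed_mass."""
    hits = []
    for pep in peptides:
        theoretical = peptide_mass(pep)
        error_ppm = abs(observed_mass - theoretical) / theoretical * 1e6
        if error_ppm <= tol_ppm:
            hits.append(pep)
    return hits


if __name__ == "__main__":
    # Assumed example sequence, used only to exercise the functions.
    protein = "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFKDLGEEHFK"
    candidates = tryptic_digest(protein)
    print(candidates)
    observed = peptide_mass("SEIAHR") + 0.001  # simulated measurement
    print(match_precursor(observed, candidates))
```

Matching on precursor mass alone only narrows the candidate list. In practice the MS/MS fragment spectrum is then scored against each candidate, which is where the dedicated search algorithms differ.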
Following this logical flow, the authors review and contrast the open source software libraries available. They comment on suitability for certain experimental situations and include information on requirements for targeted selected reaction monitoring (SRM) analysis. Software suites discussed include MSQuant and ProteoSuite, among others.
Perez-Riverol and colleagues conclude that open source frameworks and libraries enable the development and growth of new tools for MS analysis. They believe that because these tools simplify the basic functions the workflow needs, researchers are free to focus on new ways of conducting analysis and on experimental methodology. They do warn, however, that although open source allows flexibility and quicker innovation, documentation may be incomplete, which can hinder progress.
1. Perez-Riverol, Y., et al. (2014) “Open source libraries and frameworks for mass spectrometry based proteomics: A developer’s perspective,” Biochimica et Biophysica Acta, 1844 (pp. 63–76).
Post Author: Amanda Maxwell. Mixed media artist; blogger and social media communicator; clinical scientist and writer.
A digital space explorer, engaging readers by translating complex theories and subjects creatively into everyday language.