An article recently published in Molecular Cell describes how far proteomics has come in achieving complete characterization. The article defines complete characterization as “identification and quantification of at least one protein per genetic locus.” This goal was once thought to be too difficult; however, advances in large-scale tandem mass spectrometry (MS) have now made complete proteome characterization possible.1
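The definition above can be read as a simple coverage measure: the fraction of genetic loci for which at least one protein has been identified and quantified. The following is a minimal sketch of that calculation, not a method taken from the cited article. The function name, locus names, and protein names are hypothetical and serve only as an illustration.

```python
from typing import Dict, Iterable, Set


def locus_coverage(all_loci: Iterable[str], quantified: Dict[str, Set[str]]) -> float:
    """Return the fraction of loci with at least one identified and quantified protein.

    `quantified` maps a locus name to the set of proteins from that locus
    that were both identified and quantified (an assumed input format).
    """
    loci = set(all_loci)
    if not loci:
        raise ValueError("all_loci must not be empty")
    # A locus counts as covered if its protein set exists and is non-empty.
    covered = {locus for locus in loci if quantified.get(locus)}
    return len(covered) / len(loci)


# Hypothetical toy data: four loci, two of which have quantified proteins.
loci = ["locusA", "locusB", "locusC", "locusD"]
quantified = {
    "locusA": {"protA1", "protA2"},
    "locusB": {"protB1"},
    "locusC": set(),  # detected nothing usable for this locus
}

coverage = locus_coverage(loci, quantified)
print(f"{coverage:.0%} of loci covered")
```

Under this reading, a proteome would count as completely characterized when the coverage reaches 100 percent.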
Shotgun proteomics work began in the early 2000s.2 Since that time, mass spectrometers have become much more sensitive, with increased resolution. Advances in computational proteomics have also produced algorithms for label-free quantification, which make it possible to interpret large data sets efficiently.
In 2008 the yeast proteome was successfully characterized and quantified using a targeted shotgun proteomics approach; however, these experiments required several months to complete.3 In 2012 the yeast proteome was characterized once again. Using a quadrupole Orbitrap mass spectrometer (Thermo Scientific), samples were prepared in one step and the entire process took a little over a day to complete. As a result, 4,000 proteins were identified, which made up a nearly complete proteome.4
The goal now is to carry these principles over to the characterization of mammalian proteomes, particularly to identify and quantify human proteins. Groups such as the Human Proteome Organisation (HUPO) are working chromosome by chromosome and have strategically assigned the chromosomes to different countries.5
One problem impeding progress is the unavailability of proteomic technologies and resources. In a study published by members of HUPO, 27 labs were given an equimolar sample containing 20 human proteins for analysis by MS.6 According to the article, every lab produced MS data of high enough quality to identify all 20 proteins; however, only 7 of the 27 labs correctly identified all of them. Despite starting with good data, discrepancies arose during the analysis of the MS data. The databases used for analysis were found to contain erroneous confidence levels, which caused several of the 20 proteins in the sample to be missed.
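A benchmark of this kind comes down to comparing each lab's reported identifications with the known contents of the sample. The sketch below shows one way such a comparison could be scored. It is an illustration under stated assumptions, not the procedure used in the HUPO study: the function name, lab names, protein identifiers, and the rule that a report is fully correct only when it matches the reference exactly are all hypothetical.

```python
from typing import Dict, List, Set


def score_report(reported: Set[str], reference: Set[str]) -> Dict[str, object]:
    """Compare one lab's reported proteins with the known sample contents."""
    return {
        # Proteins in the sample that the lab did not report.
        "missed": sorted(reference - reported),
        # Proteins the lab reported that are not in the sample.
        "false_hits": sorted(reported - reference),
        # Assumed criterion: exact match with the reference set.
        "all_correct": reported == reference,
    }


# Hypothetical toy data: a three-protein reference sample and three labs.
reference = {"P1", "P2", "P3"}
lab_reports = {
    "lab_1": {"P1", "P2", "P3"},
    "lab_2": {"P1", "P2"},              # misses one protein
    "lab_3": {"P1", "P2", "P3", "P9"},  # reports one protein not in the sample
}

scores = {lab: score_report(report, reference) for lab, report in lab_reports.items()}
fully_correct: List[str] = [lab for lab, s in scores.items() if s["all_correct"]]

for lab, s in scores.items():
    print(lab, "missed:", s["missed"], "false hits:", s["false_hits"])
print("fully correct labs:", fully_correct)
```

Separating missed proteins from false hits makes it easier to see whether a discrepancy comes from overly strict or overly permissive identification criteria during data analysis.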
Now that the technology to produce sensitive MS data is available, complete proteome characterization is expected to improve our understanding of biological processes at a global level. It also has the potential to provide new information about disease states and to support systems biology studies. To reach this potential, more needs to be done to fine-tune the analysis and verification of protein identifications.
1. Mann, M., et al. (2013, Feb 21) “The coming age of complete, accurate, and ubiquitous proteomes,” Molecular Cell, 49(4) (pp. 583–590).
2. Aebersold, R., and Mann, M. (2003) “Mass spectrometry-based proteomics,” Nature, 422(6928) (pp. 198–207).
3. de Godoy, L.M., et al. (2008) “Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast,” Nature, 455(7217) (pp. 1251–1254).
4. Nagaraj, N., et al. (2012) “System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap,” Molecular & Cellular Proteomics, 11(3), M111.013722, doi: 10.1074/mcp.M111.013722.
5. Paik, Y.K., et al. (2012) “The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome,” Nature Biotechnology, 30(3) (pp. 221–223).
6. Bell, A.W., et al. (2009) “A HUPO test sample study reveals common problems in mass spectrometry-based proteomics,” Nature Methods, 6(6) (pp. 423–430).