Echoing the excitement from more than a decade ago, Nature published two papers in its May 2014 edition revealing draft maps of the human proteome. Each paper was the result of years of team effort, with the combined endeavor spanning continents and institutions.
Akhilesh Pandey’s Johns Hopkins University team (first author: Min-Sik Kim) published results from a mass spectrometry-based analysis of 30 histologically normal human samples.1 In addition to demonstrating approximately 84% coverage of the total annotated protein-coding human genes, the researchers also made their results available in an interactive portal so that other researchers and interested Web users can explore the human proteome as it stands so far.2
Kim et al. (2014) pooled samples harvested from three individuals per tissue type and analyzed them by liquid chromatography-tandem mass spectrometry (LC-MS/MS) using high-resolution Orbitrap Elite and LTQ Orbitrap Velos hybrid ion trap-Orbitrap mass spectrometers (Thermo Scientific). They processed the resulting raw spectra with Proteome Discoverer software (Thermo Scientific) before searching them against the NCBI RefSeq human protein database using the MASCOT and SEQUEST search engines. In some cases, the researchers further confirmed protein identities by Western blotting with appropriate antibodies. The team also employed a novel proteogenomic analytical strategy to extend the reach of their protein identification and gene annotation.
Following more than 2,000 LC-MS/MS runs, the researchers identified 293,000 non-redundant peptides from 25 million high-resolution MS/MS spectra. From this mass of data, they were able to conclusively identify proteins arising from 17,294 genes. This represents approximately 84% coverage of the total annotated protein-coding genes in the human genome. Furthermore, the researchers found that they could ascribe proteins to 2,535 genes not previously categorized as coding for proteins.
Kim and colleagues found that almost half of the proteins identified by their methodology had not been deposited in the peptide resources PeptideAtlas and GPMDB. In addition to peptide discovery, the data documented 2,861 protein isoforms from 2,450 genes and identified novel protein-coding regions previously designated as non-coding.
The team also began to document the protein products of 2,350 housekeeping genes, describing these as expressed ubiquitously in all tissues and samples analyzed. They propose that the proteins arising from these genes are primarily concerned with basic cellular functions, an idea consistent with gene expression studies. The authors note that, because tissues contain vascular elements alongside their characteristic cell types, the term “housekeeping proteins” must be clarified to distinguish proteins that are truly expressed in every cell type from those that are merely present in every tissue.
Kim and colleagues conclude that their publication is indeed a draft, a starting point for a comprehensive map of the entire human proteome. They suggest that, as the draft develops, combining methodologies and enrichment techniques will improve sequence coverage.
Reference and Note
1. Kim, M.-S., et al. (2014) “A draft map of the human proteome,” Nature, 509 (pp. 575–81), doi: 10.1038/nature13302.
2. Interactive Web portal available at http://www.humanproteomemap.org.
Post Author: Amanda Maxwell.