Two research teams spanning three continents have independently published drafts of the human proteome. Each team standardized its proteomic research using Orbitrap mass spectrometers from Thermo Scientific.
“Mapping of the human proteome is a huge achievement. Thermo Fisher has been a longstanding pioneer in the development of technology for deep proteome mining, bringing to market a succession of breakthrough products that have revolutionized contemporary proteomics,” stated Ken Miller, Vice President of Marketing, Life Sciences Mass Spectrometry. He continued, “We have a long history of collaboration with both of the groups involved in the recent Nature studies—much of their work was done using Orbitrap mass spectrometers.”
The proteome maps, one published by a team led by Akhilesh Pandey at Johns Hopkins University (JHU) and the other by a team led by Technical University of Munich (TUM) researcher Bernhard Kuster, are among the most complete databases of the human proteome generated to date. The maps are publicly available at https://www.proteomicsdb.org and www.humanproteomemap.org, and the two papers, which detail the teams' differing methodologies, appear in the May 2014 issue of Nature. While the TUM team combined original research (40%) with existing information from thousands of studies (60%), the JHU researchers published only proteins identified through original work, carried out in collaboration with the Institute of Bioinformatics in Bangalore.
The JHU-led study identified proteins coded by 17,294 genes, or roughly 84% of the 20,493 human genes annotated as protein coding in UniProt, the comprehensive, publicly available source of protein sequence and functional information. This number includes proteins from 2,535 genes for which there was previously no evidence of protein coding. The JHU map was generated entirely by Pandey and his co-authors, who performed mass spectrometry-based analyses of 30 normal human samples, comprising 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, on either a Thermo Fisher Scientific LTQ Orbitrap Velos or an Orbitrap Elite.
The TUM-led project itemizes proteins associated with 18,097 human genes — approximately 88% of the protein-coding genome. It also identifies 19,376 of the 86,771 protein isoforms listed in UniProt (www.uniprot.org). The TUM map consists of 40% data generated by Kuster and his co-authors and 60% data gathered from other studies and repositories. To facilitate data processing, the researchers assessed only those results generated by Orbitrap mass spectrometers.
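As a quick arithmetic check on those coverage figures: 17,294 ÷ 20,493 ≈ 84.4% for the JHU map and 18,097 ÷ 20,493 ≈ 88.3% for the TUM map (assuming, for the second figure, that the TUM team used the same UniProt count of 20,493 protein-coding genes as its denominator).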
Both teams reported the intriguing finding that some genes previously thought to code for proteins apparently do not, while other genes thought to be non-coding were associated with peptides. The TUM team also postulates the existence of a core proteome of approximately 10,000–12,000 ubiquitously expressed proteins, the primary function of which is the general control and maintenance of cells.
Researchers on both teams were unaware of, and expressed surprise at, each other's multi-year efforts. Interestingly, the teams appear to have uncovered partly different sets of proteins. By combining their results, the research teams will likely be able to produce a more accurate, and even more comprehensive, picture of the relationship between human genes and their related proteins.
Post Author: Heather Drugge. Heather has 20 years of experience writing about products and services for both the private and public sectors, including more than 15 years with high-tech and biotech companies. She specializes in science-based writing for B2B technology companies.
Call me at 604-868-1309 or find me on Skype at heatherdrugge.
Jay says
Nice work, but the problem is accessing the data.
Kim et al. do not provide the list of peptides or proteins, so people have to dig into the raw or .msf data to retrieve this information. Worse, the raw and .msf files cannot be converted without the commercial tool the authors used in the paper. How on earth are researchers supposed to utilise this information? We all work with high-throughput data now. The website they provide is a poor way to view the information (it seems they are not even interested in curating their results).
The other paper, by Wilhelm et al., thankfully gives us the list of peptides and proteins identified in each tissue. However, this paper somehow used a relatively older technique or experimental design; surprisingly, redundantly expressed synaptic proteins are not captured in their results.
Heather Drugge says
Hi Jay, Because proteomics is such a rapidly evolving field, the lack of consistency in design and reporting can be frustrating. The advances that different teams make are all contributing to our ability to grapple with a very complex challenge. At Accelerating Science, we follow the efforts of organizations like HUPO, which advocates standardizing experimental factors (https://www.thermofisher.com/blog/proteomics/experimental-design-and-reporting-guidelines-for-proteomics-researchers/) to facilitate an open dialogue, aid experimental replication and push proteomics research forward. At this stage, we are really still in the “tornado,” where consistency is not as valued as innovation, and maybe that’s best for now.
Heather Drugge says
Sorry, that link above should have directed you here: https://www.thermofisher.com/blog/proteomics/experimental-design-and-reporting-guidelines-for-proteomics-researchers/
Heather Drugge says
HUPO has just published a follow-up statement about the two papers in Nature:
http://www.linkedin.com/groups/Announcement-from-Human-Proteome-Organization-148675.S.5884825370288480259?qid=a6e25f8d-79c3-4a3f-8b16-91246f128ac0&goback=%2Egmr_148675