In pursuit of its goal of identifying every protein corresponding to human genomic data, the Chromosome-Centric Human Proteome Project (C-HPP) recently published a study to identify previously undetected, or "missing," proteins.1 For this study, the researchers chose to focus on proteins found in human testes, since this tissue is known to have high gene expression levels, with 15,000 gene transcripts detected previously.2
The researchers used a combination of proteomics and transcriptomics to survey gene expression in human testis tissue from three post-mortem adult males. The samples originated from the General Hospital of Chinese People’s Armed Police Forces and Capital Medical University Affiliated Beijing You An Hospital.
Researchers washed the tissue samples and froze them at −80 °C. Next, they ground the samples in liquid nitrogen and sonicated them in lysis buffer. To account for any variation between samples, they processed each sample individually rather than pooling them. The team extracted proteins, separated them by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), and stained the gels with Coomassie Brilliant Blue. They cut each gel lane into 28 bands based on molecular weight and performed an in-gel trypsin digest. The resulting peptides were separated by ultra-performance liquid chromatography (UPLC) on a Waters instrument and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) on an LTQ Orbitrap Velos mass spectrometer (Thermo Scientific).
To deepen proteomic coverage, the team also ran the same lysates on tricine gels. They cut each lane into 22 gel bands, digested the bands with trypsin, and analyzed the samples using LC-MS/MS. Regular SDS-PAGE and tricine SDS-PAGE uniquely identified 1,178 and 533 proteins, respectively, for a combined total of 9,597 proteins. The C-HPP researchers compared their results with other proteomic work and found 90% overlap with the proteins identified in previous studies.
The team also evaluated the transcriptome by extracting RNA and sequencing it using the Ion PI Chip v2 on the Ion Proton™ Sequencer (Thermo Scientific). They found 8,959 genes common to the transcriptome and the proteome. Interestingly, 7,424 genes (45%) were identified in the transcriptome but not the proteome, and 638 genes (7%) were identified in the proteome only, suggesting there are still more proteins and genes left to reconcile. Regardless, the team successfully identified 182 missing proteins belonging to 166 protein groups, one of the largest data sets of missing proteins to date.
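The overlap figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the study: the function name `overlap_summary` is hypothetical, and it assumes that each percentage is relative to its own data set (transcriptome-only genes as a share of all transcriptome genes, proteome-only genes as a share of all proteome genes). The article does not state the denominators, but this reading reproduces the reported 45% and 7%.

```python
def overlap_summary(common, a_only, b_only):
    """Summarize the overlap between two identification sets.

    Assumption (not stated in the article): each "only" percentage is
    taken relative to the total of the set it belongs to.
    """
    a_total = common + a_only
    b_total = common + b_only
    return {
        "a_total": a_total,
        "b_total": b_total,
        "a_only_pct": 100 * a_only / a_total,
        "b_only_pct": 100 * b_only / b_total,
    }


# Counts reported in the article: a = transcriptome, b = proteome.
summary = overlap_summary(common=8959, a_only=7424, b_only=638)

print(f"Transcriptome genes: {summary['a_total']}")
print(f"Proteome genes:      {summary['b_total']}")
print(f"Transcriptome only:  {summary['a_only_pct']:.1f}%")
print(f"Proteome only:       {summary['b_only_pct']:.1f}%")
```

Under this assumption, the proteome total works out to 9,597, which matches the combined protein count reported for the two gel methods.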
1. Zhang, Y., et al. (2015) “Tissue-based proteogenomics reveals that human testis endows plentiful missing proteins,” Journal of Proteome Research, 14(9) (pp. 3583–94). doi: 10.1021/acs.jproteome.5b00435.
2. Ramskold, D., et al. (2009) “An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data,” PLoS Computational Biology, 5(12). doi: 10.1371/journal.pcbi.1000598.