Confetti is a newly available, publicly accessible multiprotease map of the HeLa human cervical cell line proteome.1 Confetti’s creators, Guo et al. (2014), describe the work in a recent publication.2 The project’s aim was to address the coverage issues and limitations imposed by using trypsin as the sole protease when preparing samples for bottom-up proteomics analysis. Using multiple proteases, either singly or combined, the researchers found that their tandem or parallel digestion strategies, in conjunction with strong anion exchange (SAX), improved protein sequence coverage. This resulted in better options for identifying selected reaction monitoring (SRM) targets and covering sites of post-translational modifications.
One of the major drawbacks in proteomic research has been that a sole reliance upon trypsin digestion to prepare samples gives relatively limited sequence coverage. Furthermore, because most public databases contain only tryptic peptide sequence information, scientists may not be able to access important regions of interest. In practice, this means missing analytical data on isoform abundance and post-translational modifications, thus compromising development of SRM protocols, since optimal targets for assaying may not be available.
Guo et al. took HeLa cell lysates and treated them with seven commercially available endopeptidases (trypsin, chymotrypsin, elastase, Glu-C, Lys-C, Asp-N and Arg-C), either singly or sequentially. The researchers determined the order in which to use the enzymes by examining their degrees of specificity and the length of the fragments generated: highly specific proteases that generate longer peptides were run ahead of less-specific proteases that create shorter peptide fragments.
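To make the idea of sequential digestion concrete, here is a minimal in silico sketch in Python. It is not the authors' pipeline. The cleavage rules are deliberately simplified textbook assumptions (for example, trypsin cutting after every K or R, with no proline exception and no missed cleavages), elastase is left out because its specificity is broad, and the names RULES, digest and sequential_digest are hypothetical.

```python
# Simplified cleavage rules, assumed for illustration only. Real enzymes have
# exceptions (proline blocking, missed cleavages, condition-dependent sites).
# Each entry maps an enzyme to (target residues, side of the cut):
# "C" = cut after the residue, "N" = cut before the residue.
RULES = {
    "trypsin": ("KR", "C"),
    "Lys-C": ("K", "C"),
    "Arg-C": ("R", "C"),
    "Glu-C": ("E", "C"),
    "Asp-N": ("D", "N"),
    "chymotrypsin": ("FWY", "C"),
}


def digest(sequence, enzyme):
    """Split a protein sequence into peptides using one enzyme's rule."""
    residues, side = RULES[enzyme]
    peptides = []
    start = 0
    for i, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            cut = i + 1 if side == "C" else i
            # Skip cuts that would produce an empty fragment.
            if start < cut < len(sequence):
                peptides.append(sequence[start:cut])
                start = cut
    peptides.append(sequence[start:])
    return peptides


def sequential_digest(sequence, enzymes):
    """Apply enzymes in the given order, each acting on the previous fragments."""
    peptides = [sequence]
    for enzyme in enzymes:
        peptides = [frag for pep in peptides for frag in digest(pep, enzyme)]
    return peptides


if __name__ == "__main__":
    toy_protein = "AAKDDRGGEFF"  # made-up sequence, not a real protein
    print(digest(toy_protein, "Lys-C"))
    print(sequential_digest(toy_protein, ["Lys-C", "trypsin"]))
```

In this toy model, the more specific enzyme (Lys-C) yields fewer, longer fragments, and the follow-up trypsin step cuts them further, which mirrors the ordering logic described above.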
The researchers analyzed the sample digests via liquid chromatography–tandem mass spectrometry (LC-MS/MS) using Q Exactive hybrid quadrupole-Orbitrap and Orbitrap Elite hybrid ion trap-Orbitrap (Thermo Scientific) mass spectrometers. They searched the resulting spectral data against the UniProtKB human complete proteome sequence database and examined the data for enzyme-specific cleavage sites. They then used a Proteome Amino Acid Coverage (PAAC) metric to assess the extent of sequence coverage and estimate digest complementarity. Using this approach, Guo and co-workers identified 5,223 proteins from a total of 48 digests (7 single-enzyme, 21 double-enzyme and 20 triple-enzyme digests), representing 42% mean sequence coverage.
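The exact definition of PAAC is given in the Guo et al. paper and is not reproduced here. As a rough illustration of the underlying idea, the Python sketch below assumes a plain residue-level measure: the fraction of a protein's amino acids that fall inside at least one identified peptide. It also shows how pooling peptides from two digests can reveal complementarity. The function names and the toy sequence are hypothetical.

```python
def covered_positions(protein, peptides):
    """Return the set of residue indices covered by any identified peptide."""
    covered = set()
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            covered.update(range(start, start + len(peptide)))
            # Continue searching in case the peptide occurs more than once.
            start = protein.find(peptide, start + 1)
    return covered


def coverage(protein, peptides):
    """Fraction of residues covered (an assumed stand-in, not the PAAC formula)."""
    if not protein:
        return 0.0
    return len(covered_positions(protein, peptides)) / len(protein)


if __name__ == "__main__":
    toy_protein = "AAKDDRGGEFF"      # made-up sequence, not a real protein
    tryptic_hits = ["AAK"]           # pretend identifications from a trypsin digest
    asp_n_hits = ["DRGGEFF"]         # pretend identifications from an Asp-N digest

    print(f"trypsin only: {coverage(toy_protein, tryptic_hits):.2f}")
    print(f"Asp-N only:   {coverage(toy_protein, asp_n_hits):.2f}")
    # Two digests are complementary when their union covers more than either alone.
    print(f"combined:     {coverage(toy_protein, tryptic_hits + asp_n_hits):.2f}")
```

A metric of this kind rewards digests that reach regions other digests miss, which is the property used to pick complementary enzyme combinations.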
From this coverage and the digest data, the team selected the strongest complementary digests for SAX fractionation prior to LC-MS/MS. Once these fractions had been analyzed, the researchers identified 8,470 proteins with mean sequence coverage of 40.3%. When these results were combined with those from the unfractionated digests, Guo et al. reported 8,539 proteins identified, with final mean sequence coverage of 44.7%.
After many hours of computing time at their institution, the scientists imported their final set of 7,774,832 peptide spectrum matches into the new Confetti MySQL database. By applying algorithms and filtering, the researchers created a Web application that gives users access to the protein and peptide identifications in the database, plus information on sequence coverage according to the enzyme used.
To test their new tool, the team used the Confetti website to select Asp-N peptides suitable as SRM assay targets and compared them with tryptic targets. The researchers first selected proteins digested by both enzymes but showing better results with Asp-N. They analyzed the cell lysate digests in the presence of heavy standards in an SRM assay and found that, for some proteins, the Asp-N digests gave better results than the trypsin digests, with larger measured peak areas.
In conclusion, Guo et al. believe that their Confetti Web tool offers researchers an enhanced resource for experimental design, via the selection of enzymes that optimally cover a protein’s region of interest. They feel that this resource is valuable in researching post-translational modifications and developing new SRM assays.
Note and Reference
1. The dataset is available via PRIDE at http://www.ebi.ac.uk/pride/archive/projects/PXD000900 and ProteomeXchange at http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD000900.
2. Guo, X., et al. (2014, June) “Confetti: A Multiprotease Map of the HeLa Proteome for Comprehensive Proteomics,” Molecular and Cellular Proteomics, 13(6) (pp. 1573–84), doi: 10.1074/mcp.M113.035170.
Post Author: Amanda Maxwell. Mixed media artist; blogger and social media communicator; clinical scientist and writer.
A digital space explorer, engaging readers by translating complex theories and subjects creatively into everyday language.