The study of extracellular O-linked glycosylation is hampered by the complex molecular structure of glycopeptides. These molecules contain variable core structures with complex and diverse elongations. They also lack a consensus sequence, which makes it difficult to predict modification sites or to characterize the proteins using traditional mass spectrometry (MS).1,2
Glycopeptides have previously been characterized successfully using collision-induced dissociation (CID); however, these successes are limited. The weak glycosidic bonds dissociate readily and leave carbohydrate fragments in CID spectra, and gas-phase deglycosylation of O-linked glycopeptides can sometimes eliminate the carbohydrate completely. The size, building blocks, and order of the carbohydrate units can be determined. In most cases, however, the proteins cannot be accurately characterized without knowing the structure ahead of time.2,3
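As a rough illustration of the carbohydrate fragments that appear in CID spectra, the sketch below computes the m/z of singly charged oxonium ions from a glycan composition. The monoisotopic residue masses are standard values; the function and variable names are illustrative assumptions and are not taken from the studies discussed here.

```python
# Monoisotopic residue masses in Da (monosaccharide minus water).
RESIDUE_MASS = {
    "Hex": 162.052824,     # e.g. galactose
    "HexNAc": 203.079373,  # e.g. N-acetylgalactosamine
    "NeuAc": 291.095417,   # N-acetylneuraminic (sialic) acid
}
PROTON = 1.007276  # mass of a proton in Da


def glycan_residue_mass(composition):
    """Sum the residue masses for a composition such as {"Hex": 1, "HexNAc": 1}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def oxonium_mz(composition):
    """m/z of the singly charged oxonium ion for the given composition."""
    return glycan_residue_mass(composition) + PROTON


# Mucin core 1 is Gal-GalNAc, i.e. one Hex plus one HexNAc.
core1 = {"Hex": 1, "HexNAc": 1}
hexnac_mz = oxonium_mz({"HexNAc": 1})
core1_mz = oxonium_mz(core1)
print(f"HexNAc oxonium ion: {hexnac_mz:.4f}")
print(f"Core 1 oxonium ion: {core1_mz:.4f}")
```

The same residue mass sum also gives the mass lost from the peptide when the whole glycan is eliminated in the gas phase.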
Electron-transfer dissociation (ETD) and electron-capture dissociation (ECD) have proven more capable than CID of characterizing intact proteins. CID and the electron-based techniques produce very different data. To compare data sets obtained with these methods, search engine scoring must accommodate both types of data and allow analysis of peptides produced with a wider range of cleavage specificities.
Darula et al. illustrated how the optimization of database scoring can produce new and insightful information.4 The Darula group isolated O-linked glycopeptides containing mucin core 1-type structures from bovine serum and compared data acquired by electron-based MS/MS with results from previous studies that used CID-based search engine scoring. Based on the carbohydrate fragments revealed by LC/MS/MS analysis on a QqTOF instrument (QTOF Premier, Waters), more than 100 glycopeptides were present, yet only a handful of these peptides yielded CID spectra sufficient for identification by database search or manual sequencing. The same samples were analyzed using an LTQ-Orbitrap (Thermo Fisher). The CID spectra consisted mostly of carbohydrate fragments, and only four glycopeptides produced enough fragmentation to be identified by database searches of the CID data.
The Protein Prospector search engine is a proven tool for analyzing ETD data. A previous study from Darula et al.5 used an earlier version of Protein Prospector to identify 49 glycopeptides by searching the Swiss-Prot database and manually validating each hit.
The Darula group compared the performance of Protein Prospector version 5.3 with that of version 5.4. Version 5.3 used an older scoring system based on the frequency of occurrence of different ion types in ETD spectra of tryptic peptides. Version 5.4 was developed to better accommodate the products of different cleavage specificities.
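The general idea of weighting matched fragment ions by how often each ion type is observed can be sketched as follows. This is a toy illustration only: the weights, the per-charge tables, and the function name are hypothetical and do not reproduce Protein Prospector's actual scoring.

```python
# HYPOTHETICAL weights per precursor charge state and ion type.
# These numbers are made up for illustration; they are not the values
# used by Protein Prospector 5.3 or 5.4.
ION_WEIGHTS = {
    2: {"c": 1.0, "z": 3.0, "y": 2.0},
    3: {"c": 3.0, "z": 3.0, "y": 1.0},
}


def score_match(matched_ions, precursor_charge, weights=ION_WEIGHTS):
    """Sum the weight of every matched fragment ion.

    matched_ions is a list of ion-type labels (e.g. ["c", "z", "z"]) for
    the peaks that matched a candidate peptide. Ion types or charge
    states without an entry in the table contribute nothing.
    """
    table = weights.get(precursor_charge, {})
    return sum(table.get(ion, 0.0) for ion in matched_ions)


# The same set of matched ions scores differently at different charge states.
matched = ["c", "z", "z"]
score_2plus = score_match(matched, precursor_charge=2)
score_3plus = score_match(matched, precursor_charge=3)
print(score_2plus, score_3plus)
```

Keeping a separate table per condition is one simple way a scoring scheme could be tuned for spectra whose ion-type frequencies differ, for example between charge states or between peptides of different cleavage specificity.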
Following the upgrade from version 5.3 to version 5.4, the Darula group found a significant improvement in sensitivity and in the ability to identify proteins. Protein Prospector version 5.4 identified 190% more intact glycopeptides, which corresponded to 42 unique glycopeptides, including 16 unique hits, compared with only 4 peptides identified by version 5.3.
Version 5.4 also identified two novel glycopeptides, (A5D7R6) and VASN protein (A4IFA5), as well as seven novel glycosylation sites, two of which have been reported in the human homolog. The novel peptides, as well as five of the seven glycosylation sites, are not currently reported in the Swiss-Prot database. It is clear that modification of the search engine made the identification of digestion products dramatically more sensitive.
Protein database search engines, such as Protein Prospector, are continually being upgraded and tuned. Meanwhile, the Darula group has published another study6 aimed at further improving the isolation and characterization of glycopeptides.
1. Varki, A., et al. (2009) Essentials of Glycobiology. 2nd Edition. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press
2. Peter-Katalinic, J. (2005) ‘O-glycosylation of proteins’, Methods in Enzymology, 405, (pp. 139-171)
3. Medzihradszky, K.F., et al. (1996) ‘Structural elucidation of O-linked glycopeptides by high energy collision-induced dissociation’, Journal of the American Society for Mass Spectrometry, 7 (4), (pp. 319-328)
4. Darula, Z., et al. (2011) ‘Improved Identification of O-linked Glycopeptides from ETD Data With Optimized Scoring for Different Charge States and Cleavage Specificities’, Amino Acids, 41 (2), (pp. 321-328)
5. Darula, Z. and Medzihradszky, K.F. (2009) ‘Affinity enrichment and characterization of mucin core-1 type glycopeptides from bovine serum’, Molecular and Cellular Proteomics, 8 (11), (pp. 2515-2526)
6. Darula, Z., Sherman, J. and Medzihradszky, K.F. (2012) ‘How to Dig Deeper? Improved Enrichment Methods for Mucin Core-1 Type Glycopeptides’, Molecular and Cellular Proteomics, published online March 5, 2012. doi: 10.1074/mcp.O111.016774