Thermo Fisher Scientific


Optimizing Search Engines and Post-Processing Approaches

Written by Emily Humphreys | Published: 04.27.2016

Choosing the right combination of database search engine and post-processing method is crucial to obtaining wide proteome coverage from proteomic data. Many combinations are available, so it is important to know which ones interpret data with the highest accuracy. To this end, Tu et al. set out to determine the best approaches by pairing three search engines (SEQUEST, Mascot and MS Amanda) with five filtering approaches: each engine's own score-based filtering, a group-based approach, local false discovery rate (LFDR), PeptideProphet and Percolator.
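For orientation, the simplest of these filters, score-based filtering, can be sketched as follows. This is a minimal illustration and not the procedure used by Tu et al.: it assumes a target-decoy search (which the article does not describe), estimates the false discovery rate as decoys divided by targets, and uses hypothetical names (`score_cutoff`, `accepted_targets`) and a 1% default threshold chosen only for the example.

```python
from typing import List, Optional, Tuple

# A PSM is modeled here as (score, is_decoy); both fields are assumptions
# made for this sketch, not a format taken from any search engine.
PSM = Tuple[float, bool]


def score_cutoff(psms: List[PSM], max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    FDR is estimated as (#decoys) / (#targets) among PSMs scoring at or
    above the cutoff. Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Evaluate only once every PSM sharing this score has been counted.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie and targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


def accepted_targets(psms: List[PSM], cutoff: float) -> List[PSM]:
    """Keep target (non-decoy) PSMs scoring at or above the cutoff."""
    return [p for p in psms if not p[1] and p[0] >= cutoff]


if __name__ == "__main__":
    # Toy scores invented for illustration only.
    toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
    cut = score_cutoff(toy, max_fdr=0.01)
    print(cut, accepted_targets(toy, cut))
```

Methods such as Percolator go further by rescoring matches with several features instead of a single score, which is outside the scope of this sketch.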

To get the necessary data, the researchers obtained eight data sets from various proteomes (e.g., E. coli, yeast and human) produced by various instruments. The following table summarizes the experimental design.

| Model system | Mass spectrometer | Mass spectrometry (MS) parameters |
| --- | --- | --- |
| Yeast sample | Thermo Scientific LTQ Orbitrap XL | Collision-induced dissociation (CID) with MS2 analysis in the ion trap (IT) (XL CID−IT yeast) |
| Human cell line sample (MCF7 cells) | Thermo Scientific Orbitrap Elite | CID with MS2 analysis in IT (Elite CID−IT human) |
| Human cell line sample (HeLa cells) | Thermo Scientific Orbitrap Fusion Tribrid | CID with MS2 analysis in IT (Fusion CID−IT human) |
| Yeast sample | Thermo Scientific Orbitrap Fusion Tribrid | HCD with product-mass-spectra analysis in IT (Fusion HCD−IT yeast) |
| E. coli | Agilent 6530A | CID with TOF, mass tolerance of 0.05 Da (Q-TOF E. coli) |
| Yeast sample | Thermo Scientific LTQ Orbitrap Velos | HCD with product-mass-spectra observation in the Orbitrap (OT) (Velos HCD−OT yeast) |
| Human cell line sample (HeLa cells) | Thermo Scientific Q Exactive | HCD with product-mass-spectra observation in OT (QE HCD−OT human) |
| Human cell line sample (PANC-1 cells) | Thermo Scientific Orbitrap Fusion Tribrid | HCD with product-mass-spectra analysis in OT (Fusion HCD−OT human) |

After acquiring the eight data sets on the various mass spectrometry platforms, the team performed database searches and post-processing filtering. Comparing each combination, the team found that data filtered with Percolator outperformed the other four methods. Relative to the naive score-based approach, Percolator improved identifications by 55% to 88%, 44% to 85%, and 14% to 39% at the peptide-spectrum match (PSM), distinct peptide and protein group levels, respectively, across the eight data sets.

For all of the CID−IT data, the group-based approach and PeptideProphet achieved the second- and third-highest numbers in all categories. For all HCD−OT data (data sets F−H), LFDR and the group-based approach achieved the second- and third-highest numbers in all categories. For the Fusion HCD−IT yeast and Q-TOF E. coli data sets, the group-based approach and LFDR achieved similar improvements, although both were inferior to Percolator.
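The improvement figures above are relative gains over a baseline count. As a small worked sketch, assuming the usual definition of percent improvement, the arithmetic looks like this. The function name `percent_improvement` is hypothetical and the counts are invented placeholders, not values from the study.

```python
def percent_improvement(baseline: int, improved: int) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (improved - baseline) / baseline


if __name__ == "__main__":
    # Placeholder counts for illustration only (not data from Tu et al.):
    # 10,000 PSMs from score-based filtering vs. 15,500 after rescoring.
    print(percent_improvement(10_000, 15_500))
```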

The team also noted that SEQUEST−Percolator and MS Amanda−Percolator performed slightly better than other combinations for data sets with low-accuracy MS2 (ion trap, or IT) and high-accuracy MS2 (Orbitrap or TOF), respectively. Among uniquely identified proteins, SEQUEST−Percolator achieved the highest percentage of proteins containing ≥4 peptides.

Finally, the team determined that where Percolator was not used, Mascot−LFDR gave more identifications for data sets generated by higher-energy collisional dissociation (HCD), both those analyzed in the Orbitrap (HCD−OT) and the Orbitrap Fusion data set analyzed in the ion trap (HCD−IT), while MS Amanda−Group excelled for the Q-TOF data set and the Orbitrap Velos HCD−OT data set. Taken together, these results are valuable for determining the best method to interpret data.

 

Reference

Tu, C. et al. (2015) “Optimization of search engines and postprocessing approaches to maximize peptide and protein identification for high-resolution mass data,” Journal of Proteome Research, 14(11) (pp. 4662–73), doi: 10.1021/acs.jproteome.5b00536.
