Compound Discoverer Software

Transform data into actionable insights

Transform your small molecule data, whether a small or large dataset, from liquid chromatography (LC), gas chromatography (GC), ion chromatography (IC), full-scan accurate mass, or MSn data, into actionable insights with Thermo Scientific Compound Discoverer Software. The software offers a fully integrated suite of advanced tools for both known-parent and unknown data processing and interpretation.

This powerful software streamlines compound identification and comparative analyses, and provides extensive filtering and data visualization capabilities. With its easy-to-use workflows, integrated libraries and databases, and robust statistical analysis tools, it enables rapid and precise insights from your valuable data.

Whether your research focuses on metabolomics, stable isotope labeling, environmental and food safety, pharma metabolite or impurity identification, extractables and leachables, or forensic or clinical toxicology, Compound Discoverer Software offers an exceptional toolbox to drive precise and efficient data analysis in small-molecule research.

When connected to the Thermo Scientific Ardia Platform, you gain direct access to raw data and result file uploads. Eliminate data management barriers in your laboratory and allow users to collaborate remotely from anywhere.

Key benefits

Reduce the number of mouse clicks: Take control of your data analysis and processing with custom workflows, flexible visualization, and grouping tools. Share results with customizable reporting or transfer your results directly to Thermo Scientific Chromeleon Software or Thermo Scientific TraceFinder Software for targeted analyses.

Know your unknowns: Rapidly and confidently identify your unknowns with mass spectral library searching against the online mzCloud Spectral Library and in-house Thermo Scientific mzVault Spectral Libraries, along with numerous built-in annotation tools.

Find real differences in your sample sets: Quickly find significant statistical differences between sample sets. See trends in compounds across a study or identify the key compounds of interest between multiple sample groups using interactively linked displays, including volcano plots, PCA, PLS-DA, and hierarchical clustering.

Understand biological pathways: View pathways using Thermo Scientific Metabolika and BioCyc Databases, and map detected compounds and associated information directly onto pathways. Perform fully untargeted stable isotope labeling experiments and map data onto pathways.

Customizable data analysis workflows

Discover the power of customizable data analysis workflows with Compound Discoverer Software, designed to simplify and enhance your data processing. Tailor each step to suit your specific research needs, helping to ensure efficient and accurate results.

Workflow editor and processing flow

The workflow editor includes pre-built templates containing nodes with flexible data processing parameters and smart drag-and-drop functionality. Design and save advanced built-for-purpose custom workflows using the extensive capabilities to streamline your data processing.

Study manager

Study manager uses a wizard to guide you through the process of setting up a study, including defining sample types and other variables, such as multiple time points or sample variants (for example, different yeast strains), to drive experimental relevance in your interpretation of results.

Learn more about Compound Discoverer Software

The right combination of data and software makes all the difference in accurately interpreting complex samples.

Studies, whether simple or extensive, produce complex data that contains a wealth of information. To get the most valuable insights from that data, Compound Discoverer Software offers careful data processing, followed by insightful data visualization and linking capabilities that speed up review. With a layered compound tabular view linked directly to a wide array of data visualization tools, you can easily navigate from high-level to detailed information on each compound.

The software presents complex sample set results as easy-to-navigate compound tables, with all detailed information linked in related tables. These customizable tables allow personalized layouts to be quickly created, saved, and applied. Chromatographic, MS1, and MSn data are easy to review along with results from various search and identification tools.

Loadings and variance plots are also available and, as with all plots displaying individual compound data, the loadings plot is linked to all other tables and views to allow for quick navigation and compound selection.

Data visualization can be a powerful tool to find compounds of interest. Here, a custom plot displays the results of a PFAS workflow, plotting the mass defect to carbon ratio against the mass to carbon ratio and leveraging the near-zero mass defect of fluorine to highlight potential PFAS compounds. Custom visualization and plotting in Compound Discoverer Software can be used in many ways to allow you to focus on compounds of interest.

Custom plotting and graphing tools can be used to query and evaluate the results of your analysis, enabling the assessment of chromatographic method performance, instrument reproducibility, or method suitability. Virtually any aspect of your results can be quickly plotted and graphed to answer questions such as how frequently the desired adduct precursor was selected and used for fragmentation, or how the area of a select group of key compounds changed between two sample matrices.

Good experimental design is critically important for any analysis, especially for statistical studies, to ensure that any observed trends reflect real changes rather than experimental effects. For this reason, protocols for large-scale studies use pooled quality control (QC) samples to normalize the data.

Using pooled QC samples, which are analyzed throughout data acquisition, allows for the correction of batch effects over time. Correction for each compound is performed individually, and multiple methods are available. The displayed correction plot utilizes SERRF QC (Fan et al.). In addition, QC-based correction per compound can be performed based upon a peer-reviewed methodology published by Dunn et al. in Nature Protocols. Compound Discoverer Software provides the capability to view the impact of any changes to the data pre- and post-correction. It also supports several general signal normalization tools.
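
The idea behind QC-based correction can be illustrated with a minimal Python sketch. This is not SERRF and not the method of Dunn et al.; it is a simplified stand-in that assumes the drift between neighboring pooled QC injections is linear, and the function name and arguments are hypothetical. For one compound, the expected QC area is interpolated at each injection position and every measured area is rescaled to the median QC area.

```python
from bisect import bisect_right
from statistics import median


def qc_drift_correct(order, area, is_qc):
    """Rescale one compound's areas using linear interpolation between QC injections.

    order : injection order of each sample
    area  : measured peak area of the compound in each sample
    is_qc : True where the sample is a pooled QC injection
    (Simplified illustration only; assumes at least two QC injections.)
    """
    qc = sorted((o, a) for o, a, q in zip(order, area, is_qc) if q)
    qc_orders = [o for o, _ in qc]
    qc_areas = [a for _, a in qc]
    target = median(qc_areas)  # level that all samples are rescaled to

    def expected_qc(o):
        # Outside the QC range, hold the nearest QC value constant.
        if o <= qc_orders[0]:
            return qc_areas[0]
        if o >= qc_orders[-1]:
            return qc_areas[-1]
        i = bisect_right(qc_orders, o)
        x0, x1 = qc_orders[i - 1], qc_orders[i]
        y0, y1 = qc_areas[i - 1], qc_areas[i]
        return y0 + (y1 - y0) * (o - x0) / (x1 - x0)

    return [a * target / expected_qc(o) for o, a in zip(order, area)]


# Hypothetical batch with a steady downward drift in signal.
corrected = qc_drift_correct(
    order=[1, 2, 3, 4, 5],
    area=[100.0, 95.0, 90.0, 85.0, 80.0],
    is_qc=[True, False, True, False, True],
)
print(corrected)
```

In this made-up batch the drift is perfectly linear, so every corrected area lands on the median QC area. Real data would typically call for a smoother or a machine-learning model, which is what the published methods provide.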

Internal standards can be a useful tool to assess chromatographic performance across a large analytical batch. Compound Discoverer Software supports the definition of custom internal standard QC sets and extracts the data so that you can quickly visualize performance across the run. Assess the reproducibility of retention time, mass accuracy, area, or many other critical parameters across the batch with simple-to-use displays.

Quality data is the key for discovering the real changes in your data. The software offers a wide range of visualization and data curation tools to help assure the quality of your data, and consequently the quality of your conclusions.

Demonstrating the connectivity, the data points highlighted by blue circles in the volcano plot (bottom right) are selected within the compound table (bottom left). Selecting any compound in any plot automatically updates all plots to show the relevant data. The interconnected tools enable you to rapidly identify differences and the compounds or groups of compounds responsible for those differences. Apply filters based on any observable criteria from CSV to fold-change to checked status to quickly filter through complex data.

With volcano plots from differential analysis (left), S-plots from partial least squares discriminant analysis (middle), and hierarchical clustering analysis (right), it is easy to visualize complex data sets and determine what is statistically different. Each plot is active, so data points selected in the plot can be marked in the results tables and vice versa, helping determine the cause of observed differences or similarities and tracking compounds in complex data sets.
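
For readers unfamiliar with volcano plots, each compound is placed at an x coordinate given by its log2 fold change between two groups and a y coordinate given by the negative log10 of a p-value. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, the fold change is taken as the ratio of geometric means, and the p-value comes from an exact two-sided permutation test chosen to keep the example dependency-free. It is not necessarily the statistical test the software applies.

```python
from itertools import combinations
from math import log2, log10
from statistics import mean


def volcano_point(group_a, group_b):
    """Return (log2 fold change of B over A, -log10 p) for one compound.

    The p-value is from an exact permutation test on the difference of
    mean log2 areas (illustrative choice; only practical for small groups).
    """
    log_a = [log2(x) for x in group_a]
    log_b = [log2(x) for x in group_b]
    observed = mean(log_b) - mean(log_a)

    pooled = log_a + log_b
    indices = range(len(pooled))
    hits = total = 0
    # Enumerate every way of relabeling the pooled samples into two groups.
    for chosen in combinations(indices, len(log_a)):
        chosen_set = set(chosen)
        perm_a = [pooled[i] for i in chosen]
        perm_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(perm_b) - mean(perm_a)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, -log10(hits / total)


# Hypothetical peak areas for one compound in two sample groups.
x, y = volcano_point([100, 110, 90, 105], [400, 420, 380, 410])
print(round(x, 2), round(y, 2))
```

Compounds with a large absolute x and a large y sit in the upper corners of the plot, which is where the compounds of interest are usually picked.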

Compound Discoverer Software offers a wide range of tools to help with the identification of unknowns. With online and offline spectral libraries, custom user libraries, and compound databases, as well as tools for ranking putative identifications like mzLogic and FISh (fragment ion search), the software can help turn more unknowns into knowns.

The software leverages both online and offline library tools for comprehensive data analysis. Online, the mzCloud Spectral Library hosts millions of high-quality curated HRAM MSn spectra for tens of thousands of compounds. Offline libraries, available in the mzVault application format, can assist with PFAS analysis, extractables and leachables, and more. A compatible version of the NIST Tandem MS library is also available.

The integration of LipidSearch into Compound Discoverer Software allows for the detection and annotation of more potential lipids. A database of lipid classes, including 96 subclasses, along with predicted fragmentation provides a broad range of coverage as well as annotations for the fragmentation spectra of detected lipids. The lipid grade and ID scoring from LipidSearch are included with the results, giving you more information to make decisions during data review.

Chemical database searches can yield an overwhelming number of possible candidates. Compound Discoverer Software offers additional tools to help refine and rank candidate lists, including mzLogic. By leveraging real fragmentation data from mzCloud, mzLogic analyzes common fragment ions and substructures in unknown spectra, matching them with database candidates. This powerful feature effectively filters out less likely candidates. Here, for an unknown with 53 possible candidates, mzLogic quickly helps to focus on the most relevant.

Another tool that helps reduce the complexity of multiple candidate proposals, whether from a database search or from an assessment of possible metabolism or chemical degradation, is FISh scoring. Here, the list of possible candidate structures is used to generate in silico fragmentation data, and each candidate is scored on its ability to annotate the unknown compound's fragmentation spectra. This helps you separate the most likely candidates from all other possibilities.

Connectivity between compounds in a complex data set can also be a valuable tool for discovering compounds of interest. Whether by mapping compounds and associated statistical variations against a biological pathway or by connecting related compounds in dynamic molecular networks, Compound Discoverer Software offers a wealth of further identification options.

Mapping observed data onto biological pathways can also provide useful interpretive information, especially when combined with experimental information such as fold-change or stable isotope label incorporation. Internal tools like Metabolika facilitate mapping onto provided or custom user-defined biological pathways. In addition, information can be mapped onto BioCyc (user subscription required). Shown here is the average 13C label incorporation, but multiple data mapping options are available.

The fully interactive Molecular Networks visualization browser allows you to view your data in a different way. Identified compounds are shown by nodes (circles) and when a relationship is identified, the nodes are connected. Selecting a node (compound) or connection (transformation) displays pertinent information (right) about the identified compound and the relevant transformation(s). All the visualized data can be interactively filtered using thresholds, data quality information or text search for specific compounds or transformations.

With thousands of persistent PFAS chemicals and transformation products impacting the environment, understanding their true extent requires more than traditional targeted analysis. Compound Discoverer Software enables scientists to uncover the full scope of PFAS contamination, helping to assess potential environmental and health risks with greater confidence.

Chemical analysis of PFAS can be a challenge given the complexity of real environmental and treatment samples. New visualization tools, such as the PFAS plot, help to determine potential suspect compounds by leveraging the unique nature of fluorine and heavily fluorinated compounds. By plotting their mass defect to carbon ratio against their mass to carbon ratio, compounds of potential PFAS interest are separated from the bulk of normal organic molecules. When combined with a PFAS score calculated from multiple diagnostic points, this approach further refines and prioritizes high-interest targets.
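
The two plot coordinates described above can be sketched in a few lines of Python. This is an illustration only, not the software's implementation: the function names are hypothetical, the element table is deliberately small, the mass defect is taken as the monoisotopic mass minus the nominal (integer) mass, and neutral molecular formulas are assumed as input.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "F": 18.99840316,
    "S": 31.97207117,
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C8HF15O2' (no brackets or charges)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def pfas_plot_point(formula):
    """Return (mass / C, mass defect / C) for one neutral formula."""
    counts = parse_formula(formula)
    mass = sum(MONO[el] * n for el, n in counts.items())
    nominal = sum(round(MONO[el]) * n for el, n in counts.items())
    carbons = counts["C"]
    return mass / carbons, (mass - nominal) / carbons


for name, formula in [("PFOA", "C8HF15O2"), ("palmitic acid", "C16H32O2")]:
    m_per_c, md_per_c = pfas_plot_point(formula)
    print(f"{name}: m/C = {m_per_c:.3f}, MD/C = {md_per_c:.5f}")
```

With these inputs, PFOA lands near m/C of 51.7 with a slightly negative MD/C, while palmitic acid lands near m/C of 16.0 with a positive MD/C, which shows how heavily fluorinated compounds separate from typical hydrogen-rich organic molecules.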

Even more tools are available for the analysis of PFAS contaminants. Given their often polymeric nature, PFAS compounds undergo a common series of losses, such as a single CF2 group. These internal neutral losses can be detected quickly and, along with visualization tools, help determine compounds of interest. Additional scoring based on common PFAS fragments (class-based fragmentation) and annotation using extensive built-in PFAS databases further refines the analysis. All of this combined helps to focus on potential PFAS contaminants and gain more confidence in their assignment.
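
A CF2 spacing corresponds to a mass difference of about 49.9968 Da, so candidate CF2 losses can be flagged by looking for pairs of masses separated by that amount. The Python sketch below is a hypothetical illustration of the idea, not the software's detection logic; the function name and the 5 ppm tolerance are assumptions, and the input can be any list of masses, for example fragment m/z values from one spectrum.

```python
# Monoisotopic mass of a CF2 unit: one carbon plus two fluorines.
CF2 = 12.0 + 2 * 18.99840316


def find_cf2_pairs(masses, tol_ppm=5.0):
    """Return (lighter, heavier) pairs whose difference matches one CF2 unit.

    The tolerance is applied relative to the heavier mass (assumed convention).
    """
    ordered = sorted(masses)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            if abs((heavy - light) - CF2) <= heavy * tol_ppm * 1e-6:
                pairs.append((light, heavy))
    return pairs


# Three masses spaced by CF2 plus one unrelated mass.
masses = [213.98648, 263.98328, 313.98009, 256.24023]
for light, heavy in find_cf2_pairs(masses):
    print(f"{light:.5f} -> {heavy:.5f}")
```

Chaining the matched pairs recovers a series of masses that differ by repeated CF2 units, while the unrelated mass is left out.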

Explore the advanced capabilities and tools within the software that offer researchers enhanced flexibility and precision in their data analysis workflows.

Stable isotope flux experiments are a powerful method to assess the biological impact of various changes. The software supports fully untargeted stable isotope label experiments, helping to uncover unknown or unexpected alterations. Average and relative exchange are calculated, as well as the total label number incorporated on a per-compound basis, all of which can be mapped onto biological pathways to enhance understanding of the changes.

Isotope label incorporation can also be visualized per compound to see how much label has been incorporated and how extensive the labeling is for a given compound across the experiment. Here, the incorporation of 13C label into phenylalanine is nearly complete over the course of this experiment, with all nine carbons labeled in the majority of the observed compound.
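
One common way to summarize label incorporation for a single compound is to turn its isotopologue intensities (M+0, M+1, and so on up to the number of labelable atoms) into fractions and then compute the average share of positions carrying the label. The Python sketch below illustrates that arithmetic only. The function name and the example intensities are hypothetical, no correction for natural isotope abundance is applied, and the software's exact calculation may differ.

```python
def label_incorporation(intensities):
    """Summarize labeling from isotopologue intensities M+0 .. M+n.

    Returns (fraction of each isotopologue, average fraction of sites labeled).
    Assumes the list covers every isotopologue from unlabeled to fully labeled.
    """
    n_sites = len(intensities) - 1
    total = sum(intensities)
    fractions = [x / total for x in intensities]
    # Expected number of labeled positions per molecule.
    avg_labels = sum(i * f for i, f in enumerate(fractions))
    return fractions, avg_labels / n_sites


# Made-up intensities for a nine-carbon compound, mostly fully labeled.
fractions, avg_exchange = label_incorporation([2, 0, 0, 0, 0, 0, 0, 0, 8, 90])
print(f"fully labeled fraction: {fractions[-1]:.2f}")
print(f"average exchange: {avg_exchange:.3f}")
```

For these made-up numbers, 90% of the signal is in the fully labeled form and the average exchange is about 0.97, which matches the kind of near-complete labeling described for phenylalanine above.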

Scripting node

Any software can only do so much, but with the capability to use the Compound Discoverer Scripting node, you can do even more. Use the scripting node in your workflow to automatically export any data from your results, launch your own custom R or Python scripts, and pull the results back into your Compound Discoverer results as a part of the data tables.

Node Developer Kit

In addition to the wide range of data processing tools available to create custom data processing workflows, we also offer a Node Developer Kit (NDK), which allows you to write your own custom data processing nodes to include in workflows. It offers the ultimate in flexibility and customizability for those with a talent for software development, or with someone on their team who has it.

PyEDS tool

PyEDS is a Python library providing a collection of utilities to conveniently access and display results from Compound Discoverer Software. Using PyEDS, all the hard work of navigating the data hierarchy and recovering the desired information is done automatically, so you can focus more on your research instead of how to read the data.

For the analysis of data acquired using Thermo Scientific GC-Orbitrap-based mass spectrometers, there are two primary workflows, enabled using specific workflow nodes such as Electron Impact (EI) and Chemical Ionization (CI) deconvolution nodes. GC-Orbitrap data can be analyzed using the extensive tools within Compound Discoverer Software to enable confident compound identification, or statistical analysis, for example.

Examples of two GC-based workflow trees: The first is an EI workflow that can be used to find biomarkers through statistical analysis and identify unknown compounds via library search, and the second is a CI workflow that can identify unknown compounds of interest through molecular formula determination and structural elucidation of MS/MS spectra.

The above shows GC-EI compound identification in the result view. On the upper right-hand side, a mirror plot shows the deconvoluted spectrum and the library spectrum. Highlighted in the second-level table under Library Search Results are the total score, the delta mass of the molecular ion, and the RI delta. The total score is a composite score that includes contributions from the HRF score and SI score; the delta mass is the mass accuracy of the molecular ion if it is present in the deconvoluted spectrum; the RI delta is the difference between the library RI and the calculated RI. Based on the total score of 94.9, the less than 1 ppm delta mass of the molecular ion, and the RI delta value of one, there is very high confidence in this identification.
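
Two of the metrics above are simple differences and can be written out directly. The Python sketch below uses hypothetical function names and made-up example values; the sign convention for the RI delta (library minus calculated) is an assumption, since the text only says it is the difference between the two. The composite total score is not reproduced here because its formula is not given.

```python
def delta_mass_ppm(measured_mz, theoretical_mz):
    """Mass accuracy of an ion in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def ri_delta(library_ri, calculated_ri):
    """Difference between library and calculated retention index (assumed sign)."""
    return library_ri - calculated_ri


# Made-up values for illustration only.
ppm = delta_mass_ppm(measured_mz=194.08040, theoretical_mz=194.08038)
delta = ri_delta(library_ri=1841, calculated_ri=1840)
print(f"delta mass: {ppm:.2f} ppm, RI delta: {delta}")
```

A delta mass well under 1 ppm together with an RI delta of about one, as in this made-up case, is the pattern described above as supporting a confident identification.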

A suite of statistical visualizations, including PCA, PLS-DA, variance, loadings, and volcano plots, is available to discover the real differences between your sample groups within your GC-Orbitrap mass spectrometer data. In addition, identified compounds can be mapped against biological pathways both within Compound Discoverer Software (Metabolika) and against BioCyc, including mapping relevant information such as fold change.