Compound Discoverer Software

From sample to structure; pathway to insight

Transform your small molecule data, whether from a small or large dataset, from full-scan and MSn data into known compounds and molecular pathways. Thermo Scientific Compound Discoverer software offers a full suite of advanced software tools for known-parent and unknown compound identification. It streamlines compound identification, determines real differences between samples, and elucidates biological pathways with integrated, powerful software workflows that drive rapid insights from your valuable data.

No matter what your small molecule research application, from metabolomics and biomarker discovery, environmental and food safety, pharma metabolite or impurity identification, and extractables and leachables to forensic or clinical toxicology, Compound Discoverer software offers a unique approach to small molecule structural identification.


Key benefits of Compound Discoverer software
  • Reduce the number of mouse clicks: Take control of your data analysis and processing with custom workflows, flexible visualization, and grouping tools. Share results with customizable reporting, or transfer your results directly to Thermo Scientific TraceFinder software for targeted analyses.
  • Know your unknowns: Rapidly and confidently identify your unknowns with mass spectral library searching against the online mzCloud spectral library, in-house Thermo Scientific mzVault spectral libraries, and numerous third-party sources.
  • Find real differences in your sample sets: Quickly find real statistical differences between sample sets. See trends in components across a study or identify the key components of interest between multiple sample groups using interactively linked displays.
  • Understand biological pathways: Perform metabolic flux experiments, view pathways using Thermo Scientific Metabolika, KEGG, and BioCyc databases, and map detected compounds and flux information directly onto pathways.

Strategies for making small-molecule unknowns known

Complete characterization and identification of small molecules is an important task, whether it is for better understanding of how our bodies interact with drugs, tracing the environmental fate of pesticides, developing new compounds, protecting brand reputations, or performing fundamental research. Learn more when you download our eBook.

Download eBook

Creating workflows with Compound Discoverer software

Compound Discoverer software provides an extensive, flexible, and customizable toolkit for processing your data. It includes pre-defined workflow templates, so you can be up and running instantly, or quickly adapt a template into a processing workflow designed specifically for your experiment.

Workflow editor and processing flow

The workflow editor is driven by templates that can be assembled into a workflow using drag-and-drop functionality. Your custom workflow then enables comprehensive data processing and visualization based upon your experimental needs.

Study manager

Study manager uses a wizard to guide you through the process of setting up a study, including defining sample types and other variables, such as multiple time points or sample variants (for example, different yeast strains).

Results analysis

Results analysis interactively links your data and allows you to choose how to visualize it. When you select samples or components, each view is automatically updated based upon your selection(s).

Annotation and biological interpretation

Whether you are identifying unknowns and annotating spectra, or performing flux analysis and mapping biological information, Compound Discoverer software allows you to easily report your results, store your data, and share your insights.

Compound Discoverer software benefits from the power of Thermo Scientific Orbitrap-based mass spectrometers, which deliver consistent, accurate, high-resolution data. This data enables the software to align components across samples, determine elemental compositions, make library matches and identify unknowns.

Image of fine isotopic information for darunavir
The consistent mass accuracy and high-resolution spectral data from Orbitrap-based MS systems enable fine isotopic information to be obtained, as shown above for the compound darunavir. The resolution and accuracy provide confidence in elemental composition assignments and subsequent library matching, which can be further confirmed using MS/MS fragmentation information.
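
Mass accuracy of this kind is usually expressed as an error in parts per million (ppm). The sketch below shows that arithmetic only; the function names and the 5 ppm default tolerance are illustrative assumptions, not settings taken from Compound Discoverer software.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Return the signed mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True if the measured m/z lies within tol_ppm of the theoretical m/z.

    The 5 ppm default is an assumed, illustrative tolerance.
    """
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # A peak measured 0.001 m/z units above a theoretical m/z of 500
    print(round(ppm_error(500.001, 500.0), 3))
    print(within_tolerance(500.001, 500.0))
```

A tighter tolerance leaves fewer candidate elemental compositions for a given measured mass, which is why consistent mass accuracy matters for formula assignment.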

Interpreting results and delivering insights

Studies, whether simple or extensive, produce complex data that contains a wealth of information. To get the most valuable insights from that data, Compound Discoverer software offers careful data processing, followed by insightful data reviewing and linking capabilities.

Whether you are conducting single-sample analysis or extensive large-sample studies, Compound Discoverer software provides everything you need for small-molecule unknown data processing, including:

  • Unknown peak detection
  • Advanced statistical tools
  • Interactive data visualization capabilities
  • Compound annotation tools
  • Integrated database and mass spectral libraries
  • Biochemical pathway mapping
  • Untargeted stable isotope labelling analysis
  • Normalization tools for large studies

Regardless of study sample size, each sample contains a wealth of raw data points. Some of those data points are related to one another and many are not. Making sense of this complex but high-quality MS and MS/MS information requires data reduction to reach meaningful insights.

Image illustrating how the data points from a file are aligned with detected features
Within each data file there can be many millions of individual data points, depending upon the relative complexity of the sample. To confidently obtain results for each sample, and across samples, the data points in each file must be aligned with the features detected; for example, a single compound with multiple isotopic peaks may also have numerous adducts. Once the raw data has been reduced into features, components can be assembled and identified.
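
To illustrate why one compound can produce several features, the sketch below computes the expected m/z of a few common positive-mode adducts for a neutral monoisotopic mass and labels the observed peaks that match. The adduct list, the 5 ppm tolerance, and the function names are assumptions made for illustration; the text does not describe how the software itself groups adducts.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass for singly charged
# positive-mode adducts (atomic masses minus one electron mass).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def adduct_mz(neutral_mass):
    """Expected m/z for each adduct of a compound with the given neutral mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFTS.items()}


def annotate_peaks(peaks_mz, neutral_mass, tol_ppm=5.0):
    """Map each observed m/z to the adduct it matches within tol_ppm (assumed)."""
    labels = {}
    for mz in peaks_mz:
        for name, expected in adduct_mz(neutral_mass).items():
            if abs(mz - expected) / expected * 1e6 <= tol_ppm:
                labels[mz] = name
    return labels


if __name__ == "__main__":
    caffeine = 194.080376  # monoisotopic mass of C8H10N4O2
    print(annotate_peaks([195.0877, 217.0696, 300.0], caffeine))
```

Peaks that share a retention time and match different adducts of the same neutral mass can then be treated as one component rather than as separate compounds.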

Workflows can be set up using drag-and-drop capabilities, using one of the multiple application-specific templates or editing one of those templates to make data processing quick and easy. Each processing step is accounted for by a ‘node’ within a given workflow tree, which can be connected to drive data processing and interpretation based upon your study requirements; new nodes can be created using a software developer kit or custom scripts like those from R or Python, and subsequently used with the Scripting Node, tailoring workflows to your needs.

Image of a pre-defined workflow template for untargeted metabolomics
An example of a pre-defined workflow template for untargeted metabolomics. This template is designed to find and identify differences between samples. Each node is linked and performs a specific task. Here, retention time alignment is performed before unknown compounds are detected and grouped across all samples within the study. Elemental compositions are predicted using the accurate mass data, with compounds identified using the mzCloud mass spectral library and MS/MS information. Where there is no match from mzCloud, ChemSpider is used. For results with a ChemSpider match, mzLogic is used to rank results by likelihood of a match. Resulting compounds are then mapped to biological pathways using Metabolika. If QC samples are present, normalization is performed and a subsequent differential analysis is calculated (t-test or ANOVA).

The Compound Discoverer software interface streamlines review of results by showing the information most relevant to the questions being asked; each plot and table is linked so that your view is instantly updated to reflect the compound or sample(s) that you are reviewing.

Image of the default layout for the compound results table
In the default layout of the compound results table, the chromatographic overlay (top left) shows the extracted ion chromatograms for each related adduct, as shown in the selected compound spectra (top right). The results table can be tailored to display the information relevant to your study, such as compound annotation, retention time, peak areas, statistical information, MS2 spectral match, and more.
An example of how Compound Discoverer allows you to open related tables
Compound Discoverer software allows you to open related tables to quickly access the information used to generate annotations. This figure shows the information related to searching the mzCloud mass spectral library (bottom) along with the related experimental and library fragmentation spectra displayed as a mirror plot (top right).
An image of volcano plots from differential analysis
From volcano plots from differential analysis (left), S-Plots from partial least squares discriminant analysis (middle), and hierarchical clustering analysis (right), it is easy to visualize complex data sets and determine what is statistically different using Compound Discoverer software. Each plot is active, so data points selected in the plot can be marked in the results tables and vice versa, helping determine the cause of observed differences or similarities and tracking compounds in complex data sets.
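
For readers unfamiliar with volcano plots, each compound is placed by its log2 fold change between two groups (x-axis) and the negative log10 of its p-value (y-axis). The sketch below shows that transformation; the thresholds and function names are illustrative assumptions, and the p-value is taken as an input rather than computed.

```python
import math


def volcano_point(mean_a, mean_b, p_value):
    """Return (log2 fold change of B over A, -log10 p-value) for one compound."""
    return math.log2(mean_b / mean_a), -math.log10(p_value)


def is_highlighted(mean_a, mean_b, p_value, min_abs_log2fc=1.0, max_p=0.05):
    """True if the compound passes both thresholds (assumed, illustrative values)."""
    log2fc, _ = volcano_point(mean_a, mean_b, p_value)
    return abs(log2fc) >= min_abs_log2fc and p_value <= max_p


if __name__ == "__main__":
    # Mean peak area rises fourfold between groups, with p = 0.001
    print(volcano_point(100.0, 400.0, 0.001))
    print(is_highlighted(100.0, 400.0, 0.001))
```

Compounds in the upper left and upper right corners of the plot are those with both a large change and a small p-value.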

Applications for Compound Discoverer software

Compound Discoverer software can be used for a variety of applications from metabolomics to environmental and food safety and drug development to forensic toxicology.

Metabolomic studies can be very complex, so ensuring acquisition of high-quality, comprehensive data is challenging, as is analyzing that data to gain insights. Ensuring complete sample coverage typically requires extensive manual work to create inclusion and exclusion lists for Data Dependent Acquisition (DDA) experiments.

AcquireX, an automated workflow, allows direct interrogation of all sample components through improved MS/MS sampling with automated background ion exclusion and data acquisition that focuses on true sample components.

A representation of the workflow for AcquireX
AcquireX generates an exclusion list from a blank run (matrix matched). Then, an injection of the sample, followed by feature detection and component assembly, populates the inclusion list with compounds detected in the samples. A series of iterative DDA injections follow. Each injection is informed from the previous one, minimizing redundant fragmentation spectra and maximizing relevant spectra and metabolite annotations.
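
The list handling described above can be sketched as a toy simulation. This is not the AcquireX implementation: the function name, the `top_n` limit per injection, and the exact-value matching of m/z are all simplifying assumptions chosen to show how each injection informs the next.

```python
def iterative_dda(blank_ions, sample_ions, top_n=2, max_injections=5):
    """Toy model of iterative DDA with inclusion and exclusion lists.

    blank_ions:  m/z values seen in the matrix-matched blank.
    sample_ions: m/z values detected in the sample, in priority order.
    Returns the ions fragmented in each injection and the final exclusion list.
    """
    exclusion = set(blank_ions)  # background ions are never fragmented
    inclusion = [mz for mz in sample_ions if mz not in exclusion]
    fragmented_per_injection = []

    for _ in range(max_injections):
        if not inclusion:
            break  # every sample component already has MS/MS
        picked = inclusion[:top_n]        # fragment the highest-priority ions
        fragmented_per_injection.append(picked)
        inclusion = inclusion[top_n:]     # the next injection skips them
        exclusion.update(picked)          # avoid redundant spectra later

    return fragmented_per_injection, exclusion


if __name__ == "__main__":
    runs, excluded = iterative_dda([100.1], [100.1, 200.2, 300.3, 400.4])
    print(runs)
    print(sorted(excluded))
```

In this toy model no ion is fragmented twice and the blank ion is never selected, which mirrors the stated goal of minimizing redundant spectra.
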
Bar charts showing improved detection of compounds when DDA is used with AcquireX
Using DDA with AcquireX significantly increases the number of unique compounds with high-quality fragmentation spectra, so you obtain a more comprehensive picture of what is in your samples, as well as increasing the depth of decision-making MS/MS information available.

Combining AcquireX with other enabling tools for Compound Discoverer software dramatically reduces the number of compounds without MS/MS spectra and significantly increases the number of compounds with confident identification and ranked putative identifications.

An illustration of how different enabling tools improve identifications with DDA and AcquireX
Using DDA with AcquireX improves data quality and creates a significant increase in the number of compounds with MS/MS spectra, resulting in improved mzLogic ranking and higher mzCloud similarity scores, ultimately providing higher overall confidence in compound identification and putative unknown identification.

Stable isotope labelling can assist with untargeted metabolomic studies, and Compound Discoverer software provides a range of data review and visualization tools to support this workflow. Compound Discoverer software automatically detects labelled compounds (isotopologues) based on formulas of unlabelled compounds found in reference file(s). Once processed, the exchange rate (or rate of incorporation) can be plotted to see the response across multiple files or overlaid onto Metabolika pathways.

An illustration of the workflow for data visualization of label incorporation and isotope distribution
Stable isotope labelling uses the high-resolution mass spectral data from Orbitrap-based MS, where isotopologues can easily be detected and the respective elemental compositions determined. Compound Discoverer software makes it easy to visualize the amount of label incorporation and resulting isotopic distribution with the ability to map powerful qualitative and quantitative flux analysis information directly onto biological pathways in Metabolika.
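
As a rough illustration of label incorporation, the sketch below turns isotopologue intensities (M+0, M+1, ... M+n) into fractional abundances and a mean enrichment. It assumes the intensities have already been corrected for natural isotope abundance, and it is a generic calculation; the text does not state how Compound Discoverer software defines its exchange rate.

```python
def isotopologue_fractions(intensities):
    """Fractional abundance of each isotopologue (M+0, M+1, ... M+n)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    return [value / total for value in intensities]


def mean_enrichment(intensities):
    """Average fraction of labelable positions that carry the label.

    Assumes the list covers M+0 through M+n, so n = len(intensities) - 1.
    """
    n_sites = len(intensities) - 1
    if n_sites < 1:
        raise ValueError("need at least M+0 and M+1")
    fractions = isotopologue_fractions(intensities)
    return sum(k * f for k, f in enumerate(fractions)) / n_sites


if __name__ == "__main__":
    areas = [50.0, 30.0, 20.0]  # hypothetical M+0, M+1, M+2 peak areas
    print(isotopologue_fractions(areas))
    print(mean_enrichment(areas))
```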

Compound Discoverer software includes structurally intelligent dealkylation/dearylation and general metabolism prediction capabilities that allow you to find, identify, and report metabolites of interest. Identification of impurities and degradation products follows similar workflows and relies on a range of software tools and customizable approaches to enable confident detection of related components in complex samples.

Used for structural annotation of fragmentation spectra, Fragment Ion Search (FISh) can localize the site of potential transformations in addition to enabling structural elucidation for unknowns.

A screenshot illustrating how the FISh scoring node enables annotation of fragment structures
The FISh scoring node enables fragment structure annotation and uses the Thermo Scientific HighChem Fragmentation Library for real data from more than 52,000 fragmentation schemes to help localize (bio)transformation. Exact matches are shown in green, with transformation-shifted matches highlighted in blue (above), showing how the site of transformation is identified.
An image illustrating how pattern scoring can help flag compounds that match user-specified isotopic patterns
Pattern scoring allows you to flag compounds that match user-specified natural or artificial isotopic patterns. Including additional traces, such as UV, PDA, CAD, and analog traces (for example, radiolabel traces, as shown above), helps ensure that as few potential metabolites, impurities, or degradants as possible are missed.

The Compound Class Scoring node provides another tool to ensure nothing is missed. It uses a set of representative fragments, created from one or more known molecules in a compound class, to identify other components that could be related or are from the same compound class.

Data illustrating how the compound class scoring node allows you to identify compounds that are structurally related
The Compound Class scoring node allows you to identify compounds that are structurally related, ensuring that nothing is missed, from metabolites to potentially toxic or harmful extractables, leachables or degradants.
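
One simple way to picture scoring against representative fragments is the fraction of class fragments found in a spectrum. The sketch below is an assumption made for illustration only: the scoring formula, the 0.005 Da tolerance, and the fragment values are hypothetical and are not taken from the Compound Class Scoring node.

```python
def class_score(spectrum_mz, class_fragments, tol_da=0.005):
    """Fraction of representative class fragments found in an MS/MS spectrum.

    spectrum_mz:     fragment m/z values observed for one compound.
    class_fragments: representative fragment m/z values for the class.
    tol_da:          absolute match tolerance in Da (assumed value).
    """
    if not class_fragments:
        raise ValueError("class_fragments must not be empty")
    matched = sum(
        1
        for fragment in class_fragments
        if any(abs(fragment - mz) <= tol_da for mz in spectrum_mz)
    )
    return matched / len(class_fragments)


if __name__ == "__main__":
    class_fragments = [91.0542, 119.0491, 65.0386, 77.0386]  # hypothetical set
    spectrum = [91.0540, 119.0490, 150.0000]
    print(class_score(spectrum, class_fragments))
```

Compounds with a high score share many characteristic fragments with the known members of the class and are candidates for closer review.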

Compound Discoverer software reduces the complexity of samples by reducing matrix interferences, as well as by targeting specific compound classes through their related mass defects, so you can review complex datasets and detect and identify compounds faster.

Data illustrating how MMDF can be used to simplify complex matrix samples
Upper left shows the Total Ion Chromatogram (TIC) for a sample in bile matrix, illustrating the potential complexity and matrix interferences present; bottom left shows the resulting trace following the use of Multiple Mass Defect Filtering (MMDF) and how it can be used to effectively simplify complex matrix samples such as bile, feces, blood, and plasma. The plot on the right demonstrates how the mass defect plot can be used to visualize data and mine using Kendrick formulas, for example unknown polymer identification. All data is interactively linked between plots and data tables within Compound Discoverer software to streamline data review.
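
The Kendrick approach mentioned above rescales masses so that members of a homologous series share the same mass defect. The sketch below uses the common CH2-based definition, with the defect taken as the rounded Kendrick mass minus the Kendrick mass; the 0.002 tolerance and the function names are illustrative assumptions.

```python
CH2_EXACT = 14.01565   # monoisotopic mass of CH2 in Da
CH2_NOMINAL = 14.0     # nominal mass of CH2


def kendrick_mass(mz):
    """Rescale a mass so that each CH2 unit weighs exactly 14."""
    return mz * CH2_NOMINAL / CH2_EXACT


def kendrick_mass_defect(mz):
    """Kendrick mass defect: rounded Kendrick mass minus Kendrick mass."""
    km = kendrick_mass(mz)
    return round(km) - km


def same_series(mz_a, mz_b, tol=0.002):
    """True if two masses share a Kendrick mass defect within tol (assumed)."""
    return abs(kendrick_mass_defect(mz_a) - kendrick_mass_defect(mz_b)) <= tol


if __name__ == "__main__":
    base = 200.1000
    homologue = base + CH2_EXACT  # one extra CH2 unit
    print(same_series(base, homologue))
    print(same_series(base, 200.2000))
```

Plotting the defect against the nominal Kendrick mass places a homologous series, such as a polymer, on a horizontal line.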

Compound Discoverer software can be used to analyze the metabolic fate and structural composition of food impurities and degradation products as well as detect environmental contaminants in soil and water. Once unknown compounds are identified in environmental and food safety studies, they often require high-throughput screening using either quadrupole or high-resolution MS-based techniques. Compound Discoverer software allows you to export your data directly to a new or existing mzVault library or targeted list to be used with Thermo Scientific TraceFinder software for screening and quantitation to reduce the burden of method transfer within your organization.

Compound Discoverer software detects unknown metabolites of drugs of abuse and structurally related designer drugs. For example, many new drugs contain similar structures, and the Compound Class Scoring node can be used to score detected compounds against common fragment ions, making it easier to find new drugs based upon characteristic fragments. This information can be transferred to screening methods to help you keep up with an ever-expanding array of new drugs and their metabolites. Once unknown compounds have been identified using any of the multiple workflows available within Compound Discoverer software, the data can be exported directly to a new or existing mzVault library, or to a targeted list that can be used with Thermo Scientific TraceFinder software for screening and quantitation using either quadrupole or high-resolution MS-based techniques.

Enabling tools for Compound Discoverer software

Several tools come into play when it comes to understanding and interpreting comprehensive data sets. Compound Discoverer software can access numerous online and offline resources, as well as use intelligent algorithms when there is no direct spectral match to help identify an unknown compound.

  • mzCloud, an extensive online advanced mass spectral fragmentation database
  • mzLogic, a data analysis algorithm that combines the millions of available structural databases with the extensive mass spectral fragmentation library of mzCloud to rank order putative structures for unknowns when there is no direct mass spectral match
  • mzVault, a repository that can be used when you do not have online access or need to use your own proprietary libraries. It provides access to the MS/MS-level content from mzCloud, or the ability to create custom, local libraries.
  • Statistical analysis and data normalization tools for uni- and multi-variate statistical analysis

All identified compounds can be linked through these tools, making it easy to select and export data to multiple different sources for use in the next stage of analysis.

An image illustrating how analysis can be streamlined using mzVault spectral libraries or TraceFinder software
You can direct creation or expansion of mzVault spectral libraries or TraceFinder software format lists using text lists, mass lists, inclusion and exclusion lists, enabling you to streamline your analyses and subsequent targeted screening and/or quantitative analysis with minimal effort.

Learn more about the powerful enabling tools available for Compound Discoverer


Covering a wide range of small molecule applications, the extensive structural and chemical diversity of mzCloud supports confident identification of unknowns.

Making use of exhaustive high-resolution MS/MS and multi-stage MSn spectra, combined with extensive metadata, the world's largest and most extensively curated LC-MSn reference spectral library delivers powerful unknown identification capabilities.

Identify more unknowns with MSn and SubTree search

More unknowns can be confidently identified with MSn and substructure spectral matching, utilizing the full power of structure retrieval from online databases or user provided structures.

How was the world's largest mass spectral fragmentation library, mzCloud, created?

The many precursor and MSn fragmentation spectra are logically organized into Spectral Trees for each compound within mzCloud. Each level of a spectral tree represents an MSn stage, where the top level starts at n=1, or the precursor spectra. Each level can contain numerous spectra, as data are acquired using various experimental conditions to ensure a broad and representative coverage of subsequent fragments, increasing the likelihood of high-quality search results.

A schematic representation of a spectral tree from mzCloud. The MS spectra are acquired for a given compound in multiple polarities (ESI +/-), and for a range of adducts. Each precursor is exhaustively fragmented using different fragmentation techniques (CID, HCD) and at multiple collision energies to produce collections of fragmentation spectra at each fragmentation level (MS2, MS3, MS4 etc.), generating a comprehensive spectral tree of information for each library entry.
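
The spectral tree described here can be pictured as a simple data structure in which spectra are grouped by MSn level and tagged with their acquisition conditions. The sketch below is illustrative only: the class and field names are assumptions, and the real mzCloud schema, including how each spectrum links to its parent, is not described in this text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Spectrum:
    ms_level: int                        # 1 = precursor (MS), 2 = MS2, ...
    polarity: str                        # "+" or "-"
    adduct: str                          # e.g. "[M+H]+"
    activation: Optional[str]            # "CID", "HCD", or None for MS1
    collision_energy: Optional[float]    # None for MS1
    peaks: List[Tuple[float, float]]     # (m/z, intensity) pairs


@dataclass
class SpectralTree:
    compound: str
    levels: Dict[int, List[Spectrum]] = field(default_factory=dict)

    def add(self, spectrum: Spectrum) -> None:
        """File a spectrum under its MSn level."""
        self.levels.setdefault(spectrum.ms_level, []).append(spectrum)

    def spectra_at(self, level: int) -> List[Spectrum]:
        """All spectra stored at one MSn level (empty list if none)."""
        return self.levels.get(level, [])

    def depth(self) -> int:
        """Deepest MSn level present in the tree (0 if empty)."""
        return max(self.levels, default=0)


if __name__ == "__main__":
    tree = SpectralTree("example compound")
    tree.add(Spectrum(1, "+", "[M+H]+", None, None, [(195.0877, 100.0)]))
    tree.add(Spectrum(2, "+", "[M+H]+", "HCD", 20.0, [(138.0662, 100.0)]))
    tree.add(Spectrum(2, "+", "[M+H]+", "CID", 35.0, [(138.0662, 80.0)]))
    print(tree.depth(), len(tree.spectra_at(2)))
```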

The extensive data for each library entry is critical for accurate compound identifications, matching experimentally obtained data to that of the library contents, with fit confidence and data visualization provided in the Compound Discoverer and Mass Frontier data analysis software packages. Additional tools include mzLogic, which uses the extensive fragmentation information to confidently identify unknowns that cannot be identified based upon the spectral library compound entries alone.


What happens when you don't get a match from your library search? You can still utilize the comprehensive fragmentation information contained within mzCloud! Through spectral similarity and sub-structural information (precursor ion fingerprinting), mzLogic can take all of this information and provide you with the best candidates for your true unknowns.

When small molecule unknowns don't provide a spectral hit, how can we still identify them?

Maximize your real fragmentation data by combining spectral library similarity searching with chemical database searching.
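
Spectral similarity searching can be illustrated with a generic cosine score between two fragmentation spectra. This is a common textbook measure used here as a stand-in; the scoring used by mzCloud and mzLogic is not described in this text, and the binning to two decimal places is an assumption.

```python
import math


def _binned(spectrum, decimals):
    """Sum intensities of peaks that fall into the same rounded m/z bin."""
    bins = {}
    for mz, intensity in spectrum:
        key = round(mz, decimals)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins


def cosine_similarity(spec_a, spec_b, decimals=2):
    """Cosine similarity (0 to 1) between two lists of (m/z, intensity) peaks."""
    a = _binned(spec_a, decimals)
    b = _binned(spec_b, decimals)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    query = [(91.05, 100.0), (119.05, 40.0)]
    library = [(91.05, 90.0), (119.05, 50.0), (65.04, 10.0)]
    print(round(cosine_similarity(query, library), 3))
```

A score near 1 means the two spectra share most of their intensity at the same m/z values, while unrelated spectra score near 0.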

Create, edit and search reaction pathways with Metabolika. With publication-quality graphical functionality to create and edit reaction pathways, and more than 370 curated and annotated biochemical pathways for a range of organisms included, you can easily share your pathway knowledge.

The information in Metabolika is also used for fragmentation prediction and mzLogic, further increasing the chances of unknown compound identification.

Image of a biological pathway
An example of a biological pathway included in Metabolika, which can be edited or added to.

Additionally, for stable isotope labeling analyses, you can include your exchange rate (or rate of incorporation) in Metabolika to give a more comprehensive view of your pathway.

An image showing the overlay of exchange rate information
Combining stable isotope labelling with the visualization capabilities of Metabolika allows you to overlay exchange rate information, providing a highly visual way of reviewing and reporting qualitative flux analyses.

In addition to Metabolika, Compound Discoverer software supports both KEGG and BioCyc biological pathway databases. Compound mapping can be shown in two different ways: Context-specific, i.e., looking at a specific compound, you can see what pathways this compound was mapped to, or you can use the global view where you start from the list of pathways and visualize all compounds that were mapped to a given pathway. Detected compounds can be confirmed using mzCloud, for example, with the resulting data color-coded on the embedded pathways.

Your data has inherent value, as it is the knowledge that you acquire. mzVault provides you the capabilities to access and search the MS2-level spectral data from mzCloud off-line, or to store your own spectral library information. Spectral information, and your knowledge, can be automatically sent from Compound Discoverer into a new, or existing library, which can then be searched using Compound Discoverer or TraceFinder software, or edited using Thermo Scientific Mass Frontier software.

Even with extensive online structural databases, and mzLogic to propose a structure or sub-structure, unknowns may sometimes remain unknown. It can be useful to store this information alongside your libraries of previously identified, proprietary compounds, and use it to answer the question, “Have I seen this before?”

For many applications, Compound Discoverer software provides the means to confidently identify unknowns from novel environmental contaminants to designer drugs and metabolites. The next step for some of these applications can be higher throughput identification and/or quantitation using quadrupole or high-resolution MS with TraceFinder software, or further analysis with third party packages.

Good experimental design is critically important for any analysis, especially for statistical studies, to ensure that any trends observed reflect real changes rather than experimental effects. For this reason, there are protocols for large-scale studies in which pooled quality control (QC) samples are used to normalize the data.

A comparison of two scatter plots of QC samples
Using pooled QC samples, which are analyzed throughout data acquisition, allows for the correction of batch effects over time. Correction for each sample is performed individually: the upper plot shows a curve fitted to the QC samples, and the bottom plot shows the resulting data set after correction. This capability is based upon a peer-reviewed methodology published by Dunn et al. in Nature Protocols. Compound Discoverer software also lets you view the impact of any changes to the data pre- and post-normalization according to this protocol.
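
The idea of QC-based drift correction can be sketched for a single compound as follows. This is a deliberately simplified stand-in: it joins the QC points with straight lines instead of fitting a smoothed curve as in the published protocol, it assumes injections are listed in acquisition order with at least two QC injections, and the function names are hypothetical.

```python
from statistics import median


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; values outside the QC range are clamped."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            return ys[i] + (ys[i + 1] - ys[i]) * (x - x0) / (x1 - x0)
    raise ValueError("xs must be sorted in ascending order")


def qc_correct(order, intensity, is_qc):
    """Rescale each injection by the QC trend at its position in the run.

    order:     injection order (ascending).
    intensity: peak area of one compound in each injection.
    is_qc:     True where the injection is a pooled QC sample.
    """
    qc_x = [o for o, q in zip(order, is_qc) if q]
    qc_y = [v for v, q in zip(intensity, is_qc) if q]
    target = median(qc_y)  # level that the QC samples are corrected to
    return [v * target / interpolate(o, qc_x, qc_y)
            for o, v in zip(order, intensity)]


if __name__ == "__main__":
    order = [1, 2, 3, 4, 5]
    areas = [100.0, 90.0, 80.0, 70.0, 60.0]     # signal drifting downward
    flags = [True, False, True, False, True]    # every other injection is a QC
    print(qc_correct(order, areas, flags))
```

In this toy run the downward drift is removed and every injection is rescaled to the median QC level.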

An extensive suite of powerful statistical tools within Compound Discoverer software is fully linked to help you understand which compounds or groups of data change, and by how much.

Statistical analysis can be used across a range of different analyses from metabolomic, environmental, food safety and adulteration, forensics, clinical, impurities, and extractable and leachable studies. The capability to perform a range of univariate and multivariate analyses from differential analysis, ANOVA, PCA through to PLS-DA, and combining the output from these tools with the results from compound identification through workflows in a highly graphical and interactive way provides deep insights into your data which can easily be reported and shared.

A screenshot illustrating how different views of the data can be connected in Compound Discoverer
Demonstrating the connectivity of data within Compound Discoverer software, the data points highlighted by blue circles in the Volcano Plot (bottom right) are selected within the Compound Table (bottom left). Selecting any compound in any plot automatically updates all plots to show the relevant data. The interconnected tools enable you to rapidly identify differences and the compounds or groups of compounds responsible for those differences. Additionally, it streamlines follow-on confirmation by giving you the ability to filter and review the relevant data.

Compound Discoverer software offers multiple ways to visualize complex data sets and relationships, giving you the ability to add multiple plots across monitors to track and view these relationships and better understand your data.

Data displayed in three different formats, an S plot, hierarchical clustering and a box whisker chart
Techniques range from Principal Components Analysis (PCA) for unbiased review of data to supervised techniques like Partial Least Squares Discriminant Analysis (PLS-DA), where S-Plots (left) identify compounds that give rise to any observed grouping of samples. Hierarchical Clustering (center) shows the clustering of samples along the x-axis and of compounds along the y-axis, and provides user-configurable heat mapping to visualize any clustering. Box Whisker Charts (right) allow visualization by groupings, time points, and more, with dynamic.

Once complex data sets are thoroughly reviewed, and the components that give rise to differences are evaluated, more substantial analyses may be required in order to verify that the changes/differences are caused by the identified compounds. Checked compounds can easily be exported from Compound Discoverer software to a range of different outputs to facilitate additional analysis. For more information see the “Custom, local libraries and data transfer” section.

Exploring the relationships among your compounds can reveal additional information and insights into your data sets. With Molecular Networks, you can interactively explore relationships between compounds in your analysis based on transformation and spectral similarity, for example a range of Phase I and Phase II transformations.

Relationships between compounds visualized using Molecular Networks
The fully interactive Molecular Networks visualization browser allows you to view your data in a different way. Identified compounds are shown by nodes (circles) and when a relationship is identified, the nodes are connected. Selecting a node (compound) or connection (transformation) displays pertinent information (right) about the identified compound and the relevant transformation(s). All of the visualized data can be interactively filtered using thresholds, data quality information or text search for specific compounds or transformations.

Fragment Ion Search, or FISh, provides fast screening of structurally similar compounds based on the fragmentation pattern of the parent compound, obtained either by theoretical fragment prediction or from experimental MSn data. The parent compound structure and its potential metabolites are used to filter out the majority of matrix-related background ions, making identification of relevant compounds quick and easy. FISh provides extensive lists of Phase I and Phase II biotransformations as well as the ability to build customized lists.

An image illustrating the data output from Fragment Ion Search
In addition to filtering the structurally similar compounds, FISh automatically localizes transformation sites, labels, and applies color-coding to fragments common to the parent and filtered results. In the Mirror Plot example shown here, exact matches to proposed metabolite fragments are shown in green and transformation shifted matches are blue.
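
The distinction between exact and transformation-shifted matches can be illustrated with a small sketch. It only compares observed fragment m/z values against a list of parent fragments with and without a mass shift; it does not reproduce the FISh algorithm, and the tolerance, fragment values, and function name are assumptions.

```python
OXIDATION_SHIFT = 15.9949  # monoisotopic mass of one added oxygen atom, in Da


def classify_fragments(observed_mz, parent_fragments, shift, tol_da=0.005):
    """Label each observed fragment as 'exact', 'shifted', or 'unmatched'.

    'exact'   : matches a parent fragment (does not carry the transformation).
    'shifted' : matches a parent fragment plus the transformation mass shift.
    """
    labels = []
    for mz in observed_mz:
        if any(abs(mz - frag) <= tol_da for frag in parent_fragments):
            labels.append((mz, "exact"))
        elif any(abs(mz - (frag + shift)) <= tol_da for frag in parent_fragments):
            labels.append((mz, "shifted"))
        else:
            labels.append((mz, "unmatched"))
    return labels


if __name__ == "__main__":
    parent = [105.0335, 150.0550]              # hypothetical parent fragments
    metabolite = [105.0336, 166.0499, 200.0]   # hypothetical metabolite spectrum
    for mz, label in classify_fragments(metabolite, parent, OXIDATION_SHIFT):
        print(mz, label)
```

Fragments that appear shifted contain the modified part of the molecule, while fragments that match exactly do not, which is how the site of a transformation can be narrowed down.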

With the inclusion of the HighChem Fragmentation Library, which contains information from more than 52,000 fragmentation schemes, 217,000 individual reactions, 256,000 chemical structures and 216,000 decoded mechanisms from peer-reviewed literature, FISh is a powerful tool that helps make structural assignments for putative metabolites or other potential structures. FISh uses real data to provide greater confidence when proposing fragment structures for putative structures, and it calculates a score to describe how well the fragmentation data can be explained by a given structural candidate.

Delivering Confidence for Small Molecule Identification

In this white paper we address the challenges in small molecule identification with mass spectral libraries. mzCloud spectral libraries and mzVault software are designed to address the challenges of small molecule identification for routine and research applications.

Watch the videos, below, to learn more about the powerful features of Compound Discoverer software from our users and scientists.

Connect your lab to drive more insights from your data

Small molecule characterization and identification clouding your decision making? Cloud-based technologies, including mass spectrometry analysis software, are becoming more prevalent in laboratories. Solve tomorrow's problems today.

Learn how

Ordering guide for Compound Discoverer software

Mass Spectrometry Support Center

One resource for all your support needs related to mass spectrometry instruments and software. Obtain relevant technical information, view tips and tricks when starting an experiment, and/or find answers to some common problems.



Additional resources