Toxoplasma gondii and Neospora caninum: Improving proteomic annotations

Toxoplasma gondii and Neospora caninum are troublesome parasites that infect humans and other animals. Because these pathogens can cause significant health problems, researchers are working to better understand their complex life cycles. Much about these two apicomplexan pathogens remains unknown at the proteomic level. Past research has achieved an estimated 68% proteomic coverage for T. gondii.¹ Coverage for N. caninum is sparser, with only a handful of published studies, including one article that identified 26 proteins differentially expressed during tachyzoite-to-bradyzoite differentiation.²

Krishna and colleagues sought to improve the proteomic annotations for T. gondii and N. caninum.³ The team drew on existing data from eight experiments previously annotated using tandem mass spectrometry (MS/MS). They also included official gene models and high-quality RNA-Seq-assisted predictions from EuPathDB. Finally, the team generated new high-throughput MS/MS data sets for both T. gondii and N. caninum.

To create new data sets for tachyzoites of the T. gondii RH strain and the N. caninum LIV strain, the team relied on data from a previously published study. That study used one-dimensional electrophoresis (1-DE), 1-DE of soluble and insoluble fractions (1-DE SFIF), two-dimensional electrophoresis (2-DE) and multidimensional protein identification technology (MudPIT). The team obtained MS/MS spectra by liquid chromatography–tandem mass spectrometry (LC-MS/MS) on an LTQ ion trap mass spectrometer coupled online to a Dionex Ultimate 3000 LC System (both from Thermo Scientific).

In the end, Krishna et al. identified the largest set of proteins by MS/MS. At a 1% peptide false discovery rate (FDR) threshold, they identified 201,996 and 39,953 peptide-spectrum matches for T. gondii and N. caninum, respectively. This equated to 30,494 distinct peptide sequences and 2,921 proteins (matches to official gene models) for T. gondii, and 8,911 peptides and 1,273 proteins for N. caninum.
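To put these counts in perspective, the short sketch below derives two rough depth measures from the reported figures: the average number of peptide-spectrum matches per distinct peptide, and the average number of distinct peptides per identified protein. The variable and function names are illustrative, and the ratios are simple back-of-the-envelope averages computed here, not values reported by Krishna et al.

```python
# Counts reported in the article at a 1% peptide FDR threshold.
counts = {
    "T. gondii": {"psms": 201_996, "peptides": 30_494, "proteins": 2_921},
    "N. caninum": {"psms": 39_953, "peptides": 8_911, "proteins": 1_273},
}


def depth_ratios(entry):
    """Return (PSMs per distinct peptide, distinct peptides per protein)."""
    psms_per_peptide = entry["psms"] / entry["peptides"]
    peptides_per_protein = entry["peptides"] / entry["proteins"]
    return psms_per_peptide, peptides_per_protein


# Hypothetical helper structure: species name -> (ratio, ratio).
ratios = {species: depth_ratios(entry) for species, entry in counts.items()}

for species, (psms_per_peptide, peptides_per_protein) in ratios.items():
    print(
        f"{species}: {psms_per_peptide:.1f} PSMs per peptide, "
        f"{peptides_per_protein:.1f} peptides per protein"
    )
```

Because a peptide can map to more than one protein, the peptides-per-protein figure is only a crude average and should not be read as per-protein sequence coverage.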

The team also discovered that some potential loci are supported by RNA-Seq and proteomics data but either have incorrect gene models or are missing completely from the current annotation. In total, at a 5% FDR threshold, the team found 289 potential loci and 191 proteins for T. gondii, and 140 potential loci and 101 proteins for N. caninum.

Krishna et al. predict that integrating genomics and proteomics data will help produce more accurate gene models by pinpointing regions in need of revision.

References

1. Wastling, J. M., et al. (2012) “Parasites, proteomes and systems: has Descartes’ clock run out of time?” Parasitology, 139 (pp. 1103–1118).

2. Marugan-Hernandez, V., et al. (2010) “Identification of Neospora caninum proteins regulated during the differentiation process from tachyzoite to bradyzoite stage by DIGE,” Proteomics, 10 (pp. 1740–1750).

3. Krishna, R., et al. (2015) “A large-scale proteogenomics study of apicomplexan pathogens-Toxoplasma gondii and Neospora caninum,” Proteomics, 15(15) (pp. 2618–2628), doi: 10.1002/pmic.201400553.