New Paper "CryptoGenotyper: A new bioinformatics tool for rapid Cryptosporidium identification"


Abstract - Cryptosporidium is a protozoan parasite that causes the enteric disease cryptosporidiosis and is transmitted to both humans and animals. To understand the transmission dynamics of Cryptosporidium, the small subunit (SSU or 18S) rRNA and gp60 genes are commonly studied through PCR analysis and conventional Sanger sequencing. However, analyzing sequence chromatograms manually is both time-consuming and prone to human error, especially in the presence of poorly resolved, heterozygous peaks and the absence of a validated database. Here we present a Cryptosporidium genotyping tool, called the CryptoGenotyper, which can read raw Sanger sequencing data for the two common Cryptosporidium gene targets (SSU rRNA and gp60) and classify the sequence data into standard nomenclature.
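
To make the problem of heterozygous peaks concrete, the following is a minimal sketch of how a mixed position in a chromatogram might be flagged. It is not the CryptoGenotyper's algorithm, which the abstract does not describe. The function names `call_base` and `call_sequence`, the 0.3 secondary-to-primary peak ratio, and the toy peak heights are all assumptions made for illustration. The only established fact used is the standard IUPAC two-base ambiguity codes.

```python
# Illustrative sketch only: NOT the CryptoGenotyper implementation.
# The threshold and the trace values below are hypothetical.

# Standard IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_base(peaks, min_ratio=0.3):
    """Call one position from a dict mapping base -> peak height.

    If the second-highest peak is at least `min_ratio` of the highest,
    the position is reported as a two-base IUPAC ambiguity code.
    """
    ranked = sorted(peaks.items(), key=lambda kv: kv[1], reverse=True)
    (top_base, top_height), (second_base, second_height) = ranked[0], ranked[1]
    if top_height <= 0:
        return "N"  # no usable signal at this position
    if second_height / top_height >= min_ratio:
        return IUPAC[frozenset((top_base, second_base))]
    return top_base


def call_sequence(trace, min_ratio=0.3):
    """Call every position of a trace (a list of per-position peak dicts)."""
    return "".join(call_base(peaks, min_ratio) for peaks in trace)


# Toy trace: positions 2 and 4 carry two strong overlapping peaks.
trace = [
    {"A": 900, "C": 20, "G": 15, "T": 10},   # clean A
    {"A": 500, "C": 10, "G": 450, "T": 12},  # mixed A/G
    {"A": 8, "C": 30, "G": 12, "T": 700},    # clean T
    {"A": 5, "C": 400, "G": 9, "T": 380},    # mixed C/T
]
print(call_sequence(trace))  # ARTY
```

A fixed ratio is the simplest possible rule. A real tool would also need to handle background noise, dye blobs, and peak misalignment, which is part of what makes manual reading of such traces error-prone.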

Results - We tested and validated the CryptoGenotyper (available at https://usegalaxy.eu/root?tool_id=CryptoGenotyper) on sequencing data from animal, water, and clinical specimens collected from Canada, the United Kingdom, and Sweden. The CryptoGenotyper successfully genotyped 99.3% (428/431) of SSU rRNA chromatograms containing a single Cryptosporidium species. The incorporated heterozygous base-calling algorithms allowed 95.1% (154/162) of SSU rRNA chromatograms containing mixed sequences of two different species, or variants of the same species, to be resolved and classified individually. Furthermore, the CryptoGenotyper correctly subtyped 95.6% (947/991) of gp60 chromatograms.
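
The reported percentages follow directly from the counts given above. The short check below recomputes them; the dictionary labels and the helper `success_rate` are names chosen here for illustration, while the counts are taken from the Results text.

```python
# Recompute the success rates from the counts stated in the Results.
results = {
    "SSU rRNA, single species": (428, 431),
    "SSU rRNA, mixed sequences": (154, 162),
    "gp60 subtyping": (947, 991),
}


def success_rate(correct, total):
    """Return the percentage of correctly classified chromatograms, to one decimal."""
    return round(100 * correct / total, 1)


for label, (correct, total) in results.items():
    print(f"{label}: {success_rate(correct, total)}% ({correct}/{total})")
```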

Conclusion - We have developed the user-friendly tool CryptoGenotyper to help Cryptosporidium research groups analyze their Sanger sequencing results in a fast and reproducible manner. We have shown that this tool can perform quality control and properly classify sequences using a high-quality, manually curated reference database to characterize the genotype and subtype (based on the SSU rRNA or gp60 locus, respectively), even in the presence of mixed infections or artifacts in the sequencing itself. Accurately identifying these genotypes and subtypes will provide further insights into the transmission dynamics of Cryptosporidium as valuable data are recovered from these Sanger sequencing chromatograms.