The Center for Functional Genomics (CFG) at the University at Albany, State University of New York, is on a mission to empower scientists and researchers around the world to achieve the best results from their experiments. With a commitment to excellence and a team of dedicated professionals, CFG has become a pivotal resource for bioscience researchers from various sectors, including universities, government agencies, and private companies.
In this scientist spotlight article, we delve into the experiences of two key personnel at CFG, Sridar Chittur and Andrew Hayden, who share their unique journeys, the diverse range of projects they handle, and their insights into the relevance of Sanger sequencing in today’s scientific landscape. They also discuss their evaluation of Smart Deep Basecaller and the lessons they learned from it.
Sridar Chittur and Andrew Hayden’s Remarkable Journeys
Sridar Chittur and Andrew Hayden have both embarked on extraordinary journeys in the field of science. Chittur, originally trained as a pharmacist in India, transitioned to biochemistry and molecular biology after realizing the limitations of the pharmaceutical industry. He gained extensive experience in diverse scientific projects during his academic and professional career, ultimately finding a home at CFG, where he has been at the forefront of the evolution of genomics technologies for over two decades. Chittur emphasizes the enduring significance of Sanger sequencing, particularly in projects demanding precision and accuracy.
Similarly, Andrew Hayden’s path to scientific expertise was unconventional, beginning with careers as a carpenter and cook. However, a college program ignited his passion for molecular biology, leading him to CFG through an internship at the University at Albany. Hayden’s remarkable ability to adapt and excel in various roles has made him an invaluable asset at CFG, primarily specializing in Sanger sequencing for tasks such as confirming plasmid constructs and supporting graduate students in completing their theses.
About the Center for Functional Genomics’s work
CFG engages in a wide range of projects, spanning from routine tasks to highly specialized endeavors. A key focus for CFG is to aid researchers in tackling custom projects with unique challenges, such as optimizing Sanger sequencing methods for graduate students analyzing long noncoding RNAs. Another notable collaboration involved creating a custom fragment analysis assay to detect triploidy in agricultural fish species by adapting existing protocols and optimizing methods using the 3730 DNA Analyzer.
Introducing Smart Deep Basecaller, the latest innovation for Sanger sequencing
The introduction of the Smart Deep Basecaller (SDB) has proven invaluable in resolving complex sequences, simplifying confirmation work that was previously hampered by secondary structures and dye blobs. SDB also provides increased read lengths and more accurate pure and mixed basecalls.
What sets SDB apart is its use of artificial intelligence and machine learning for DNA sequencing analysis. Unlike traditional methods, SDB uses deep learning algorithms to extract more accurate and reliable genetic information from sequencing data. It navigates the intricacies of complex sequences, addresses challenges posed by secondary structures and dye anomalies, and optimizes the analysis of diverse DNA fragments.
Beta testing Smart Deep Basecaller
Smart Deep Basecaller was validated by direct comparison with the standard KB basecaller software under a battery of tests to evaluate read length, accuracy, and basecalling within difficult regions.
Hayden initiated the study by setting up a control plate of 96 replicate wells containing an ABI sequencing standard (pGEM-3Zf(+) plasmid). He also included large 5’ and 3’ RACE cDNA products from lncRNAs (1-7 kb) with substantial secondary structure and repetitive sequences, as well as short PCR amplicons (~100 bp).
For added confidence in the results, he ran the pGEM-3Zf(+) experiment twice to obtain technical replicates. He first analyzed every well with KB and then reanalyzed the same wells with the Smart Deep Basecaller. The key focus of the analysis was to compare the average and median number of bases called, as well as the quality scores, across all samples. Remarkably, SDB consistently provided better data quality than KB. It resolved basecalls within dye blobs and improved basecalling at the extreme 5’ and 3’ ends of PCR products. This results in longer, high-quality sequences for plasmid constructs.
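The comparison described above (average and median bases called, plus quality scores, for the same wells under each basecaller) can be sketched in a few lines of Python. This is only an illustrative outline, not the analysis pipeline CFG used: the function names are hypothetical, and the per-well numbers are made-up placeholders rather than data from the study.

```python
from statistics import mean, median

def summarize(read_lengths, mean_qvs):
    """Summarize one basecaller: average/median bases called and average QV."""
    return {
        "mean_bases": mean(read_lengths),
        "median_bases": median(read_lengths),
        "mean_qv": mean(mean_qvs),
    }

def compare(kb, sdb):
    """Return SDB minus KB for every summary metric."""
    return {key: sdb[key] - kb[key] for key in kb}

# Made-up placeholder values for four wells; NOT results from the CFG study.
kb_lengths = [800, 820, 790, 810]      # bases called per well (KB)
kb_qvs = [40.0, 42.0, 38.0, 40.0]      # mean QV per well (KB)
sdb_lengths = [850, 870, 840, 860]     # bases called per well (SDB)
sdb_qvs = [44.0, 46.0, 42.0, 44.0]     # mean QV per well (SDB)

kb_summary = summarize(kb_lengths, kb_qvs)
sdb_summary = summarize(sdb_lengths, sdb_qvs)
delta = compare(kb_summary, sdb_summary)

print("KB: ", kb_summary)
print("SDB:", sdb_summary)
print("SDB - KB:", delta)
```

Because both basecallers are run on the same wells, a positive difference in each metric indicates that the reanalysis called more bases or assigned higher quality to the same raw data.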
Figure 1. The 3’ end of the electropherogram for the same sample, basecalled with KB and with SDB. Figure 1A (top) shows the KB result. Figure 1B (bottom) shows the SDB result, with higher quality values (QVs) at the 3’ end, resulting in a longer contiguous read length (CRL).
The role of Smart Deep Basecaller
SDB has emerged as a game-changer for CFG, enhancing the accuracy and efficiency of their sequencing projects. Chittur emphasizes that SDB’s robustness is particularly valuable when dealing with homopolymers, simplifying the analysis of complex sequences. It has proven crucial in resolving sequences with dye blobs caused by secondary structures, significantly reducing manual annotation efforts.
Hayden highlights its effectiveness in handling very short PCR products, especially in calling the 5’ and 3’ ends of sequences, even for challenging samples such as PCR products from different sugar cane varieties. SDB also plays a pivotal role in data recovery. The sugar cane PCR products were irreplaceable, as no additional sample material was available. Initially, KB encountered substantial difficulties in basecalling these short PCR fragments. However, when the same samples were reanalyzed using SDB, an impressive 95% of the sequence was successfully captured.
Hayden also discussed SDB’s efficiency in the context of assisting a student with a PhD project involving the analysis of long noncoding RNA. They encountered a persistent issue of dye blobs in their sequencing data, despite employing various purification methods and using high-purity starting materials. This dye retention problem significantly disrupted the sequencing process, necessitating painstaking manual annotation of each sequence, which was both time-consuming and inefficient. Fortunately, towards the project’s conclusion, the Smart Deep Basecaller became available and proved to be a valuable tool. According to Hayden, the ability to validate visually observed results with a different basecalling method saved both time and cost.
Figure 2. Basecalling results in a dye blob region using KB and SDB. The KB results are shown in Figure 2A (top). The SDB results are shown in Figure 2B (bottom), with higher QVs in the dye blob region compared to the KB results.
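To put the QV differences in context, the short Python sketch below converts a quality value to an estimated basecall error probability using the standard Phred relation, QV = -10 × log10(P), and averages QVs over a region of a read. It assumes both basecallers report Phred-scaled QVs; the function names are hypothetical, and the QV lists are made-up placeholders, not values read from Figure 2.

```python
def error_probability(qv):
    """Phred relation: QV = -10 * log10(P), so P = 10 ** (-QV / 10)."""
    return 10 ** (-qv / 10)

def mean_qv(qvs, start, end):
    """Average QV over bases start..end-1 (0-based, end exclusive)."""
    region = qvs[start:end]
    return sum(region) / len(region)

# Made-up placeholder QVs for a 7-base stretch; NOT values from Figure 2.
kb_qvs = [50, 48, 12, 8, 10, 45, 50]
sdb_qvs = [50, 50, 30, 28, 32, 48, 50]
blob_start, blob_end = 2, 5  # hypothetical dye blob region

kb_region = mean_qv(kb_qvs, blob_start, blob_end)
sdb_region = mean_qv(sdb_qvs, blob_start, blob_end)

print(f"KB  mean QV in region: {kb_region:.1f} "
      f"(error probability ~{error_probability(kb_region):.3f})")
print(f"SDB mean QV in region: {sdb_region:.1f} "
      f"(error probability ~{error_probability(sdb_region):.3f})")
```

Under this relation, every 10-point increase in QV corresponds to a tenfold lower estimated error probability, which is why higher QVs in a dye blob region reduce the need for manual review.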
How CFG and SDB empower scientific advancements
In conclusion, CFG is dedicated to supporting scientists and researchers worldwide in achieving their research goals. Through the commitment to excellence and the contributions of individuals like Sridar Chittur and Andrew Hayden, CFG has become a crucial resource for the bioscience community.
SDB has emerged as a game-changer for CFG, enhancing accuracy and efficiency in sequencing projects. It excels in handling various challenges, from homopolymers to resolving sequences with dye blobs caused by secondary structures. Its role in data recovery, particularly for irreplaceable samples, cannot be overstated.
Beta testing of Smart Deep Basecaller demonstrated its consistent superiority over traditional basecalling methods, yielding more bases resolved and higher quality metrics. CFG’s journey with Sanger sequencing and its integration of innovative tools like SDB showcase the center’s dedication to advancing scientific research and empowering researchers worldwide.