The field of next-generation sequencing (NGS) has come a long way since the Human Genome Project. The first human genome sequencing process took 13 years to complete with a $1 billion budget. Today, an entire human genome can be sequenced in 1-2 days, costing only a few thousand dollars per genome. Thanks to its low-cost, high-throughput capabilities, NGS has contributed to the rise in population-scale genome projects in recent times. For instance, in 2018, the UK completed its initiative of sequencing 100,000 genomes to improve knowledge on the genetics of cancer, rare diseases and infectious diseases.
Comprehensive data obtained from NGS empowers us to make deeper connections between phenotypes, genotypes and environmental factors affecting diseases. Access to this large-scale information has helped advance precision medicine, making individualized treatments possible.
With a steady growth in NGS-based applications, the scientific community now faces a new challenge: copious amounts of NGS data. As thousands of samples get processed in the quest for genetic information, the resulting voluminous files of highly sensitive data need to be managed responsibly. Here, we discuss the information challenges facing genomics laboratories and provide solutions to better manage data without compromising on productivity.
NGS Data: With Great Data Comes Greater Responsibility
Given the invaluable potential of NGS-based techniques in advancing personalized medicine, a large number of pharmaceutical and biotechnology organizations now use sequencing techniques. This increase in information-heavy research brings an urgent need to manage data in a responsible manner. Research teams are often not trained in good data management practices, which can leave them overwhelmed or confused when handling large amounts of information. Instances of data integrity infringements, often a result of poor practices, have prompted regulatory authorities to closely scrutinize procedures in pharmaceutical facilities.
Key data-related challenges faced by laboratories performing NGS include:
- Handling and storing large amounts of sequencing data
- Managing complex workflows
- Maintaining user records and the chain of custody
- Facilitating communication across different teams
- Ensuring data is secure
- Avoiding the creation of data silos
The rise in data-heavy research methods has made it necessary for the scientific community to consider robust and reliable data management strategies that facilitate easy data storage, while maintaining integrity and security. Having efficient data management systems also means easier access to relevant information, improved workflows and seamless collaboration.
Manage Information in the Cloud: An Easy Way to Store, Access and Use Data
To address data challenges, scientists are now employing laboratory information management systems (LIMS) as a part of experimental workflows. Rather than having each team member manually save data, risking files being overlooked due to mismatched nomenclature, these information management systems work ‘behind the scenes’ to simplify and streamline the data obtained from large-scale experiments. For example, the Thermo Fisher™ Platform for Science™ software enables researchers to collect, store, access and share scientific data on the cloud.
The integrated platform automates data acquisition, enables analysis and stores data in an organized, searchable format. When required, scientists can search and retrieve data to be used for further analysis and even make data-driven decisions to design new hypotheses. By incorporating data automation into NGS, laboratories can free up personnel time to focus on other tasks, while maintaining data accuracy and consistency across projects. With a low cost of ownership, LIMS can make teams more productive and workflows more efficient, thereby yielding a positive return on investment.
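To make the idea of an "organized, searchable format" concrete, the following is a minimal sketch of a run catalog built on Python's standard `sqlite3` module. It is purely illustrative and is not how any particular LIMS is implemented; the table layout and the names `create_catalog`, `register_run` and `find_runs` are assumptions introduced for this example.

```python
import sqlite3


def create_catalog():
    """Create an in-memory catalog of sequencing runs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE runs (
            run_id    TEXT PRIMARY KEY,
            sample_id TEXT NOT NULL,
            project   TEXT NOT NULL,
            file_path TEXT NOT NULL
        )
        """
    )
    return conn


def register_run(conn, run_id, sample_id, project, file_path):
    """Record one run; the primary key rejects duplicate run IDs."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?, ?)",
            (run_id, sample_id, project, file_path),
        )


def find_runs(conn, project):
    """Return (run_id, sample_id, file_path) tuples for a project, sorted by run ID."""
    cursor = conn.execute(
        "SELECT run_id, sample_id, file_path FROM runs "
        "WHERE project = ? ORDER BY run_id",
        (project,),
    )
    return cursor.fetchall()
```

Registering every run through one function, rather than relying on each person's file naming, is what keeps the records consistent enough to be searched later by project or sample.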
Using a cloud-based platform brings the added advantage of flexibility. If needed, additional storage can be added without building any infrastructure or buying additional hardware. As such, the flexible and scalable nature of cloud-based informatics makes it easier for organizations to grow their capabilities, expand their operations and take on bigger projects without investing in additional resources or worrying about IT operations.
Sharing Data Across Teams While Upholding Data Integrity
The increasing volume of patient-derived data generated in precision medicine research and the rise in population-level studies further emphasize the need for efficient data management across teams. They also highlight our responsibility to safeguard sensitive and personal information.
In large-scale NGS studies involving multi-disciplinary teams, each researcher needs access to the data pool, so having information stored, organized and managed efficiently can facilitate seamless collaboration. Modern cloud-based informatics platforms store all generated data in one central location, presented in an easily searchable format, allowing scientists to access and review data as required. The software, configured to suit the needs of a laboratory, can complement the specific workflows and unique data needs of a project.
When data can be accessed by multiple users, it is also important to ensure that it is not misused. Cloud-based LIMS create an environment of accountability with built-in features to record all user interactions. Authorized personnel can access information at any time to review the process and stay updated on real-time activities. With systems in place to uphold data integrity and enable officials to oversee projects, any unusual or non-compliant practices can be identified, flagged and addressed. From the point of data acquisition to report creation, a set of audit trail features within the software ensures traceability of data, helping laboratories comply with regulatory requirements.
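One common way to make a record of user interactions tamper-evident is to chain the entries together with hashes, so that altering any earlier entry invalidates everything recorded after it. The sketch below illustrates that general technique using only the Python standard library. It is an assumption-laden example, not a description of how any specific LIMS implements its audit trail; the `AuditTrail` class and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class AuditTrail:
    """Append-only log in which each entry stores the hash of the one before it."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the stored hash itself, in a stable key order.
        payload = {key: value for key, value in entry.items() if key != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def record(self, user, action, target):
        """Append one user interaction and return a copy of the stored entry."""
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "target": target,
            "prev_hash": prev_hash,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return dict(entry)

    def verify(self):
        """Return True only if no entry has been altered, removed or reordered."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(entry):
                return False
            prev_hash = entry["hash"]
        return True
```

A reviewer could call `verify()` before relying on the log: if someone edits an earlier entry, for example changing which user opened a file, the recomputed hash no longer matches the stored one and the check fails.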
As we are faced with the challenge of storing and handling sensitive data generated from NGS, it is becoming more and more important to build responsible data management into our practices. Digital solutions, such as cloud-based LIMS, enable easy data storage, allow teams to collaborate effortlessly and, at the same time, ensure that data stays secure. Incorporating reliable information management systems can future-proof NGS laboratories as technologies continue to evolve and data-heavy projects continue to rise in genomics.