Ardini et al. (2014) describe the challenges encountered and the solutions developed during a decade of biobanking for the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).¹ Covering the period from the inception of the central repository (CR) in 2003 through 2013, the authors discuss the main aspects of running a large-scale, multi-user biobank for medical research.
RTI International managed the NIDDK data repository from its inception in 2003 until 2013. Initially, the facility developed in response to the NIDDK’s data sharing policy, which requires grant-funding applicants to include specific measures giving other researchers access to the resources generated from the proposed studies. In this way, the NIDDK ensures that other researchers can benefit from resources without having to repeat data collection for samples already in existence. This facility, the CR, acts as an aggregated central resource for the NIDDK and now contains biosamples with associated data from more than 70 major multi-site clinical research projects. Furthermore, the CR is structured so that researchers can pool data across multiple studies and sites for analysis.
The initial concept envisaged by biobanking managers included setting up both public and private domains for security and searchability. In reality, the CR comprises only a private domain administered and accessible solely by repository staff. Data are submitted and stored in Statistical Analysis System (SAS) format and reside in secure archive warehouses on a private network that only staff can search. Researchers must develop their proposals in tandem with repository staff, who help establish feasibility before a proposal is submitted for acceptance. Once repository staff accept a proposal, they make relevant data and resources available to the applicant. Although information on data types is not available to applicants, the high degree of personalized attention in the form of assistance and involvement from repository staff increases the biobank’s perceived value. This is especially true for researchers at the beginning of their careers, who receive a level of guidance from CR staff not usually experienced elsewhere within biobanking. In terms of financial strategy, the biobank benefits from being able to bill for CR staff time.
The CR also maintains two Web portals: a public one, accessible to anyone, for general information and publication listings, and a private one, accessible only to registered users, which serves as an electronic information exchange. The Web portals run on RTI’s Oracle Application Server 10g, v. 10.1.2.0.2, and accept data in SAS format, although older records exist as PDFs or image files. As the resource developed, CR staff created a series of public query tools that allow users to interrogate the available databases, thus enhancing the facility’s value as an educational tool. The biobank works with researchers as it grows and adapts to a changing research landscape, and it also promotes itself to ensure continued use within the research community.
Ardini et al. identify five main challenges encountered by the biobank over its existence and the steps taken to counteract them:
- Lack of provenance: The CR houses historical study data and therefore has to deal with incomplete or inconsistent sample identification, data recording and consent gathering. Since this also frequently includes a lack of sample preparation and storage information, the value of the bioresource cannot be guaranteed.
- Lack of linkage: In addition to quality issues with historical biosamples, CR staff also find that files linking sample identity to patient identity are not available. To prevent this from happening in the future, staff now ensure adequate data recording and sample identification, with linkage to patient files, for all biosamples added to the collection.
- Data ownership: Since data sharing is a required part of study proposals, biorepository staff now take an active approach to ensuring that all researchers make information available in a timely manner.
- Consent issues: Standardized consent formats under development and review will ensure consistent recording in future studies.
- Sustainability: As the CR is a centralized resource, the issue of support is problematic, since no single institution takes on responsibility for funding. Although the bioresource charges for labor costs associated with biosample preparation, storage and retrieval, these do not cover the total expense. However, Ardini et al. consider that maintaining standards and strict quality controls does ensure continued value to the research community.
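The kind of intake check implied by the provenance and linkage points above could be sketched roughly as follows. This is a minimal illustration in Python, not the CR's actual system: the `SampleRecord` fields and the `intake_problems` helper are hypothetical names chosen for the example, and the real repository stores its data in SAS format.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical record layout; these field names are assumptions for
# illustration and do not reflect the CR's real schema.
@dataclass
class SampleRecord:
    sample_id: str
    participant_id: Optional[str]  # linkage to the participant/patient file
    consent_recorded: bool
    storage_info: Optional[str]    # preparation and storage notes


def intake_problems(record: SampleRecord) -> List[str]:
    """Return the provenance and linkage gaps found in one record."""
    problems = []
    if not record.sample_id.strip():
        problems.append("missing sample identifier")
    if not record.participant_id:
        problems.append("no linkage to participant file")
    if not record.consent_recorded:
        problems.append("consent not recorded")
    if not record.storage_info:
        problems.append("no preparation/storage information")
    return problems


# Example: one complete record and one historical record with gaps.
records = [
    SampleRecord("S-001", "P-17", True, "serum, stored at -80 C"),
    SampleRecord("S-002", None, True, None),
]
for rec in records:
    issues = intake_problems(rec)
    status = "ok" if not issues else "; ".join(issues)
    print(f"{rec.sample_id}: {status}")
```

Flagging gaps at intake, rather than rejecting records outright, reflects the situation described above: historical samples may still be worth keeping even when their documentation is incomplete, as long as the gaps are visible to later users.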
Ardini et al. consider that, despite these challenges, biobanking will continue to increase in value to the research community, especially in its ability to centralize resources for areas such as rare disease research. They also expect it to contribute greatly to personalized medicine and note that the National Institutes of Health recommends that biobanks take responsibility for conveying clinically relevant and actionable findings from genomic studies back to donors. In the future, the standard operating model must adapt to changes in research environments and technology in order to meet the requirements imposed by large-scale population biobanking and -omics approaches.
1. Ardini, M.A., et al. (2014) “Sample and data sharing: Observations from a central data repository,” Clinical Biochemistry, 47(4–5), pp. 252–257, doi:10.1016/j.clinbiochem.2013.11.014.