Biobanks play a critical role as providers of both biological samples and accompanying data for research. This is especially true for samples destined for –omics applications or epidemiological studies. To meet these research needs, biobanks must preserve the biomolecular composition that specimens had at the time of collection throughout both short-term and long-term storage. In addition, detailed documentation of potential covariates enhances the research value of biospecimens, allowing researchers to better interpret the resulting data.
Many variables can affect the stability of stored specimens, including (but not limited to) storage time, donor age and seasonal variations (e.g., ultraviolet radiation, sunlight and pollen levels). To investigate these covariates, Enroth et al. (2016) measured the abundance levels of 108 proteins in plasma collected from 106 Swedish women between 1988 and 2014 (via the Västerbotten Intervention Programme and the Mammography Screening Project), representing 380 separate collection events.1
The research team first assayed 132 proteins for relative abundance, then excluded those that fell below the detection limit (n = 14 proteins) and those that were replicated across two separate panels (n = 10 proteins), leaving 108 proteins for analysis. The researchers report significant correlation between the replicated assays, with the highest correlation for C-X-C motif chemokine ligand 11 (CXCL11) and the lowest correlation for Cluster of differentiation 40 protein (CD40).
In terms of storage time, the researchers reported a significant impact on 18 of the 108 assayed proteins. Of these, one protein (Cancer antigen 125, also known as CA-125 and Mucin 16) retained statistical significance after adjustment for testing multiple hypotheses. For these proteins, storage time alone explained 4.9% to 34.9% of observed variance.
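To illustrate why far fewer proteins remain significant after adjustment for multiple hypotheses, here is a minimal sketch using a Bonferroni correction. This summary does not state which correction method the study used, so Bonferroni is an assumption here, and the protein names and p-values are hypothetical values invented purely for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the names of tests that remain significant after a
    Bonferroni adjustment (assumed method, not confirmed by the source)."""
    n_tests = len(p_values)
    # Each individual test must beat alpha divided by the number of tests.
    threshold = alpha / n_tests
    return [name for name, p in p_values.items() if p < threshold]


# Hypothetical p-values, invented for illustration only.
example_p_values = {
    "protein_A": 0.0001,
    "protein_B": 0.02,
    "protein_C": 0.04,
    "protein_D": 0.30,
}

# Nominally significant: p below 0.05 with no adjustment.
nominal = [name for name, p in example_p_values.items() if p < 0.05]

# Significant after adjustment: p below 0.05 / 4 = 0.0125.
adjusted = bonferroni_significant(example_p_values)

print("Nominal: ", nominal)
print("Adjusted:", adjusted)
```

In this toy example, three proteins pass the unadjusted threshold but only one survives the adjustment, which mirrors the pattern reported for storage time.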
Donor age impacted 70 of the 108 assayed proteins. Of these, 45 proteins retained statistical significance after correction for multiple hypotheses, and 16 proteins overlapped with the protein list compiled in the storage time experiment. Donor age alone explained 1.1% to 33.5% of observed variance. The direction of the effect varied; for instance, Interleukin 27 subunit alpha (IL-27A) increased in abundance with chronological age, while kit ligand (stem cell factor) decreased.
To assess the impact of season on the specimens, the research team grouped the samples by season and compared meteorological data with the protein abundance levels from the assay. They performed six tests per protein, comparing each season with each of the others. They found significant differences between at least two seasons for 15 of the 108 assayed proteins. Of these, 12 differentially expressed proteins were associated with winter or summer collection. The meteorological data indicated a significant difference in sunlight hours between these two seasons (an average of 60 sunlight hours per month in winter versus 260 in summer). Overall, the researchers report significant correlation between sunlight hours and protein levels for 36 assayed proteins, with sunlight hours explaining up to 4.5% of observed variance.
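The figure of six tests per protein follows from comparing every pair of seasons. The short sketch below enumerates those pairs, assuming the conventional four seasons (the season labels are an assumption; the summary above does not list them), and derives the total number of seasonal comparisons across the 108 assayed proteins.

```python
from itertools import combinations

# Assumed season labels; the source only states that samples were grouped by season.
SEASONS = ["winter", "spring", "summer", "autumn"]

# Every unordered pair of distinct seasons: 4 choose 2 = 6 comparisons.
season_pairs = list(combinations(SEASONS, 2))

n_proteins = 108  # assayed proteins reported in the study
total_tests = n_proteins * len(season_pairs)

for first, second in season_pairs:
    print(f"{first} vs {second}")
print("Tests per protein:", len(season_pairs))
print("Total seasonal tests:", total_tests)
```

With this many comparisons, some nominally significant results are expected by chance alone, which is one reason adjustment for multiple hypotheses matters when interpreting seasonal effects.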
Enroth et al. offer these data as evidence of the impact of three specific variables (storage time, donor age and season) on plasma protein levels in biobanked samples. They indicate that both the data accompanying samples and epidemiological studies should include natural covariates like those described here.
1. Enroth, S., et al. (2016) “Effects of long-term storage time and original sampling month on biobank plasma protein concentrations,” EBioMedicine, 12 (pp. 309–314), doi: 10.1016/j.ebiom.2016.08.038.