Large longitudinal studies have become impractical on many levels. At well over $100 per patient enrolled, the traditional large cohort study has become cost-prohibitive. Nor is cost the only factor: another is time. By their very nature, cohort studies can take decades to yield benefits, at both the individual and population levels.1 Yet these studies are still needed, both for public health research and to support the implementation of personalized medicine.
The research community as a whole has accumulated a wealth of phenotypic, genotypic, and outcome data as well as a substantial inventory of biospecimens that are linked to the data. We need to ensure that we make good use of what already exists, to reduce the time and resources needed to get new therapies into clinical practice.
This is easy enough to state, but very difficult to execute. Many multi-institutional research networks have existed for decades, but they are typically decentralized and struggle with internal data silos, data harmonization, and administrative issues.
More Data, and More Available Data
The good news is that data silos are beginning to crack open, both through innovative technology and through institutional efforts. As just one example, the California Teachers Study (CTS), a collaboration of four California institutions (City of Hope, the University of Southern California, the University of California, Irvine, and the Cancer Prevention Institute of California), is using mobile devices and cloud-based technology to eliminate data silos and dramatically cut the time and cost of managing CTS data, as well as the time needed to access and use the data and associated biospecimens.
The National Institutes of Health, drawing on lessons learned from the UK Biobank and the National Children’s Study, is beginning to apply a “collect and use what we already have” approach.1 This is evident in the recently launched Precision Medicine Initiative, which is taking advantage of existing health care providers for enrollment, data collection, and biospecimen collection, with the goal of recruiting a million participants in a short time. Similarly, the Environmental Influences on Child Health Outcomes (ECHO) Study, which partially replaces the National Children’s Study, is seeking to create “synthetic” cohorts by encouraging the formation of consortia for pooling data.
Sharing data is rarely simple, particularly when bringing together multiple institutions. When pooling specimen collections, it is especially important to limit pre-analytical variability to ensure sample quality. With standardization and well-annotated samples, research made possible by pooled data can yield rich associations and avenues for future research. Can we do the same for biospecimen-based research?
The Scarcity Factor
Pooling of biospecimen resources and lowering barriers to access is lagging behind data sharing. There is a good reason: data can be endlessly copied and shared, while biospecimens, like archaeological sites, are destroyed in the process of being studied, so scarcity has to be considered. Some biospecimens are rare or even irreplaceable. But are we erring too far on the side of caution? Compared with chemical compound research inventories, it can be argued that biospecimen collections are, if anything, severely underutilized.
Even for sample collections where scarcity is clearly not an issue, gaining access can require a significant investment of time and money, the least of which is creating an online account. Acquiring samples that are already banked and annotated with data typically requires:
- submitting a research proposal,
- signing / submitting material transfer agreements,
- submitting agreements about intellectual property rights,
- addressing consent/re-consent requirements, and
- handling a host of other administrative matters.
Without corresponding biospecimens, much of the value of cohort study data is wasted, as there is no molecular data to bridge the gap from statistical association to clinical application. Specimens enable searches for biomarkers, measurement of exposures and intermediate phenotypes, study of the exposome, and other analyses not even imagined at the time the data and samples were collected.
Large cohort studies may be impractical, but they are not yet obsolete. Faster, easier pooling of data allows efficient use of resources for research. Pooling of biospecimen collections will likewise allow biomarker discovery, diagnostics, and new therapies to reach the clinic faster.
We recently interviewed Clive Green, Director and Head of Sample Management at AstraZeneca, about the challenges in the biospecimen supply chain, from both a small molecule and biospecimen perspective. To read more, download our eBook Maximizing the Value of Biospecimens to Deliver New Therapies.
1. Manolio, T. A., Weis, B. K., Cowie, C. C., Hoover, R. N., & Hudson, K. (2012). New models for large prospective studies: Is there a better way? American Journal of Epidemiology, March 12, pp. 1–8. DOI: 10.1093/aje/kwr453.