Maximizing the Value of Longitudinal Studies, Part II

Posted by Debra A. Barnes, Ph.D. on Apr 8, 2014 12:30:00 PM

Click here if you did not read part I of this blog post. In part I, I discussed the need for epidemiological studies to include biospecimen collection, to help make possible the search for biomarkers and the development of new diagnostics. In part II, I would like to discuss the opposite problem: biospecimens that lack associated epidemiological data.

Here is food for thought: Diamandis1 noted in a 2010 article in the Journal of the National Cancer Institute that no new cancer biomarkers had been approved for clinical use in 25 years. The discovery of new biomarkers and the onset of personalized medicine have been very uneven, particularly where cancer is concerned. Type 2 diabetes is another area where progress in biomarker discovery, diagnostics, and new therapies has been slow.

In August 2013, the National Cancer Institute's Epidemiology and Genomics Research Program (EGRP) sponsored a two-day workshop on making optimal use of existing biospecimens for discovery and/or validation of biomarkers that would allow for early detection of cancer, when the disease is most treatable. One major problem highlighted during the workshop was that many existing biospecimens, even high-quality biospecimens, are less than ideal for biomarker discovery or validation. This is often because of missing or poor-quality data resulting from inadequate study design, such as mismatches between cases and controls.

Biomarker discovery requires more than high-quality samples: without proper clinical annotation, including clinical outcome, treatment regimen, and other data from the sample donors, many biomarkers cannot be identified or may be missed. Particularly in oncology, the validation of new biomarkers often requires retrospective longitudinal case-control studies. These allow comparison of the targeted biomarkers in samples from patients and matched controls, and the use of statistical tools (such as receiver-operating characteristic curves) to determine critical relationships such as the time span before diagnosis, the detection rate, and the false-positive rate. In these instances, a well-annotated collection of biorepository samples is a goldmine of discoveries waiting to be uncovered.
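To make the receiver-operating characteristic (ROC) idea concrete, here is a minimal sketch of how marker measurements from cases and matched controls can be turned into an ROC curve and an area-under-curve (AUC) summary. The marker values, labels, and helper functions (`roc_points`, `auc`) are invented for illustration and do not come from any study discussed here.

```python
# Hypothetical sketch: summarizing a retrospective case-control biomarker
# comparison with an ROC curve. All data below are invented for illustration.

def roc_points(values, labels):
    """Sweep a decision threshold over the marker values and record the
    (false positive rate, true positive rate) pair at each cut-off."""
    pairs = sorted(zip(values, labels), reverse=True)  # highest marker first
    pos = sum(labels)              # number of cases
    neg = len(labels) - pos        # number of controls
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _value, label in pairs:
        if label:
            tp += 1                # a case falls above the threshold
        else:
            fp += 1                # a control falls above the threshold
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve, computed by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Invented marker concentrations: 4 cases (label 1) and 4 matched controls (label 0)
marker = [9.1, 7.4, 6.8, 5.0, 4.2, 3.9, 2.5, 1.1]
status = [1,   1,   0,   1,   0,   1,   0,   0]

curve = roc_points(marker, status)
print(f"AUC = {auc(curve):.2f}")  # 0.5 would be chance; 1.0 perfect separation
```

The same sweep over thresholds is what lets investigators read off a detection rate at an acceptable false-positive rate; with serial samples, repeating it at different times before diagnosis shows how early the marker becomes informative.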

For instance, the Cancer Genome Atlas (TCGA) project has focused on anonymized tumor samples with no available epidemiological, exposure, or related information that can be compared or correlated with the biological samples. On the other hand, the TCGA project has also shown that DNA/RNA sequencing can be done on formalin-fixed, paraffin-embedded (FFPE) tumor tissue, which means that many existing samples may have the needed epidemiological data available and can be used in new avenues of research.

The NCI and other institutes within the NIH have sponsored numerous cohort studies and clinical trials that included high-quality, well-annotated specimens, including serial blood draws and collection of outcome data over time. These studies have been used to assess biomarker validity; however, much larger numbers and disease-free populations are needed for further studies. Given the expense and time needed for collection of data and blood samples, a better alternative would be revisiting existing studies and leveraging other currently existing sources of cohort data.


How do we promote data sharing? Data sharing is recommended across the board and has been the focus of multiple high-level forums, including the NCI's Epidemiology and Genomics Research Program (EGRP) 2013 workshop, Utilizing Existing Clinical and Population Biospecimen Resources for Discovery or Validation of Markers for Early Cancer Detection; the EGRP's 2012 workshop, Trends in 21st Century Epidemiology: From Scientific Discoveries to Population Health Impact; the National Academy of Sciences' 2011 report, Toward Precision Medicine; and the 2010 NIH symposium, New Models for Large Prospective Studies.

The latter publication noted that “Despite a massive increase in the amount of genomic and molecular information available over the past decade, the number of effective new therapies developed each year has remained stable.” Recent innovations in genetics, molecular biology, and information technology have not yet significantly changed the practice of medicine or the development of new therapies. We need to find institutional, regulatory, and business mechanisms for sharing and using the data and specimens already available.

A summary of the 2010 NIH symposium is available as a meeting report (Meeting Report.pdf), and a summary of the 2012 workshop on the future of epidemiology is also available. A summary of the 2013 workshop on the use of existing biospecimens is not yet available, but preliminary information has been posted online.

It is increasingly important in biomarker research to ensure that samples are handled in the right way for downstream analysis. Click below to download our illustrated guide on how sample storage temperature conditions should be optimized based on the end use.


Download InfoPoster