Smart Biobanking: From Samples to Predictive Algorithms for Detecting Cancer

Posted by Julie Barnes, Ph.D. on Jul 24, 2014 9:30:00 AM

The recently released World Cancer Report 2014, a global analysis by the World Health Organization, noted both the increase in cancer cases worldwide and the burden represented by the spiraling cost of treating late-stage disease. The report, as stated in the preface, makes it clear that we cannot “treat our way out of cancer.”


The most effective approach to addressing cancer is prevention; when treatment is necessary, it is best delivered at an early stage, which is associated with much higher survival rates and fewer side effects. Ovarian cancer is a case in point: five-year survival among women diagnosed with stage 4 disease is 5.6 percent, while among those diagnosed and treated at stage 1 it is as high as 92 percent. However, early treatment depends on early diagnosis, and in spite of major investments of both resources and effort, there are still few biomarkers available for diagnosing cancer at a very early stage, when the disease is far more treatable.


This is the story of 20 years of clinical research that has led to the development of a well-validated screening test for ovarian cancer, known as ROCA (the Risk of Ovarian Cancer Algorithm), that successfully identifies early-stage disease. Just as significant, this is the story of the establishment of a unique biobank of serum that is now being used to discover and validate additional biomarkers for screening and early diagnosis of other cancers.

The Beginning: Understanding CA-125 in Healthy Women
Carbohydrate Antigen 125 (CA-125) is a well-established biomarker traditionally used to aid diagnosis in women presenting to their doctor with symptoms suggestive of ovarian cancer. In such a setting, levels of CA-125 above an accepted threshold of 30 or 35U/ml have been considered abnormal and a signal for the clinician to investigate further. However, it is now well understood that using CA-125 as a simple threshold test is limited by its low specificity. CA-125 levels above 35U/ml occur not only in people with cancer, but also in people with no health problems and in patients with relatively benign conditions, such as endometriosis. Thus, relying on elevated CA-125 alone results in hundreds of false positives for every one individual accurately diagnosed with cancer.
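The arithmetic behind that last point can be sketched in a few lines of Python. This is a minimal illustration only: the prevalence, sensitivity and specificity values below are hypothetical placeholders chosen to show the effect, not figures from the studies described here, and the function names are invented for this example.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of positive test results that are true cancers."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def false_positives_per_true_positive(prevalence, sensitivity, specificity):
    """How many healthy women are flagged for each cancer correctly detected."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return false_pos / true_pos


if __name__ == "__main__":
    # Hypothetical inputs (assumptions, not study data): a rare disease
    # screened with a moderately specific versus a highly specific test.
    prevalence = 0.0004   # assumed: 1 case per 2,500 women screened
    sensitivity = 0.80    # assumed
    for specificity in (0.90, 0.998):
        ratio = false_positives_per_true_positive(prevalence, sensitivity, specificity)
        ppv = positive_predictive_value(prevalence, sensitivity, specificity)
        print(f"specificity={specificity}: {ratio:.1f} false positives per true positive, PPV={ppv:.4f}")
```

Because the disease is rare, even a small false-positive rate applied to a very large healthy population swamps the true positives, which is why specificity dominates the usefulness of a screening test.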
For many years, groups around the world have investigated the possibility that CA-125 could be used for screening healthy women for early detection. One of these investigators, Professor Ian Jacobs, then a Research Fellow at the Department of Obstetrics and Gynecology at Cambridge University, set out in the early 1990s to determine whether regular testing for CA-125 could be used more specifically in early diagnosis, by assessing change over time and by combining it with other potential tests. In his first UK study, he recruited a cohort of 22,000 healthy women and assessed CA-125 combined with trans-vaginal ultrasound (TVU) as an annual multi-modal approach to detecting early ovarian cancer. This study revealed that such combination screening could identify ovarian cancer with a high specificity of 99.8 percent, thus reducing the number of false-positive surgical procedures performed to only five. A second study of 5,550 women in Sweden yielded similar results, reducing the number of false-positive surgeries to two. Most significantly, later analyses of the data showed that survival among women diagnosed earlier using both CA-125 and TVU increased from 41.8 months to 72.9 months.

The Meeting of Minds: Clinical Insight and Statistical Power
As the team worked with this clinical data set to consider how to improve sensitivity, another, perhaps more fortuitous, result revealed itself. Jacobs was working with Steven J. Skates, a biostatistician affiliated with Harvard Medical School and Massachusetts General Hospital (MGH) in Boston. Dr. Skates noticed patterns in the data, often referred to as “subject-specific temporal behavior.” That is, women without ovarian cancer had a flat CA-125 profile, consisting of a baseline level individual to each woman around which her CA-125 levels fluctuated to a minor extent. Across the 22,000 women studied, this baseline level could be well above or well below 35U/ml; the critical issue was that it did not change significantly over time. In contrast, women who later developed ovarian cancer showed a sharp increase in CA-125 from their original baseline (regardless of whether the initial baseline value was high or low) that significantly exceeded normal background fluctuations. This rapid rise from baseline is called a change-point CA-125 profile. More significantly, this characteristic pattern of change in CA-125 levels occurred early in the course of the disease.

The Invention of the Risk of Ovarian Cancer Algorithm (ROCA)
Based on the patterns observed over time, Skates proposed using CA-125 levels in a different way. Rather than looking at a single threshold level, he proposed calculating a patient’s risk of cancer based on a series of CA-125 measurements and their fluctuations. Skates applied statistical modeling to the data, creating separate profiles of CA-125 levels among the women who developed ovarian cancer and those who did not. The analyses resulted in a hierarchical change-point model that became known as ROCA, the Risk of Ovarian Cancer Algorithm.
ROCA compares a woman’s individual blood profile of CA-125 over time with the longitudinal CA-125 profiles of thousands of other women who developed ovarian cancer. The probability of ovarian cancer increases the closer a woman’s profile is to the change-point profiles seen in women with ovarian cancer, compared with the flat profiles for healthy women. After every additional CA-125 measurement, a woman’s ROCA score is recalculated, and clinical recommendations are provided (i.e., no action needed, screening at shorter intervals, or referral for TVU). This systematic method has proven both more accurate and more efficient than using a single cut-off level of CA-125 because it maximizes sensitivity for any level of specificity. ROCA both “distributes” screening in a stepwise fashion according to probability of disease and identifies women who may have developed ovarian cancer at a very early stage of the disease.
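To make the idea of comparing a flat profile with a change-point profile concrete, here is a deliberately simplified Python sketch. It is not ROCA itself: the real algorithm is a hierarchical change-point model fitted to thousands of longitudinal profiles, whereas this toy version simply asks whether one woman's serial measurements are better explained by a constant baseline or by a baseline followed by a rise. The function names, the prior probability, the noise level `sigma`, and the sample values are all assumptions made for illustration.

```python
import math


def flat_rss(values):
    """Residual sum of squares around a constant baseline (the flat profile)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def changepoint_rss(times, values, k):
    """RSS for a profile that is flat for the first k points, then rises linearly.

    Simplification: the baseline is the mean of the first k points, and the
    slope is a least-squares fit to the remaining points, forced to be >= 0.
    Assumes the times are strictly increasing.
    """
    baseline = sum(values[:k]) / k
    rss = sum((v - baseline) ** 2 for v in values[:k])
    t0 = times[k - 1]
    xs = [t - t0 for t in times[k:]]
    ys = [v - baseline for v in values[k:]]
    slope = max(0.0, sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs))
    return rss + sum((y - slope * x) ** 2 for x, y in zip(xs, ys))


def changepoint_risk(times, ca125, prior=0.001, sigma=0.2):
    """Toy risk score: prior odds updated by a flat-vs-change-point likelihood ratio.

    Works on log CA-125 with an assumed Gaussian noise level sigma.
    Both prior and sigma are hypothetical values, not ROCA parameters.
    """
    values = [math.log(c) for c in ca125]
    rss_flat = flat_rss(values)
    best = rss_flat
    # Try each candidate change point, keeping >= 2 points on either side.
    for k in range(2, len(values) - 1):
        best = min(best, changepoint_rss(times, values, k))
    # Log-likelihood ratio, capped to avoid overflow in exp().
    llr = min((rss_flat - best) / (2 * sigma ** 2), 50.0)
    odds = prior / (1 - prior) * math.exp(llr)
    return odds / (1 + odds)


if __name__ == "__main__":
    years = [0, 1, 2, 3, 4, 5]
    # Made-up annual CA-125 values in U/ml, for illustration only.
    print("low flat baseline: ", changepoint_risk(years, [12, 13, 11, 12, 13, 12]))
    print("high flat baseline:", changepoint_risk(years, [60, 62, 58, 61, 60, 59]))
    print("rise from baseline:", changepoint_risk(years, [12, 13, 11, 12, 20, 38]))
```

In this sketch a woman whose values sit steadily above 35U/ml keeps a low score, while a rise from a low baseline produces a high score even if most readings stay under the traditional threshold. The score can also be recomputed each time a new measurement is appended, mirroring the recalculation step described above.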
One other critical consideration in developing your next biomarker is the integrity of the samples used in your studies. Whether the measurement is as simple as the pH of a serum sample or as specific as the quaternary structure of a protein complex, any biomarker can be affected and potentially compromised by a number of external factors, most readily temperature. This was discussed by Dr. Debra Barnes in her blog post, "Know Your Samples: How Resilient are Your Biomarkers".

Read the rest of the story in the eBook, "Smart Biobanking: From Samples to Predictive Algorithms for Detecting Cancer".