In July 2007, Dr. Roland Nardone, Professor Emeritus of the Discovery Center for Cell and Molecular Biology of the Catholic University of America, and 18 of his colleagues penned a powerful letter on the misidentification and cross-contamination of cell lines to Michael O. Leavitt, then Secretary of Health and Human Services. The 18 researchers who added their signatures to the letter represented both extensive experience in cell culture and a cross-section of governmental, corporate, and academic institutions in the US and the UK. In his letter, Nardone informed Secretary Leavitt of the extent of misidentification of cell lines being distributed amongst scientists and warned of the “resulting corruption of biomedical research” and “grave public health consequences.” He urged both an educational initiative about misidentified cell lines and a specific mechanism by which funding agencies would ensure the authentication of cell lines: no authentication, no funding.
Seven years after this initial call for action, many journals now require a confirmation that the cell lines used in research submissions were authenticated. There are, however, still no mandates requiring the practice from the major biomedical research funding agencies.
A cell line is, by definition, a population of cells derived from a single cell. Inherent in this definition is the fact that all cells within a cell line have the same genetic makeup. Misidentification of a cell line means that the researcher is working with cells having a different genetic origin than what is needed! This can happen in a number of ways, from poor culture technique, to simple human error when labeling, to miscommunication when exchanging vials/flasks between colleagues. Cell lines are particularly susceptible to this because they are self-propagating and therefore can easily be shared amongst colleagues.
If your biomedical research depends on a specific cell type, a misidentified cell line can be catastrophic to your results. Your time and funding are wasted, your career may be at stake, and most importantly, the development of a life-altering technology can be derailed.
The first key step to protect your research is to obtain a Certificate of Analysis that verifies the cell identity. The second step is to prepare a master stock using dedicated media and growth supplements, and work only in a clean cell culture hood. These two simple steps, coupled with the documentation principles of GxP, are critical to correct identification of newly obtained cell lines. For cell lines that are already in your LN2 tanks, verification is simple and inexpensive. Fluorescence PCR-based short tandem repeat (STR) analysis in combination with capillary electrophoresis can be used (this process is well described in an American National Standards Institute (ANSI) document). The true power of this method rests on decades of forensic science studies that use 18 different STR loci for human identity analysis. The data from these studies show mutation rates from 0.01 to 0.28 percent for the loci now used for cell line authentication (http://www.cstl.nist.gov/strbase/mutation.htm). However, the mutation rates at these loci for cells in culture are likely much lower, since cultured cells do not undergo meiosis, during which STRs are subject to a greater rate of mutation [2]. Numerous commercial entities provide this service inexpensively for cell lines of human origin, and databases are available for laboratories opting to conduct the analysis in-house.
Awareness has been raised, education is increasing, and a standard has been issued. The field is certainly moving towards reducing the problem of critical biomedical research being conducted on misidentified cells. However, the research community is far from free of that problem, for many reasons. Some critical aspects of the analysis to be aware of:
- Knowing the number of STRs at the standard loci is meaningless if you don’t know how many there should be. To authenticate a cell line, there must be a parental cell to compare against. If you are using the most commonly used cell lines, it is likely that they have been typed and entered into several databases for reference. However, be warned: the databases tend to be cumbersome and quirky and can give inaccurate results. For example, entering an STR profile that contains a homozygous locus can yield two different results depending on how the homozygous locus is entered (see Figure 1). The convention seems to be to enter the number of repeats from a homozygous locus once, rather than twice separated by a comma, which is how the two repeat numbers from a heterozygous locus are entered. But, without clear direction, the untrained or less experienced end user cannot know that their results may not be accurate.
Figure 1. In this example, the TPOX locus yielded only one peak at 8 repeats. When the TPOX homozygous result is entered as "8,8" rather than just "8", the top 5 results returned are different. Also in this example, the data very likely represent a mixture of cells, or a 'contaminated' cell line. However, there is no alert to this fact if the end user is not familiar with the assay.
- The database algorithms only compare the submitted sample against individual samples within the database; the analysis does not explicitly report whether the data are consistent with a mixed population. If the test sample is contaminated and contains two cell types, the database analysis will not alert the end user to that possibility. End users are simply instructed that if their cell line is authentic, a match of ≥80% will be achieved. However, one can achieve an 80% match with the expected cell type and still have a contaminated cell line. See Figure 1 for an example of how the results do not alert the end user to a mixed population of cells.
- Most scientists will likely want to review the raw data provided by their core facility or by the vendor supplying the service. This can generate many more questions, since the raw data do not always appear to correlate with the results output. There are numerous reasons for ambiguity in the raw data, including pull-up peaks, stutter peaks, and general noise (see the electropherogram in Figure 2 for an example).
Figure 2. Peaks observed by the casual observer are not called by the software used here, while other minor peaks are called.
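To make the first two pitfalls above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm of any particular database: the score is a Tanabe-style percent match (twice the shared alleles divided by the total alleles in both profiles), the function names are hypothetical, and the profiles are invented for the example rather than taken from a real cell line. It shows how entering a homozygous locus as "8,8" instead of "8" can change the score when alleles are counted literally, and how a profile with extra alleles can still clear the 80% threshold.

```python
from collections import Counter

def parse_locus(entry, collapse_homozygous=True):
    """Split a locus entry such as "6,9.3" into a list of allele strings."""
    alleles = [a.strip() for a in entry.split(",") if a.strip()]
    if collapse_homozygous:
        # "8,8" and "8" are treated as the same single allele call.
        alleles = sorted(set(alleles))
    return alleles

def tanabe_score(query, reference, collapse_homozygous=True):
    """Tanabe-style percent match over the loci present in both profiles.

    Assumption: score = 100 * 2 * shared / (query alleles + reference alleles).
    Real databases may use a different formula.
    """
    shared = total = 0
    for locus in query.keys() & reference.keys():
        q = Counter(parse_locus(query[locus], collapse_homozygous))
        r = Counter(parse_locus(reference[locus], collapse_homozygous))
        shared += sum((q & r).values())            # alleles found in both
        total += sum(q.values()) + sum(r.values())
    return 100.0 * 2 * shared / total if total else 0.0

def loci_with_extra_alleles(profile, max_alleles=2):
    """Return loci with more distinct alleles than a single diploid source explains."""
    return sorted(
        locus for locus, entry in profile.items()
        if len(set(parse_locus(entry))) > max_alleles
    )

# Hypothetical profiles, invented for illustration only.
reference = {"TPOX": "8", "TH01": "6,9.3", "D5S818": "11,12", "CSF1PO": "10,12"}
query_b = {"TPOX": "8,8", "TH01": "6,9.3", "D5S818": "11,12", "CSF1PO": "10,12"}
mixed = {"TPOX": "8", "TH01": "6,7,9.3", "D5S818": "11,12,13", "CSF1PO": "10,12"}

print(tanabe_score(query_b, reference))                             # homozygous entry collapsed
print(tanabe_score(query_b, reference, collapse_homozygous=False))  # "8,8" counted as two alleles
print(tanabe_score(mixed, reference), loci_with_extra_alleles(mixed))
```

In this sketch the same sample scores differently depending only on how the homozygous TPOX call is typed in, and the `mixed` profile scores above 80% against the reference even though two loci carry a third allele. Extra alleles at several loci are a reason to inspect the electropherogram, not proof of contamination, since some cell lines carry more than two alleles at a locus on their own.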
Several of my colleagues have discussed the importance of sample integrity (Dr. Barnes on freeze/thaw cycles and Abdul Ally on the journey of a sample in a biobank lab), but if you have not verified the identity of your cell lines, you are jeopardizing your research and everything that depends on it.
Nardone’s letter to Secretary Leavitt can be read at: https://www.lgcstandards.com/WebRoot/Store/Shops/LGC/MediaGallery/BLS/CLA_open_letter.pdf
1. Brinkmann B, Klintschar M, Neuhuber F, Hühne J, Rolf B. Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am J Hum Genet. 1998 Jun;62(6):1408-15. PubMed PMID: 9585597; PubMed Central PMCID: PMC1377148.