Lab Manager | Run Your Lab Like a Business

The Cell Line Identity Crisis

Estimates of irreproducibility in published scientific studies range from about 50 percent to 90 percent

Angelo DePalma, PhD

Angelo DePalma is a freelance writer living in Newton, New Jersey.


Short Tandem Repeats to the Rescue

Much has been written recently about “fake news,” and sadly, big science faces its own brand of fakeness. Estimates of irreproducibility in published scientific studies (“fake science,” if you will) range from about 50 percent to 90 percent. Writing in PLOS Biology, Leonard Freedman, PhD, president of the Global Biological Standards Institute (Washington, DC), estimated that U.S. research institutions waste $28 billion per year on nonreproducible preclinical research, with the total reaching $60 billion worldwide.

Irreproducibility results from cumulative errors associated with biological reagents, reference materials, study designs, laboratory protocols, and data analysis.

“There is no single magic bullet fix for this situation,” Freedman tells Lab Manager, adding that scientists must “own up” to the problem.

Misidentified and contaminated cell lines contribute significantly to irreproducible science. For decades, scientists relied on immortalized cancer cell lines for cell-based tests of drugs and other substances. As these cell lines expand and grow through succeeding generations, many undergo spontaneous genetic changes that affect their consistency and reproducibility. Sharing of cell lines among labs is widespread. “Contamination and misidentification errors therefore occur frequently, and persist for years or decades,” Freedman laments.

Thus, more than one-third of research cell lines are contaminated, misidentified, or overgrown with other cells. Whenever a rapidly dividing line finds its way into a slow-growing culture, it takes just a few weeks to completely overtake the “advertised” cells.

Freedman argues that despite the varied contributions to irreproducibility from factors that are difficult to control, cell line authentication (CLA) is a relatively easy, inexpensive operation that assures, with high confidence, that the cells a researcher believes she is working with are actually those cells.

The analysis of short tandem repeats (STRs), also called short polymorphic DNA sequences or DNA microsatellites, has become the method of choice for human cell line authentication. The technique, commonly known as DNA fingerprinting, was borrowed from forensics. Its low cost, precision, and ease of execution make it ideal for guaranteeing the authenticity of human research cells.

STR analysis examines short units of two to seven base pairs and counts how many times each unit repeats at a given locus, typically up to about a dozen times. The combination of repeat counts across loci signifies a cell line’s uniqueness. The resulting profiles are compared against database entries to identify individuals or phenotypes.
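To make the idea of a repeat count concrete, here is a minimal Python sketch that finds the longest uninterrupted run of a repeat unit in a DNA string. The function name, the "GATA" motif, and the locus sequence are hypothetical, chosen only for illustration. Commercial STR assays typically infer repeat counts from the sizes of PCR-amplified fragments rather than by searching sequence text, so this shows the concept, not the laboratory method.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # Match one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical locus: made-up flanking sequence around nine copies of "GATA".
locus = "TTCAG" + "GATA" * 9 + "CCTGA"
repeat_count = longest_tandem_run(locus, "GATA")
print(repeat_count)
```

A cell line's STR profile is the set of such counts, one or two alleles per locus, gathered across all loci tested.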

STR analysis is available commercially only for human cell lines because human lines are the only ones for which a reference database exists. “But in principle you could check authenticity over many generations for any cell type by creating your own database,” Freedman adds.


STR analysis is easy to run, but with commercial services so inexpensive, it does not make sense for most labs to gear up to run their own STR assays, and it makes even less sense for older techniques like isoenzymology, karyotyping, and cytogenetic analysis. Single-nucleotide polymorphism (SNP) analysis also works and may be even more precise than STR assays. “But there are no commercial services for SNP, and unless your lab is set up to do SNP you’re probably better off sending samples out for STR,” says Freedman. Whole genome sequencing is another technique that may find utility one day, but it cannot compete on the basis of cost with STR.

Greater statistical relevance

Many real and imagined shortcomings of conventional STR analysis may be overcome by analyzing more loci. In 2012, the American Type Culture Collection (ATCC) workgroup recommended using at least eight STR loci for CLA, a recommendation that STR reagent vendors follow. For example, Promega (Madison, WI) has long offered the GenePrint® 10 system, for ten loci. The company recently launched GenePrint® 24, specifically for mixed sample analysis. With 24 loci, the product offers significantly higher discrimination for authenticating a sample’s identity.

With GenePrint 10, the likelihood of two individuals (excluding identical twins) having the same genotype is 2.9 × 10⁻⁹. With GenePrint 24, the likelihood is 6.6 × 10⁻²⁹.
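The steep drop between ten and 24 loci follows from multiplication: if loci are treated as independent, the chance that two unrelated individuals share an entire profile is roughly the product of the per-locus genotype frequencies. The Python sketch below illustrates this with a made-up frequency of 0.1 at every locus. The function name and the frequency are assumptions for illustration only; real figures such as those quoted above depend on measured population allele frequencies, which are not reproduced here.

```python
import math


def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies, assuming the loci are independent."""
    return math.prod(genotype_frequencies)


# Hypothetical: every locus has a genotype shared by 1 person in 10.
ASSUMED_FREQUENCY = 0.1

for n_loci in (10, 24):
    rmp = random_match_probability([ASSUMED_FREQUENCY] * n_loci)
    print(f"{n_loci} loci: {rmp:.1e}")
```

Each additional locus multiplies the match probability by another small number, which is why adding loci improves discrimination so quickly.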

Doug Storts, PhD, head of research at Promega, notes that cultured human cells are typically unstable, as exemplified by changes in chromosome copy number (chromosome duplications and loss of heterozygosity) and other factors like small deletions, insertions, and point mutations. Cell line instability may lead to deletion/mutation events that cause loss of some of the STR markers, thereby lowering the probability of identity value. Higher numbers of markers increase the likelihood that you will uniquely identify the cell line despite the loss of markers.

“To accommodate this inherent instability, the standard recommendation allows variability at 20 percent of the alleles. For these reasons, and the increased number of new cell lines being derived, there is merit to achieving greater discrimination power by using a system with more loci,” says Storts.
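A tolerance of this kind is usually applied as a percent-match score between a query profile and a reference profile. The Python sketch below uses one commonly cited scoring scheme: twice the number of shared alleles divided by the total number of alleles in both profiles. The choice of formula, the 80 percent cutoff (taken here as the complement of the 20 percent variability Storts mentions), and the allele values are assumptions for illustration. The locus names are real STR markers, but the profiles are invented.

```python
def percent_match(query, reference):
    """Score two STR profiles as 2 * shared alleles / total alleles, in percent.

    Each profile maps a locus name to the set of alleles observed there.
    Only loci present in both profiles are compared.
    """
    loci = query.keys() & reference.keys()
    shared = sum(len(query[locus] & reference[locus]) for locus in loci)
    total = sum(len(query[locus]) + len(reference[locus]) for locus in loci)
    return 100.0 * 2 * shared / total if total else 0.0


# Hypothetical profiles; the query has lost allele 12 at D5S818.
reference = {"TH01": {6, 9.3}, "D5S818": {11, 12}, "vWA": {16, 18}, "TPOX": {8}}
query = {"TH01": {6, 9.3}, "D5S818": {11}, "vWA": {16, 18}, "TPOX": {8}}

MATCH_THRESHOLD = 80.0  # assumed cutoff: 100 percent minus 20 percent variability
score = percent_match(query, reference)
print(f"{score:.1f} percent match, same line: {score >= MATCH_THRESHOLD}")
```

In this toy example the loss of a single allele still leaves the query well above the assumed cutoff, which is the kind of drift the tolerance is meant to absorb.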


The pathway to novel, authenticated cell lines does not always go through STR, particularly when the purpose of authentication is to establish a new cell line. In late 2016, Cellaria (Cambridge, MA) introduced two new cell models for ovarian cancer and one for breast cancer. Both ovarian cancer lines were derived from patients—one with aggressive ovarian cancer, the other with endometrioid ovarian cancer.

Cellaria used SNP genotyping to evaluate the cells’ genomic stability, finding concordance of up to 98 percent with the patients’ tumors. With each batch of cells, Cellaria provides a certificate of analysis that includes the expected growth rate, clinical history, and quality-test results, including STR profiling for cell line authentication.

“With STR, you typically measure no more than 16 positions in the genome to achieve a unique combination or fingerprint,” says David Deems, Cellaria’s president. “We offer this test to our customers and use it ourselves for routine, lot-specific cell line authentication.”

Compared with STR, SNP genotyping samples the genome at a much higher density: in Cellaria’s case, at around 100,000 locations in the tumor sample, and again in the resulting cell line. “We used this approach to assess whether the cell lines exhibited the high degree of genomic instability that plagues many cells when passaged repeatedly in culture,” says Deems. “While not 100 percent concordant, the high degree of concordance we see between the original tissue and the cell line demonstrates that our culture conditions are a step in the right direction. As a point of reference for this assay, identical twins have been reported as 99.98 percent concordant, whereas we have observed unrelated patient samples as low as 50 percent concordant.”
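Concordance of this sort can be thought of as the share of genotyped positions at which two samples carry the same call. The article does not describe how Cellaria computes its figure, so the Python sketch below is only one plausible reading: it compares calls at positions typed in both samples and reports the percentage that agree. The function name, site labels, and genotype calls are all hypothetical.

```python
def snp_concordance(sample_a, sample_b):
    """Percentage of positions, among those called in both samples, with identical genotypes."""
    shared_sites = sample_a.keys() & sample_b.keys()
    if not shared_sites:
        return 0.0
    matches = sum(sample_a[site] == sample_b[site] for site in shared_sites)
    return 100.0 * matches / len(shared_sites)


# Hypothetical genotype calls at five positions (a real assay covers far more).
tumor = {"site1": "AA", "site2": "AG", "site3": "CC", "site4": "GT", "site5": "TT"}
cell_line = {"site1": "AA", "site2": "AG", "site3": "CC", "site4": "GG", "site5": "TT"}

print(f"{snp_concordance(tumor, cell_line):.1f} percent concordant")
```

With only five positions a single discordant call moves the score by 20 points; at around 100,000 positions the same measure can resolve much smaller differences.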

Cellaria is currently employing even higher-resolution analysis to track specific mutations from the original tumor through the cell model derivation process.

Hope and change

Given the strong scientific justification and low cost, has the tide begun to turn from lackadaisical indifference toward routine authentication of cells?

“Slowly, but I believe it is,” says Freedman. “Funding agencies and journals increasingly require an authentication plan but it’s not always completely clear what that involves. Researchers are more aware of the problem, and I’ve heard that companies providing STR-related services are doing well. But it’s evolving more or less on the honor system.”
