Ask the Expert

Interpreting Forensic Evidence with Statistics

Building statistical resources to support forensic examiners

Michelle Dotzert, PhD
Alicia Carriquiry, PhD

Alicia Carriquiry, PhD, distinguished professor and president’s chair in Statistics, and director of the Center for Statistics and Applications in Forensic Evidence (CSAFE) at Iowa State University, discusses how statistics can improve the analysis and interpretation of forensic evidence.

Q: What are the main objectives of the Center for Statistics and Applications in Forensic Evidence (CSAFE)?

A: The main goal of CSAFE is to develop statistical and computational tools to aid forensic examiners who analyze and interpret evidence in the areas of pattern and digital evidence. Pattern evidence examination in particular relies on the expert opinion of examiners, and often on the subjective assessment of a “match” between two items. At CSAFE, we aim to provide examiners with the means to quantify the similarity between items, and to offer probabilistic statements about whether two items may have a common source. A related goal is the creation of resources (databases, software) for forensic and legal professionals, for research and casework. Finally, we also work directly with labs on issues such as testing and training, and on effective methods to communicate the results of forensic analyses to the lay public.

Q: What are some of the challenges facing the forensic community?

A: A fundamental challenge is the lack of testable, statistically justifiable methods to reach conclusions in many forensic disciplines, in particular in pattern and digital disciplines. This is not due to a lack of interest among forensic professionals in including more quantitative methods in their analyses; it is just that inserting quantitative methods into the analysis of pattern and digital evidence is quite difficult, and requires the contribution of statisticians and other mathematical scientists. It has taken CSAFE, with all its resources and investigators, five years to begin to make inroads in some specific areas, and the same is true of other research groups in the US and elsewhere. Even if difficult, the work needs to get done. Forensic examiners have faced an increasing number of challenges in court in recent years. Most of the challenges seek to limit the strength of the conclusions that can be presented in a case, mostly due to the current scarcity of objective estimates of, for example, error rates in disciplines such as firearm and toolmark examination or footwear examination. These challenges will not cease until examiners can point to well-designed studies and reliable statistical methods to shore up their conclusions.

Q: How can probability and statistics be applied to pattern evidence?

A: Moving from a subjective, opinion-based assessment to a more objective, measurement-based approach to evaluate pattern evidence is challenging. There are many reasons for this:

  • Pattern evidence often takes the form of an image, with thousands of pixels, so standard statistical methods are not immediately applicable.
  • There is no general agreement about the features that an examiner ought to focus on during her evaluation, nor on how to measure those features. A possible exception is latent print examination, but even there, there is no agreed-upon number and type of features (minutiae) that must be present for the examiner to reach different conclusions.
  • There is little data about the frequencies with which each attribute is observed in a population of interest. Think, in contrast, about DNA, where there are databases of allelic frequencies among different sub-populations. These population databases permit the calculation of random match probabilities. There is no such thing for attributes in latent prints, or shoe outsoles, or any other type of pattern evidence.
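To make the DNA contrast concrete, here is a minimal sketch of how a random match probability can be computed once allele frequencies are available. It assumes Hardy-Weinberg proportions within each locus and independence across loci, and it ignores the population-substructure corrections used in real casework. The locus names, allele labels, and frequencies are hypothetical values chosen only for illustration.

```python
from math import prod

# Hypothetical allele frequencies for three loci (illustrative values only,
# not taken from any real population database).
ALLELE_FREQS = {
    "locus_A": {"12": 0.10, "14": 0.20},
    "locus_B": {"8": 0.25, "9": 0.05},
    "locus_C": {"17": 0.30},
}

# Hypothetical evidence profile: the pair of alleles observed at each locus.
PROFILE = {
    "locus_A": ("12", "14"),  # heterozygote
    "locus_B": ("8", "8"),    # homozygote
    "locus_C": ("17", "17"),  # homozygote
}


def genotype_frequency(freqs, genotype):
    """Expected genotype frequency under Hardy-Weinberg proportions:
    p^2 for a homozygote, 2pq for a heterozygote."""
    a, b = genotype
    p, q = freqs[a], freqs[b]
    return p * p if a == b else 2 * p * q


def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies, assuming independent loci."""
    return prod(
        genotype_frequency(allele_freqs[locus], genotype)
        for locus, genotype in profile.items()
    )


rmp = random_match_probability(PROFILE, ALLELE_FREQS)
print(f"Random match probability: {rmp:.6g} (about 1 in {1 / rmp:,.0f})")
```

The calculation is only possible because the frequency table exists. For latent prints or shoe outsoles there is currently nothing to put in its place.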

Probability and statistics have several roles to play. First, they can inform the type and size of databases that need to be assembled to begin understanding population-level frequencies of important attributes. Second, statisticians can help develop methods to quantify the similarities and the differences between two items, and assess the significance of an observed level of similarity. So, for example, when comparing striations in two bullets, not only can we compute a similarity score to measure how “close” they are, but also estimate how likely it would be to observe that degree of similarity under the two competing hypotheses of same gun or different gun. To get to that point, however, statisticians and forensic professionals need to agree on a set of features or measurements that can be obtained from the various types of pattern evidence. One thing that is important to understand is that moving to a probabilistic framework eliminates the possibility of statements such as “100 percent certain” or “to the exclusion of all other shoes” or “to the degree of scientific certainty.” In the world of science and probability, there is always the need to quantify and report uncertainty.
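The bullet comparison described above can be sketched as a score-based likelihood ratio: how likely is the observed similarity score if the bullets came from the same gun, relative to how likely it is if they came from different guns? The sketch below is an illustration under stated assumptions, not CSAFE's method. The reference scores are made up, the scores are assumed to lie between 0 and 1, and a normal distribution is assumed for each group purely for simplicity.

```python
from statistics import NormalDist

# Hypothetical similarity scores from reference comparisons where the
# ground truth is known (illustrative values only).
SAME_SOURCE_SCORES = [0.81, 0.77, 0.90, 0.85, 0.72, 0.88, 0.79, 0.83]
DIFF_SOURCE_SCORES = [0.22, 0.35, 0.18, 0.41, 0.30, 0.27, 0.38, 0.25]


def score_likelihood_ratio(score, same_scores, diff_scores):
    """Ratio of the density of `score` under the same-source model to its
    density under the different-source model.

    Assumption: each group of scores is modeled as a normal distribution
    fitted to the reference data. Real score distributions may be far from
    normal and would need a more careful model.
    """
    same_model = NormalDist.from_samples(same_scores)
    diff_model = NormalDist.from_samples(diff_scores)
    return same_model.pdf(score) / diff_model.pdf(score)


for observed in (0.80, 0.55, 0.30):
    lr = score_likelihood_ratio(observed, SAME_SOURCE_SCORES, DIFF_SOURCE_SCORES)
    print(f"score={observed:.2f}  likelihood ratio={lr:.3g}")
```

A ratio well above 1 supports the same-gun hypothesis, and a ratio well below 1 supports the different-gun hypothesis. Neither amounts to a statement of certainty, and the result depends heavily on how representative the reference scores are.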

Q: How can probability and statistics be applied to digital evidence?

A: Many of the issues that arise in pattern evidence also arise in digital evidence. An important difference, however, is that in the field of digital forensics the questions are far more numerous and varied, and formulating those questions precisely enough is already part of the challenge. At CSAFE, we have a digital forensic program that is smaller than the program on pattern evidence. For now, we have carried out research on focused questions. Some examples are:

  • Can we tell individuals apart by looking at the time and duration of events such as webpage visits?
  • Can we detect the presence of “payload” in a .jpg file? This is an area known as steganalysis and can help find individuals engaged in, for example, sharing child pornography and other items hidden in innocent-looking images.
  • Can we associate dark web user IDs with certain individuals? Persons who engage in illicit transactions on the dark web often use more than one ID and associated accounts, and identifying whether the same person is behind more than one account can be important.

In many cases, investigators wish to inventory files, storage, and transmissions in a mobile device. But it is difficult for the examiner to determine whether or not she has found everything that was once loaded on the device. A new tool developed by CSAFE called EVIHunter has catalogued over 10,000 apps and produced a listing of the files, their locations, and the connections that are associated with those apps. The tool is expected to help examiners find what they need more efficiently. While there was no probability or statistics required to develop the tool, the statistical question now is whether certain types of files or apps can be indicators of certain types of crime.

Q: What techniques and technologies have made this work possible? How do you expect ongoing technological advances to impact this work?

A: CSAFE benefits from the tremendous expertise of internationally recognized statisticians, computer and software scientists, mathematicians, criminologists, and other scientists. The broad body of knowledge that those scientists bring to the table enables us to think creatively about the questions brought forth by forensic professionals. We rely on the newer tools of machine learning and algorithms together with the traditional statistical ideas of sampling, estimation, hypothesis testing, and design of experiments. Ongoing technological advances contribute to an improvement in the accuracy and precision with which we can measure attributes in pattern and digital evidence, but good measurements are not enough to answer forensic questions. Very precise measurements have existed in areas such as trace evidence examination, but the statistical methodology to draw conclusions from those precise measurements is still in development. The message here is that we should not conflate good measurement with good statistics; good measurements are necessary but not sufficient for us to answer whether, for example, two bullets were fired from the same gun barrel. For that, we also need to estimate the probability that we would observe two very similar sets of striations even if bullets were fired from different guns. This is what statistics can help with.
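One simple way to approach that last probability is to count how often known different-gun comparisons in a reference study reach at least the observed level of similarity, and to report an interval around that proportion rather than a single number. The sketch below assumes a hypothetical set of reference scores and uses the standard Wilson score interval for a proportion. It is an illustration of the idea, not a validated procedure.

```python
from math import sqrt

# Hypothetical similarity scores for bullet pairs known to come from
# different guns (illustrative values only).
DIFF_GUN_SCORES = [
    0.22, 0.35, 0.18, 0.41, 0.30, 0.27, 0.38, 0.25, 0.33, 0.29,
    0.62, 0.19, 0.31, 0.36, 0.24, 0.28, 0.44, 0.21, 0.26, 0.34,
]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def coincidental_match_rate(observed_score, diff_source_scores):
    """Proportion of different-source pairs scoring at least as high as the
    observed score, together with a Wilson interval for that proportion."""
    n = len(diff_source_scores)
    hits = sum(s >= observed_score for s in diff_source_scores)
    return hits / n, wilson_interval(hits, n)


rate, (low, high) = coincidental_match_rate(0.60, DIFF_GUN_SCORES)
print(f"Estimated rate: {rate:.3f}, 95% interval: ({low:.3f}, {high:.3f})")
```

With only 20 reference pairs the interval is wide, which is the point: precise measurements of each bullet do not by themselves tell us how rare a given degree of similarity is among different guns.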

Alicia Carriquiry, PhD, is distinguished professor and president’s chair in Statistics, and director of the Center for Statistics and Applications in Forensic Evidence (CSAFE) at Iowa State University. CSAFE is a National Institute of Standards and Technology Center of Excellence. Carriquiry is also a fellow of most of the major statistical associations in the United States and worldwide. She is an elected member of the National Academies and a fellow of the AAAS. In 2018, she became a technical advisor for the Association of Firearm and Tool Mark Examiners, and in 2020 was elected associate member of the American Academy of Forensic Sciences.