An official website of the United States government, Department of Justice.

Population Genetics and Statistics for Forensic Analysts

Population Databases

There are numerous approaches to the statistical interpretation of forensic DNA typing results. The approaches outlined in this module are those advocated by the 1996 National Research Council report (NRC II), and the associated formulas are provided in the Federal Bureau of Investigation's Combined DNA Index System (CODIS) software. Beyond these methods of interpretation, several books and publications propose other approaches and provide more detail on the handling of forensic DNA evidence.01-07

Read NRC II.

Population Databases

Population databases allow analysts to estimate how rare or common a DNA profile may be in a particular population. As mentioned in the last module, defining a population can be difficult. When compiling a database, it is important that the sample be large enough to capture the most common alleles and to contain enough observations of those alleles to permit the calculation of reliable frequency estimates. Ideally, a database should contain several hundred samples; the National Research Council suggested using 120 to 150 individuals, with the expectation that this will yield 240 to 300 alleles per locus (two alleles per individual).07 However, 200 alleles has become the de facto minimum size for a database.05 In populations with no existing data, it may be possible to use data from other, similar populations, depending on the context of the case.
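The relationship between the number of individuals sampled and the number of alleles available for frequency estimation can be illustrated with a short Python sketch. It assumes a single autosomal locus at which each individual contributes two alleles, and it estimates frequencies by simple counting. The function names and the genotypes are made up for illustration; they are not taken from NRC II or from any CODIS software.

```python
from collections import Counter


def alleles_sampled(num_individuals):
    """Number of alleles observed at one autosomal locus (two per individual)."""
    return 2 * num_individuals


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one locus by direct counting.

    genotypes: list of (allele_a, allele_b) tuples, one per sampled individual.
    Returns a dict mapping each allele to its observed proportion.
    """
    counts = Counter()
    for allele_a, allele_b in genotypes:
        counts[allele_a] += 1
        counts[allele_b] += 1
    total = alleles_sampled(len(genotypes))
    return {allele: n / total for allele, n in counts.items()}


# Made-up genotypes at a hypothetical STR locus (illustration only).
sample = [("8", "9"), ("9", "9"), ("8", "10"), ("9", "10")]
freqs = allele_frequencies(sample)

print(alleles_sampled(120), alleles_sampled(150))
print(freqs)
```

With only four individuals, each estimate rests on a handful of observations, which is why a database needs far more samples before its frequency estimates can be considered reliable.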

Watch a video on population databases presented by Greggory LaBerge.

The CODIS software, PopStats, has the following population databases:

  • Black
  • Asian
  • Caucasian
  • Hispanic
  • Native American

The allele frequencies for these populations were compiled for the 13 CODIS core short tandem repeat (STR) loci.08,09
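One way to picture such a collection of databases is as a nested lookup: population, then locus, then allele. The Python sketch below is only an assumption about how the data could be organized; it does not reflect the actual PopStats file format or interface. The frequency values are placeholders, not published data, and `lookup_frequency` is a hypothetical helper.

```python
# The 13 CODIS core STR loci.
CORE_LOCI = (
    "CSF1PO", "FGA", "TH01", "TPOX", "VWA",
    "D3S1358", "D5S818", "D7S820", "D8S1179",
    "D13S317", "D16S539", "D18S51", "D21S11",
)

# Population groups listed above.
POPULATIONS = ("Black", "Asian", "Caucasian", "Hispanic", "Native American")


def lookup_frequency(database, population, locus, allele):
    """Return the stored frequency for an allele, or None if it is not recorded.

    database: dict of population -> locus -> allele -> frequency.
    Raises ValueError for a population or locus outside the lists above.
    """
    if population not in POPULATIONS:
        raise ValueError(f"unknown population: {population}")
    if locus not in CORE_LOCI:
        raise ValueError(f"not a CODIS core locus: {locus}")
    return database.get(population, {}).get(locus, {}).get(allele)


# Placeholder values for illustration only; these are NOT real frequencies.
example_db = {
    "Caucasian": {"TH01": {"6": 0.25, "9.3": 0.30}},
}

print(lookup_frequency(example_db, "Caucasian", "TH01", "6"))
print(lookup_frequency(example_db, "Caucasian", "TH01", "11"))
```

Returning `None` for an unrecorded allele leaves it to the caller to decide how an allele absent from the database should be handled.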
