An official website of the United States government, Department of Justice.

Probabilistic Characterisation of Baseline Noise in STR Profiles

NCJ Number
Date Published
November 2015
U. L. Monich, K. Duffy
Three dominant factors distort short tandem repeat (STR) profile measurements. Two of them, stutter and variation in allelic peak heights, have been described extensively; this study characterizes the third, baseline noise.
A probabilistic characterization of the non-allelic noise peaks is not only inherently useful for statistical inference but is also significant for establishing a detection threshold. The study analyzed data from 643 single-person profiles for the Identifiler Plus kit and 303 for the PowerPlex 16 HS kit. The investigation found that although dye color is a significant factor, a per-dye-color description of the noise is not sufficient. Furthermore, the study shows that on a per-locus basis, out of the Gaussian, log-normal, and gamma distribution classes, baseline noise is best described by log-normal distributions, and it provides a methodology for setting an analytical threshold based on that finding. In the PowerPlex 16 HS kit, there was evidence of significant stutter at two repeat units shorter than the allelic peak, which has implications for the definition of baseline noise and for signal interpretation. In general, the DNA input mass influences the noise distribution. It is therefore advisable to study noise, and consequently to infer quantities such as the analytical threshold, from data with a DNA input mass comparable to that of the samples to be analyzed. (Publisher abstract modified)
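As a rough illustration of how a log-normal noise model can lead to an analytical threshold, the sketch below fits a log-normal distribution to noise peak heights at a single locus and takes a high quantile of the fit as the threshold. The abstract does not describe the study's actual procedure, so the quantile rule, the `false_detection_rate` parameter, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fit_lognormal(heights):
    """Maximum-likelihood log-normal fit: mean and std of the log heights.

    Heights must be strictly positive (e.g. peak heights in RFU).
    """
    logs = [math.log(h) for h in heights]
    return fmean(logs), pstdev(logs)


def analytical_threshold(heights, false_detection_rate=0.001):
    """Height that a noise peak exceeds with probability false_detection_rate
    under the fitted log-normal model (an assumed rule, not the paper's)."""
    mu, sigma = fit_lognormal(heights)
    z = NormalDist().inv_cdf(1.0 - false_detection_rate)
    return math.exp(mu + sigma * z)


# Synthetic stand-in for noise peak heights at one locus; not real data.
rng = random.Random(0)
noise_rfu = [rng.lognormvariate(1.5, 0.4) for _ in range(5000)]

mu_hat, sigma_hat = fit_lognormal(noise_rfu)
threshold = analytical_threshold(noise_rfu)
print(f"mu={mu_hat:.3f} sigma={sigma_hat:.3f} threshold={threshold:.2f} RFU")
```

In keeping with the abstract's findings, a fit of this kind would be done per locus, on noise data collected at a DNA input mass comparable to that of the samples being analyzed.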
Date Created: May 7, 2017