Better algorithms and chemistry for mixture interpretation

Award Information

Award #
Funding Category: Competitive Discretionary
Congressional District
Funding First Awarded
Total funding (to date)

Description of original award (Fiscal Year 2018, $694,525)

Many casework samples contain low-template DNA and/or are mixtures with a minor contributor. These samples can pose substantial interpretational challenges that can result in the loss of valuable investigative information. Part of the challenge is due to the PCR process, which generates billions of molecules from as little as one molecule (or a few molecules), transforming a weak signal into a strong one. The PCR process, however, adds a nontrivial amount of noise, and that noise is perpetuated with each amplification cycle. The difficulty of distinguishing inherent PCR noise from true DNA signal is a major impediment to the analysis of challenged samples. Massively parallel sequencing (MPS) enables analysis at the primary level of DNA variation and thus should enable better resolution of signal from noise than capillary electrophoresis (CE). To understand the effects of noise on DNA interpretation, high-quality single-source samples should be analyzed first. Once the noise is properly understood and modeled, mixtures containing minor contributors can be assessed to perfect a model that better distinguishes noise from signal and allows the full power of MPS for mixture interpretation to be realized. Improved predictions can be pursued in two ways. The first is bioinformatic: reanalysis of data with better algorithms and more appropriate statistical methods. More complete models can be created, leading to better inferences on extant data. The second strategy involves augmentation, wherein better data are collected to address the problem. These augmented data are used for the same predictive goals, perhaps with more appropriate algorithms. The first solution involves little additional chemistry and is backwards compatible, while the second solution necessarily involves additional chemistry and is a forward-in-time solution.
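As a rough illustration of how PCR noise is carried forward from cycle to cycle, the following Python sketch tracks the expected number of molecules at each STR repeat length. It is a toy model and not the project's method: the function name, the per-cycle efficiency, and the per-copy stutter rate are all assumed values chosen for illustration, and only single-repeat back stutter is modeled.

```python
def expected_pcr_products(start_repeats, cycles, efficiency=0.9, stutter_rate=0.001):
    """Expected molecule counts per repeat length after PCR (toy model).

    Assumptions (hypothetical, for illustration only):
      - amplification starts from a single template molecule;
      - each cycle, every molecule is copied with probability `efficiency`;
      - each new copy loses one repeat unit with probability `stutter_rate`;
      - stutter products are themselves copied in later cycles.
    """
    counts = {start_repeats: 1.0}
    for _ in range(cycles):
        new_counts = dict(counts)  # existing molecules persist
        for repeats, n in counts.items():
            copies = n * efficiency
            # A one-repeat allele cannot lose a repeat in this toy model.
            slipped = copies * stutter_rate if repeats > 1 else 0.0
            new_counts[repeats] = new_counts.get(repeats, 0.0) + copies - slipped
            if slipped:
                new_counts[repeats - 1] = new_counts.get(repeats - 1, 0.0) + slipped
        counts = new_counts
    return counts


def stutter_ratio(counts, parent_repeats):
    """Ratio of the n-1 stutter product to the parent allele."""
    return counts.get(parent_repeats - 1, 0.0) / counts[parent_repeats]
```

Under these assumptions the n-1 stutter ratio grows with every cycle, because stutter molecules made early are amplified alongside the true allele; comparing `stutter_ratio(expected_pcr_products(12, 10), 12)` with the same call at 28 cycles shows the trend. Real amplification is stochastic, particularly at low template, so an expected-value model like this one understates the variability seen in casework samples.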
We propose to model stutter and other noise to better assess mixtures, first by modeling extant data and second by using unique molecular identifiers (UMIs) to analyze stutter products and copying errors and to analyze mixtures directly. UMIs target the original template DNA molecules, marking each with a unique molecular tag that is perpetuated throughout the PCR. Thus, errors introduced during the PCR and sequencing processes go from being opaque to transparent. Similar approaches have been successfully applied to somatic mutation detection at frequencies as low as 10^-7 (though 10^-3 is perhaps more appropriate), with such variants representing the ultimate “minor contributor.” Applied to forensically relevant STR markers, these techniques will yield better descriptions and predictions of stutter and noise. Algorithms will be developed to facilitate mixture interpretation of MPS-generated data.
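The way a UMI makes PCR and sequencing errors visible can be sketched in a few lines of Python. This is a minimal illustration and not the project's algorithm: the function names, the thresholds, and the read representation (a UMI string paired with an observed STR repeat count) are assumptions made for the example. Reads sharing a UMI descend from one original template, so a majority vote within each UMI family recovers that template's allele, and disagreeing reads within the family can be attributed to stutter or copying error.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3, min_agreement=0.6):
    """Collapse reads into one consensus allele per UMI family.

    reads: iterable of (umi, allele) pairs (hypothetical representation).
    Families smaller than `min_family_size`, or whose most common allele
    falls below `min_agreement`, are discarded as unreliable.
    Both thresholds are illustrative assumptions.
    """
    families = defaultdict(Counter)
    for umi, allele in reads:
        families[umi][allele] += 1

    consensus = {}
    for umi, alleles in families.items():
        size = sum(alleles.values())
        if size < min_family_size:
            continue
        allele, count = alleles.most_common(1)[0]
        if count / size >= min_agreement:
            consensus[umi] = allele
    return consensus


def template_counts(consensus):
    """Count original template molecules supporting each allele."""
    return Counter(consensus.values())


# Example: two templates of allele 12, one of allele 9 (a minor contributor),
# plus stutter reads (11, 8) and a singleton family that is dropped.
reads = [
    ("AAC", 12), ("AAC", 12), ("AAC", 12), ("AAC", 11),
    ("GTT", 12), ("GTT", 12), ("GTT", 12),
    ("CCA", 9), ("CCA", 9), ("CCA", 9), ("CCA", 8),
    ("TTG", 11),
]
counts = template_counts(umi_consensus(reads))
```

In this example the stutter reads at 11 and 8 repeats are absorbed into their families, while the minor allele 9 survives because it has its own UMI family. A production pipeline would also need to handle sequencing errors within the UMI itself and UMI collisions, which this sketch ignores.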

This project contains a research and/or development component, as defined in applicable law, and complies with Part 200 Uniform Requirements - 2 CFR 200.210(a)(14).


Date Created: September 27, 2018