An official website of the United States government, Department of Justice.

A Hybrid Machine Learning Approach for DNA Mixture Interpretation

Award Information

Description of original award (Fiscal Year 2014, $213,372)

As submitted by the proposer: This 18-month effort will address a central problem in the field of forensic DNA analysis: mixture deconvolution. Currently, a forensic scientist performs mixture interpretation using either manual or software-supported computational methods, each requiring significant time and resources. While the forensic community has explored and implemented means such as expert systems to address the issues, these methods still have limited capabilities and require significant resources. The current focus has now turned to continuous probabilistic approaches like those used by software suites such as TrueAllele (Cybergenetics) and STRmix (Institute of Environmental Science and Research/Forensic Science South Australia). Despite innovative approaches to mixture interpretation, limitations still exist, in large part due to the overall complexity of non-pristine DNA and constraints on resources such as computational power, time, and cost. The proposed method for mixture analysis will address these limitations through the development and optimization of a hybrid machine learning approach (MLA), combining an expert system with machine learning. This approach will combine the strengths of current approaches (computational/expert analyses) with those in Data Mining and Artificial Intelligence, avoiding the weaknesses inherent in using either approach in isolation. The development and validation of this technique will focus on the analysis of human samples that have been processed in accordance with a currently accepted forensic method, PCR amplification of the thirteen core human loci. The MLA will permit mixture analyses using diverse data types, including DNA fragment data, amplification parameters, and a wide array of instrument parameters. This data-agnostic structure will allow increased flexibility in adapting to analyses of new data types, such as next-generation DNA sequence data.
The study design and implementation will consider the computational/analytical needs, sample types, quality requirements, and end utility. The design and usability will focus on requirements and limitations based on the needs of the law enforcement and criminal justice communities, specifically forensic DNA scientists, policing agencies, and the legal community. Syracuse University can facilitate this through both practical and theoretical experience within the group and through collaborations with the Onondaga County Center for Forensic Sciences and the New York City Office of the Chief Medical Examiner, Department of Forensic Biology, both accredited forensic laboratories. The MLA will enable rapid and automated deconvolution of DNA mixtures of up to four contributors with increased accuracy compared to current methods. The final product (software) will require minimal computing and financial resources and provide increasingly informative, high-confidence conclusions.
Date Created: September 11, 2014