Machine learning refers to the development of systems that can learn from data. After exposure to an initial set of "training" data, a machine learning algorithm can evaluate new, previously unseen examples and relate them to that training data. Machine learning is well suited to classification problems that involve implicit patterns, and it is most effective when large amounts of data are available. Although machine learning has not previously been used in DNA mixture analysis, it is well suited to such analysis because of two key characteristics of the problem. First, there is a large repository of human DNA mixture data in electronic format. Second, patterns in such data are often obscure and beyond the capability of manual analysis; however, they can be statistically evaluated by using one or more machine learning algorithms. The system was trained, tested, and validated using electronic data obtained from 1,405 non-simulated DNA mixture samples composed of 1 to 4 contributors and generated from combinations of 16 individuals. This report concludes that the proposed method for DNA mixture deconvolution, including determination of the number of contributors, is robust and reproducible, and that it was developed using an expansive set of data generated with the AmpFlSTR Identifiler PCR Amplification Kit. The description of materials and methods covers data acquisition and exportation, the locus-sample-specific threshold (LSST) calculation, data partitioning, feature scaling, feature selection, and machine learning algorithms. A more detailed discussion of the optimized system will be provided in the Final Report. 10 figures, 8 tables, and 21 references.
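The data partitioning, feature scaling, and classification steps named above can be illustrated with a minimal sketch. It is not the report's method: the two features (a maximum allele count and a mean allele count per locus), the synthetic values, and the nearest-centroid classifier are all assumptions chosen for illustration, and the LSST calculation and feature selection steps are omitted.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Partition sample indices so each class keeps a similar share in both parts."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_label):
        idx = by_label[y][:]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx

def fit_min_max(rows):
    """Learn each feature's minimum and range from the training rows only."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return mins, spans

def scale(row, mins, spans):
    """Apply min-max scaling with previously learned parameters."""
    return [(v - lo) / s for v, lo, s in zip(row, mins, spans)]

def fit_centroids(rows, labels):
    """Compute the mean feature vector of each class."""
    grouped = defaultdict(list)
    for row, y in zip(rows, labels):
        grouped[y].append(row)
    return {y: [sum(c) / len(c) for c in zip(*g)] for y, g in grouped.items()}

def predict(row, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])),
    )

# Synthetic, hypothetical data: 20 samples for each of 1 to 4 contributors.
rng = random.Random(1)
features, labels = [], []
for n_contributors in range(1, 5):
    for _ in range(20):
        max_alleles = 2 * n_contributors + rng.uniform(-0.4, 0.4)
        mean_alleles = 1.5 * n_contributors + rng.uniform(-0.3, 0.3)
        features.append([max_alleles, mean_alleles])
        labels.append(n_contributors)

train_idx, test_idx = stratified_split(labels)
mins, spans = fit_min_max([features[i] for i in train_idx])
train_x = [scale(features[i], mins, spans) for i in train_idx]
test_x = [scale(features[i], mins, spans) for i in test_idx]
centroids = fit_centroids(train_x, [labels[i] for i in train_idx])

predictions = [predict(x, centroids) for x in test_x]
correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
accuracy = correct / len(test_idx)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The scaling parameters are fitted on the training partition alone and then reused for the held-out samples, so no information from the test data leaks into the model. Real mixture data would be far less cleanly separated than these synthetic classes.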
Similar Publications
- ILIAD: A Suite of Automated Snakemake Workflows for Processing Genomic Data for Downstream Applications
- Development and Validation of a Method for Analysis of 25 Cannabinoids in Oral Fluid and Exhaled Breath Condensate
- Superhydrophobic Surface Modification of Polymer Microneedles Enables Fabrication of Multimodal Surface-Enhanced Raman Spectroscopy and Mass Spectrometry Substrates for Synthetic Drug Detection in Blood Plasma