Award Information
Description of original award (Fiscal Year 2023, $513,529)
Short tandem repeats (STRs) have been extensively studied and are routinely used as human identity markers in forensic genetics. Even the most comprehensive STR multiplexes used today are limited in their ability to determine the number of contributors to a DNA mixture and to assign genotypes to each of them. The difficulty in separating individuals in DNA mixtures is a consequence of overlapping length-based alleles genotyped using the polymerase chain reaction (PCR) coupled with capillary electrophoresis (CE). In a given DNA mixture, it is challenging to distinguish minor alleles (i.e., the alleles originating from a minor contributor) from stutter and from stochastic effects such as inherent heterozygote peak imbalance and undetected alleles (drop-out). Therefore, we can either adapt to complex interpretation methods or pursue new approaches to DNA mixture deconvolution. One promising area of research is the incorporation of additional highly polymorphic STR loci that complement current multiplexes and offer greater potential to improve the analysis of DNA mixtures. We propose to build on the data already generated from a past effort (2015-DN-BX-K067) and to use massively parallel sequencing (MPS) to generate and characterize new STR candidates. MPS captures both the unique sequence (within and around the repeat region) and the nominal length of STR amplicons, thereby increasing the power of discrimination of each locus. Further, MPS data provide an estimate of the quantity and quality of each component contributor in sequenced mixed DNA samples, because each unique sequence in the mixture is reported in terms of coverage (or read depth; i.e., the total number of times an amplicon was sequenced).
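The contrast between length-based and sequence-based genotyping described above can be illustrated with a minimal sketch. The reads, the TCTA motif, and the function names below are hypothetical and are not taken from the project; the example ignores stutter, noise, and drop-out, so the coverage fractions are only a rough indication of relative contribution.

```python
from collections import Counter

# Hypothetical reads at a single tetranucleotide STR locus (illustrative only).
# Both sequences span 10 repeat units (40 bp), but one carries a variant
# repeat (TCTG) inside the repeat region.
SEQ_A = "TCTA" * 10
SEQ_B = "TCTA" * 5 + "TCTG" + "TCTA" * 4
reads = [SEQ_A] * 80 + [SEQ_B] * 20


def length_alleles(reads, motif_len=4):
    """CE-style view: alleles are distinguished only by length in repeat units."""
    return Counter(len(read) // motif_len for read in reads)


def sequence_alleles(reads):
    """MPS-style view: alleles are distinguished by their full sequence."""
    return Counter(reads)


def coverage_fractions(counts):
    """Share of total read depth observed for each allele."""
    total = sum(counts.values())
    return {allele: depth / total for allele, depth in counts.items()}


by_length = length_alleles(reads)
by_sequence = sequence_alleles(reads)
fractions = coverage_fractions(by_sequence)

print("Length-based alleles:", dict(by_length))
print("Sequence-based allele count:", len(by_sequence))
for allele, share in fractions.items():
    print(f"{allele}: {share:.2f} of reads")
```

In this toy case the length-based view reports a single allele (10), while the sequence-based view separates two alleles of identical length and reports the read depth of each.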
Using the autosomal datasets from 2015-DN-BX-K067 as well as ongoing in-house research efforts, this study will propose a novel workflow, compatible with MPS-based platforms, for genotyping common casework samples to enhance DNA mixture deconvolution. Each marker will be extensively vetted using known reference genomes (i.e., Genome in a Bottle (GIAB); SRM reference materials) and North American population samples. Further, noise, stochastic effects, and stutter will be modeled for each locus using a comprehensive series of sensitivity and specificity assays coupled with advanced bioinformatics modeling. Finally, likelihood ratio (LR) estimates from semi-continuous software will be used to assess the feasibility of extending probabilistic genotyping to both new markers (the proposed STRs) and new platforms (i.e., MPS) as a forensic tool for complex mixture deconvolution. CA/NCF