An official website of the United States government, Department of Justice.

Full mtGenome Reference Data: Development and Characterization of 588 Forensic-quality Haplotypes representing three US populations

Date Published: January 2015
Length: 15 pages
Although investigations into the use of massively parallel sequencing technologies for the generation of complete mitochondrial genome (mtGenome) profiles from difficult forensic specimens are well underway in multiple laboratories, the high-quality population reference data necessary to support full mtGenome typing in the forensic context are lacking. To address this deficiency, the current study developed 588 complete mtGenome haplotypes, spanning three U.S. population groups (African-American, Caucasian, and Hispanic), from anonymous, randomly sampled specimens.
Data production used a Sanger-based protocol of 8 amplicons and 135 sequencing reactions, performed in semi-automated fashion on robotic instrumentation. Data review followed an intensive multi-step strategy that included a minimum of three independent reviews of the raw data at two laboratories; repeat screenings of all insertions, deletions, heteroplasmies, transversions, and any additional private mutations; and a check for phylogenetic feasibility. For all three populations, nearly complete resolution of the haplotypes was achieved with full mtGenome sequences: 90.3–98.8 percent of haplotypes were unique per population, an improvement of 7.7–29.2 percent over control region sequencing alone, and zero haplotypes overlapped between populations. Inferred maternal biogeographic ancestry frequencies for each population and heteroplasmy rates in the control region were generally consistent with published datasets. In the coding region, nearly 90 percent of individuals exhibited length heteroplasmy in the 12418-12425 adenine homopolymer. Despite a relatively high rate of point heteroplasmy (23.8 percent of individuals across the entire molecule), coding region point heteroplasmies shared by more than one individual were notably absent, and transversion-type heteroplasmies were extremely rare. The ratio of non-synonymous to synonymous changes among point heteroplasmies in the protein-coding genes (1:1.3) and the average pathogenicity scores, when compared with data reported for complete substitutions in previous studies, seem to provide some additional support for the role of purifying selection in the evolution of the human mtGenome. Overall, these thoroughly vetted full mtGenome population reference data can serve as a standard against which the quality and features of future mtGenome datasets (especially those developed via massively parallel sequencing) may be evaluated.
These data will provide a solid foundation for the generation of complete mtGenome haplotype frequency estimates for forensic applications. (Publisher abstract modified)
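As an illustration of the resolution figures quoted above, the following minimal Python sketch shows one way a "percent unique haplotypes" value and a between-population overlap check could be computed. It is not the study's method. It assumes "unique" means a distinct haplotype observed in exactly one sampled individual, and it represents each haplotype as a plain string of differences from a reference sequence. The function names and the toy data are hypothetical.

```python
from collections import Counter


def unique_haplotype_share(haplotypes):
    """Return the fraction of distinct haplotypes seen in exactly one individual.

    Assumption: "unique" means a haplotype observed once in the sample.
    `haplotypes` holds one hashable haplotype label per individual.
    """
    counts = Counter(haplotypes)
    if not counts:
        raise ValueError("no haplotypes supplied")
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)


def shared_haplotypes(pop_a, pop_b):
    """Return the set of haplotypes observed in both population samples."""
    return set(pop_a) & set(pop_b)


# Hypothetical toy data, not taken from the study.
pop_a = ["73G 263G", "73G 263G", "152C 263G", "16519C"]
pop_b = ["146C 263G", "195C 16189C"]

share = unique_haplotype_share(pop_a)
overlap = shared_haplotypes(pop_a, pop_b)
print(f"unique share in pop_a: {share:.1%}")
print(f"haplotypes shared between samples: {len(overlap)}")
```

Running the same calculation once on control region haplotypes and once on full mtGenome haplotypes for the same individuals would give the kind of comparison the abstract reports, although the exact definition used in the study may differ from the one assumed here.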