This paper describes an efficient strategy for identifying and characterizing single nucleotide polymorphisms (SNPs) that show little allele frequency variation among populations while remaining highly informative, and it reports on testing this strategy on a broad representation of world populations.
The narrow range in the distribution of the average match probability across populations validates the low-Fst strategy for identifying SNPs for use in forensic human identification. Although Fst depends on the specific set of populations studied, it is clear that a global set of DNA samples can be used to screen for markers with globally uniform Fst values. Selecting markers based on low Fst has the additional benefit of minimizing any differential effect that balancing selection in a particular population or geographical region may have. With low-Fst SNPs, whatever balancing selection may exist at any SNP must exist in all populations. The strategy consists of four steps. First, likely candidate polymorphisms are identified. Second, these are screened on a few populations. Third, the best of these markers are tested on many populations. Fourth, the "best of the best" markers are retained. The "best of the best" are those with the highest average heterozygosity and lowest variation among populations, which are the most likely to be useful for individual forensic identification. Fst is used as a standardized measure of the variance in allele frequencies among populations. Regarding methodology, this report addresses screening criteria, marker typing, and analytic methods. 5 tables, 6 figures, and 23 references.
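
The selection criteria summarized above can be illustrated with a minimal sketch. It assumes biallelic SNPs, Hardy-Weinberg proportions within each population, and Fst computed as the variance of allele frequencies among populations divided by p(1 - p) at the mean frequency. The function names, the example frequencies, and the cutoffs `max_fst` and `min_het` are hypothetical and are not taken from this abstract; the report's own analytic methods may differ.

```python
from statistics import mean, pvariance


def heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Standardized variance of allele frequencies among populations.

    Assumption: Fst = Var(p) / (p_bar * (1 - p_bar)), with equal weight
    given to every population.
    """
    p_bar = mean(freqs)
    if p_bar in (0.0, 1.0):  # monomorphic everywhere: no variation to measure
        return 0.0
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


def match_probability(p):
    """Chance that two random individuals share a genotype at one SNP.

    Assumption: Hardy-Weinberg genotype frequencies p^2, 2pq, q^2.
    """
    q = 1.0 - p
    return p ** 4 + (2.0 * p * q) ** 2 + q ** 4


def rank_markers(markers, max_fst=0.06, min_het=0.4):
    """Keep markers with low Fst and high average heterozygosity.

    `markers` maps a marker name to its per-population allele frequencies.
    The default cutoffs are illustrative placeholders only.
    """
    kept = []
    for name, freqs in markers.items():
        avg_het = mean(heterozygosity(p) for p in freqs)
        f = fst(freqs)
        if f <= max_fst and avg_het >= min_het:
            kept.append((-avg_het, f, name))
    # Highest average heterozygosity first, then lowest Fst.
    return [name for _, _, name in sorted(kept)]


# Hypothetical allele frequencies in three populations.
example = {
    "snpA": [0.5, 0.5, 0.5],
    "snpB": [0.1, 0.9, 0.5],
    "snpC": [0.45, 0.5, 0.55],
}
selected = rank_markers(example)
```

In this made-up example, `snpB` is dropped because its allele frequency varies widely among populations, even though its mean frequency is the same as that of the other two markers. Under an additional assumption of independence among loci, a multi-SNP match probability would be the product of the per-SNP values.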