Selecting SNPs to identify ancestry

NCJ Number

255385

Journal

Annals of Human Genetics, Volume: 75, Dated: 2011, Pages: 539-553

Author(s)

Joshua N. Sampson; Kenneth K. Kidd; Judith R. Kidd; Hongyu Zhao

Date Published

2011

Length

15 pages

Annotation

The goal of this project was to select from the millions of SNPs already identified in the human genome a small subset of SNPs that can predict ancestry with a minimal error rate.

Abstract

An individual’s genotypes at a group of Single Nucleotide Polymorphisms (SNPs) can be used to predict that individual’s ethnicity or ancestry. In medical studies, knowledge of a subject’s ancestry can minimize possible confounding, and in forensic applications, such knowledge can help direct investigations. The variable selection procedure tested in this project took a general form: estimate the expected error rate for each candidate set of SNPs using a training dataset, then consider the sets with the lowest error rates for their size. The quality of the error-rate estimate determined the quality of the selected SNPs. Since the apparent error rate (the error rate measured on the training data itself) performs poorly when either the number of SNPs or the number of populations is large, this project proposes a new estimate, the “Improved Bayesian Estimate.” This project demonstrates that selection procedures based on this estimate produce small sets of SNPs that can accurately predict ancestry. A list is provided of the 100 optimal SNPs for identifying ancestry. (publisher abstract modified)
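
The general shape of such a selection procedure can be illustrated with a minimal Python sketch. It is not the authors' method: it scores candidate SNP sets with the plain apparent error rate, which the abstract notes performs poorly at scale, and it does not reproduce the Improved Bayesian Estimate. The classifier (a simple allele-frequency likelihood with a pseudocount), the greedy forward search, and every function name and the toy data are assumptions made for illustration only.

```python
import math

def allele_freqs(genotypes, labels, n_pops, pseudo=0.5):
    """Per-population frequency of the counted allele at each SNP.
    Genotypes are coded 0/1/2; the pseudocount (an assumption) keeps
    every frequency strictly between 0 and 1."""
    n_snps = len(genotypes[0])
    freqs = []
    for k in range(n_pops):
        rows = [g for g, y in zip(genotypes, labels) if y == k]
        freqs.append([
            (sum(r[j] for r in rows) + pseudo) / (2 * len(rows) + 2 * pseudo)
            for j in range(n_snps)
        ])
    return freqs

def predict(individual, freqs, snps):
    """Assign the population with the highest log-likelihood over `snps`,
    treating the two alleles at each SNP as independent draws."""
    def loglik(p_row):
        return sum(
            individual[j] * math.log(p_row[j])
            + (2 - individual[j]) * math.log(1 - p_row[j])
            for j in snps
        )
    scores = [loglik(p_row) for p_row in freqs]
    return scores.index(max(scores))

def apparent_error(genotypes, labels, freqs, snps):
    """Fraction of training individuals misclassified using only `snps`."""
    wrong = sum(predict(g, freqs, snps) != y for g, y in zip(genotypes, labels))
    return wrong / len(labels)

def greedy_select(genotypes, labels, n_pops, n_select):
    """Forward selection: at each step add the SNP that gives the lowest
    estimated error rate for the enlarged set."""
    freqs = allele_freqs(genotypes, labels, n_pops)
    chosen, errors = [], []
    remaining = list(range(len(genotypes[0])))
    for _ in range(n_select):
        best = min(
            remaining,
            key=lambda j: apparent_error(genotypes, labels, freqs, chosen + [j]),
        )
        chosen.append(best)
        remaining.remove(best)
        errors.append(apparent_error(genotypes, labels, freqs, chosen))
    return chosen, errors

# Hypothetical toy data: only the third SNP differs between the two populations.
toy_genotypes = [[1, 1, 0], [1, 1, 0], [1, 1, 2], [1, 1, 2]]
toy_labels = [0, 0, 1, 1]
chosen, errors = greedy_select(toy_genotypes, toy_labels, n_pops=2, n_select=2)
```

Swapping `apparent_error` for a better error-rate estimate leaves the search loop unchanged, which is why the quality of the estimate drives the quality of the selected set.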

Date Published: January 1, 2011
