There have been previous efforts to statistically model the DNA extraction and amplification process while observing the effect on heterozygote balance (h), which refers to the ratio of peak heights (or areas) between the two alleles of a heterozygote. Comparisons of the predictions made by these models with empirical data are encouraging and suggest that at least the largest factors affecting the distribution of h have been identified. This paper reports on an attempt to refine the model.
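As a minimal illustration of the quantity being modeled, the sketch below computes h and the average peak height for a single heterozygous locus. The function names and the example peak heights are hypothetical. The summary does not state which allele is the numerator of the ratio, so the sketch leaves that choice to the caller.

```python
def heterozygote_balance(height_a, height_b):
    """Return h as the ratio height_a / height_b of the two allele peak heights.

    Which allele plays the role of height_a is an assumption left to the
    caller; the convention is not specified in the summarized text.
    """
    if height_a <= 0 or height_b <= 0:
        raise ValueError("peak heights must be positive")
    return height_a / height_b


def average_peak_height(height_a, height_b):
    """Return the mean of the two allele peak heights at a locus."""
    return (height_a + height_b) / 2.0


# Illustrative peak heights only (not data from the paper).
h = heterozygote_balance(900.0, 1200.0)
aph = average_peak_height(900.0, 1200.0)
print(h, aph)
```

A perfectly balanced heterozygote would give h = 1; values further from 1 indicate greater imbalance between the two peaks.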
The data collected allowed the researchers to model the joint distribution of h and the average peak height at each locus. The data analysis supports the construction of a Bayesian model that explicitly includes the variance relationship and allows intuitive inferences about the model parameters. The model is similar to that proposed by Tvedebrink et al., but the authors of the current paper take a Bayesian approach to model fitting rather than using maximum likelihood estimation. These two approaches, which yield similar results, are compared in the appendix of the paper. Through data analysis, the authors identified what they regard as a satisfactory model. The difference in the number of repeat sequences between the two alleles was found to have a significant effect on the mean of h. The variance of h was shown to decrease in inverse proportion to the average peak height at the locus. This type of modeling may not address the root sources of variation. A better result might be achieved by modeling the variation of the peak heights directly, with a constant component and a component that is proportional to the amount of template DNA. Such models are currently being considered by the authors and a number of other researchers. 4 figures, 3 tables, and 29 references.
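The variance relationship described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' fitted model: it assumes log h is normally distributed with mean zero (ignoring the repeat-difference effect on the mean) and variance k2 divided by the average peak height. The log-normal form, the function names, and the value of k2 are all hypothetical choices made for illustration.

```python
import math
import random


def log_h_variance(avg_peak_height, k2):
    """Variance of log h, assumed inversely proportional to average peak height."""
    return k2 / avg_peak_height


def simulate_h(avg_peak_height, k2, n, seed=0):
    """Draw n values of h, assuming log h ~ Normal(0, k2 / avg_peak_height)."""
    rng = random.Random(seed)
    sd = math.sqrt(log_h_variance(avg_peak_height, k2))
    return [math.exp(rng.gauss(0.0, sd)) for _ in range(n)]


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical constant k2 = 20 and two illustrative average peak heights.
low = simulate_h(200.0, 20.0, 20000)    # assumed variance of log h: 0.1
high = simulate_h(2000.0, 20.0, 20000)  # assumed variance of log h: 0.01
spread_low = sample_variance([math.log(h) for h in low])
spread_high = sample_variance([math.log(h) for h in high])
print(spread_low, spread_high)
```

Under these assumptions, a tenfold increase in average peak height gives a tenfold reduction in the spread of log h, which is the qualitative behavior the summary attributes to h.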