Score-based approaches for computing forensic likelihood ratios for handwriting evidence are becoming more prevalent in the literature; this article examines three such approaches in detail.
When two items of evidential value are compared via a score function, several nuances arise when modeling the score's behavior under the competing source-level propositions. Specific assumptions must be made in order to appropriately model the numerator and denominator probability distributions. This process is straightforward for the numerator of the score-based likelihood ratio: it entails generating a database of scores obtained by pairing items of evidence from the same source. The denominator, however, presents ambiguities; in particular, how best to generate a database of scores between two items from different sources.

The three score-based approaches reviewed here differ in how they generate denominator databases, pairing (1) the item of known source with randomly selected items from a relevant database; (2) the item of unknown source with randomly selected items from a relevant database; or (3) two randomly selected items. When the two items differ in type, perhaps one having higher information content, these three alternatives can produce different denominator databases. Although each alternative has appeared in the literature, the decision on how to generate the denominator database is often made without acknowledging the subjective nature of this choice.

This project compared the three methods and the resulting score-based likelihood ratios, which can be viewed as three distinct interpretations of the denominator proposition. The goal of these comparisons was to illustrate the effect that subtle modifications of the propositions can have on inferences drawn from the evidence-evaluation procedure. The study used a data set of cursive writing samples from over 400 writers.
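The three denominator-database constructions can be sketched with a toy simulation. This is a hypothetical illustration only, not the paper's actual score function, density model, or data: the scalar feature, the absolute-difference score, and the normal density fit are all assumptions made for brevity.

```python
# Hypothetical sketch of three denominator-database constructions for a
# score-based likelihood ratio (SLR). Not the paper's implementation.
import math
import random
import statistics

random.seed(0)

def score(a, b):
    # Toy score: absolute difference between two writing-feature measurements.
    return abs(a - b)

# Simulated background population: each writer has a true feature value,
# and items from that writer are noisy observations of it.
writers = [random.gauss(0.0, 1.0) for _ in range(200)]

def item(mu):
    return random.gauss(mu, 0.2)

# Evidence: a known-source item (writer 0) and a questioned item
# (here simulated from the same writer).
known = item(writers[0])
questioned = item(writers[0])
s_obs = score(known, questioned)

# Numerator database: scores between pairs of items from the same source.
num_db = [score(item(mu), item(mu)) for mu in writers for _ in range(5)]

# Three denominator databases (the three interpretations compared above):
# (1) the known-source item paired with items from randomly selected writers;
den1 = [score(known, item(mu)) for mu in writers[1:]]
# (2) the questioned item paired with items from randomly selected writers;
den2 = [score(questioned, item(mu)) for mu in writers[1:]]
# (3) pairs of items from two randomly selected, distinct writers.
den3 = []
for _ in range(500):
    mu_a, mu_b = random.sample(writers, 2)
    den3.append(score(item(mu_a), item(mu_b)))

def normal_pdf(x, data):
    # Crude density model: fit a normal distribution to a score database.
    mu, sd = statistics.mean(data), statistics.stdev(data)
    return math.exp(-((x - mu) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

# One SLR per denominator construction; they generally differ.
slrs = [normal_pdf(s_obs, num_db) / normal_pdf(s_obs, den)
        for den in (den1, den2, den3)]
print(slrs)
```

Because the same observed score is divided by three different denominator densities, the three SLRs can disagree, which is the sensitivity the study quantifies.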
The study found that, when provided with the same two items of evidence, the three methods often led to differing conclusions, with rates of disagreement ranging from 0.005 to 0.48. Rates of misleading evidence and Tippett plots were both used to characterize the range of behavior of the methods over questioned documents of varying size. The appendix shows that the three score-based likelihood ratios are theoretically different, not only from each other but also from the likelihood ratio; consequently, each displayed different behavior. (publisher abstract modified)