This study measured face identification accuracy for an international group of professional forensic facial examiners working under circumstances that apply in real-world casework.
Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification, using people and/or machines working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, the current study found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs) developed between 2015 and 2017 identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, researchers fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. Single forensic facial examiners fused with the best algorithm were more accurate than the combination of two examiners; therefore, collaboration among humans and between humans and machines offers tangible benefits for face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible. (publisher abstract modified)
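The fusion step described in the abstract, averaging rating-based identity judgments across examiners, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the rating scale (-3 for "different people" to +3 for "same person"), the function names `fuse_ratings` and `auc`, the use of area under the ROC curve as the accuracy measure, and the toy ratings, which are invented and are not study data.

```python
from statistics import mean


def fuse_ratings(ratings_by_rater):
    """Fuse judgments by averaging each image pair's ratings across raters.

    ratings_by_rater: one list of ratings per rater, all in the same
    image-pair order (hypothetical scale: -3 = different, +3 = same).
    """
    n_pairs = len(ratings_by_rater[0])
    if any(len(r) != n_pairs for r in ratings_by_rater):
        raise ValueError("every rater must rate the same image pairs")
    return [mean(pair) for pair in zip(*ratings_by_rater)]


def auc(scores, same_identity):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen same-identity pair
    receives a higher score than a randomly chosen different-identity
    pair, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, same_identity) if y]
    neg = [s for s, y in zip(scores, same_identity) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: six image pairs, the first three show the same person.
same_identity = [True, True, True, False, False, False]
rater_a = [3, -1, 2, 1, -3, -2]   # hypothetical examiner A
rater_b = [-1, 2, 1, -2, 0, -3]   # hypothetical examiner B

fused = fuse_ratings([rater_a, rater_b])
auc_a = auc(rater_a, same_identity)
auc_b = auc(rater_b, same_identity)
auc_fused = auc(fused, same_identity)
```

In this toy case each rater makes one ranking error on a different image pair, so averaging cancels both errors and the fused scores separate the two classes perfectly. That is only an illustration of why fusion can help when raters' errors are not identical; it says nothing about the size of the effect reported in the study.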