Daubert-Inspired Assessment of Language-Based Author Identification

Date Published: January 1998
Length: 60 pages
Both forensic linguists and traditional document examiners agree that no evidentiary conclusion can be based on a single attribute and that courts cannot rely entirely on language-based author identification techniques.
The primary danger of language-based author identification techniques is that justice may be subverted: certain ideas about language use may lead to false identifications or false eliminations. Empirical findings from a study of these techniques, using a set of four writers extracted from a writing sample database, indicate that the authorship of a document should not be decided on the basis of common conceptions of language, such as spelling errors and vocabulary words. The study further demonstrates that language experts who offer common-conceptions evidence should not testify about the authorship of a document, and that language experts should be used to counter other language experts who offer such evidence. Empirical results also show that two language-based author identification techniques, punctuation patterns and syntactic structures, are especially misleading. The author concludes that determining authorship from a fixed set of suspect documents is not the same as determining individuality in language, and that for language-based author identification to truly develop, an entire database representative of an appropriate sample of the general population would have to be analyzed and quantified. Legal options for introducing language-based author identification evidence are examined, as are distinctions between common and scientific conceptions of language, and a review of the language-based author identification literature is included. The validity of linguistic assumptions in evidence identification and analysis and the application of linguistics to forensic casework are also discussed. 82 references and 26 tables
