Assessment of Data Mining Methods for Forensic Case Data Analysis

NCJ Number
Journal of Criminal Justice and Security Volume: 8 Issue: 3,4 Dated: December 2006 Pages: 350-355
Anne-Laure Terrettaz-Zufferey; Frederic Ratle; Olivier Ribaux; Pierre Esseiva; Mikhail Kanevski
Date Published
December 2006
6 pages

This study applied selected data mining methods and techniques to forensic case data, specifically cocaine chemical profiles (with a focus on cutting agents), and reports those that provided relevant results from a criminal intelligence perspective.


The study's results showed that data on cocaine cutting agents are of particular interest because cutting agents are added toward the end of the drug distribution process. Their interpretation may therefore provide information on possible local illicit traffic networks. By focusing on the co-occurrence of a set of cutting agents, relevant patterns were detected. This approach, which uses graph theory, will be further tested on other crime data.

The recognition of criminal patterns through data mining technologies can help police adapt their strategies and resource allocation to existing crime patterns and criminal methods. Although "data mining" is a widely used term subject to many definitions, it can be understood as the extraction of previously unknown and potentially useful information or knowledge from large datasets. The main principle is to devise computer programs that scan databases and automatically identify patterns found in the data. The potential of data mining technologies depends on the nature of the available dataset. Factors that assist in evaluating the relevance of data mining techniques range from the activity that produced the dataset to the dataset's quality (degree of uncertainty, precision, and completeness).

The application of data mining in the current study focused on a dataset of cocaine seizures made in the canton of Geneva, Switzerland, over 1 year. The database created for this study contained the following variables: seizure location and time period, presence/absence of cutting agents, and combinations of cutting agents. The modeling process focused on the co-occurrence of cutting agents in the same sample and the persistence of an identical combination over time.

5 figures and 4 references
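The two modeling targets named in the abstract (co-occurrence of cutting agents within a sample, and persistence of an identical combination over time) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the agent labels "A", "B", "C", the period strings, the sample records, and the function names are hypothetical and are not taken from the study, and the weighted edge list is only one simple way to represent a co-occurrence graph, not necessarily the authors' method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical presence/absence records: (time period, set of agents detected).
# Labels and periods are placeholders, not data from the study.
samples = [
    ("2005-01", {"A", "B"}),
    ("2005-01", {"A", "B", "C"}),
    ("2005-02", {"A", "B"}),
    ("2005-02", {"C"}),
    ("2005-03", {"A", "B"}),
]


def cooccurrence_edges(records):
    """Count how often each pair of agents appears in the same sample.

    The result is a weighted edge list: nodes are agents, and an edge
    weight is the number of samples containing both agents.
    """
    edges = Counter()
    for _, agents in records:
        for pair in combinations(sorted(agents), 2):
            edges[pair] += 1
    return edges


def combination_periods(records):
    """Map each exact combination of agents to the sorted periods in which it occurred."""
    seen = {}
    for period, agents in records:
        seen.setdefault(frozenset(agents), set()).add(period)
    return {combo: sorted(periods) for combo, periods in seen.items()}


edges = cooccurrence_edges(samples)
periods = combination_periods(samples)

for pair, weight in sorted(edges.items()):
    print(pair, weight)
for combo, when in periods.items():
    print(sorted(combo), when)
```

In this toy data, a heavily weighted edge marks a pair of agents that is frequently found together, and a combination that recurs across several periods is the kind of persistent pattern the abstract describes as potentially informative about local distribution.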

Date Published: December 1, 2006