An official website of the United States government, Department of Justice.

When Missing Data Are Not Missing: A New Approach to Evaluating SHR Imputation Strategies

Description of original award (Fiscal Year 2005, $30,918)

The Supplemental Homicide Reports (SHR) are widely used in criminological research, informing a broad range of research topics and subsequent policy applications. A serious issue with the SHR is missing information about the offender and incident in many recorded homicides. Although it is convenient to discard cases with missing data prior to analysis, it is not theoretically justified and can lead to incorrect substantive conclusions. Recently, several techniques for imputing missing SHR data have been proposed, but it is difficult to evaluate their effectiveness. Our proposed research will advance criminologists' ability to address missing SHR data by presenting a new approach to testing and evaluating SHR imputation techniques. This advance will allow research and policy to be informed by a more accurate picture of homicide patterns.

For several cities, police records are available (archived in NACJD) for homicides reported in the SHR. Offender data that are missing in the SHR are often found in the police records. We will take advantage of the fact that not all missing data in the SHR are actually missing. First, we can study similarities and differences between cases with known offender characteristics in the SHR, those with such information missing in the SHR but available in police records, and those with such information missing in both sources. Second, we can evaluate the different imputation techniques suggested in the literature. We will apply each of the imputation techniques to the SHR, and, for cases with information missing in the SHR but known in the police records, see how well the imputed values correspond with the values known in the police records. This will test the different approaches under real conditions, and provide the best guidance to date on which techniques are most effective.
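The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual method: the field name `offender_sex`, the toy records, and the helper functions `mode_impute` and `agreement_rate` are all hypothetical, and simple mode imputation stands in for the imputation techniques proposed in the literature.

```python
from collections import Counter

def mode_impute(records, field):
    """Fill missing values (None) with the most common observed value.

    Stand-in for a real imputation technique; chosen only for brevity.
    """
    observed = [r[field] for r in records if r[field] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [r[field] if r[field] is not None else mode for r in records]

def agreement_rate(shr, police, imputed, field):
    """Share of cases missing in the SHR but known in police records
    where the imputed value matches the police-record value."""
    hits = total = 0
    for s, p, value in zip(shr, police, imputed):
        if s[field] is None and p[field] is not None:
            total += 1
            hits += (value == p[field])
    return hits / total if total else float("nan")

# Hypothetical toy data: each position is the same homicide in both sources.
shr = [{"offender_sex": v} for v in ["M", "M", "F", None, None, None]]
police = [{"offender_sex": v} for v in ["M", "M", "F", "M", "F", None]]

imputed = mode_impute(shr, "offender_sex")
rate = agreement_rate(shr, police, imputed, "offender_sex")
print(f"Agreement on recoverable cases: {rate:.2f}")
```

Cases missing in both sources (the last record here) are excluded from the rate, since there is no known value to compare against.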

To check that conclusions do not simply reflect idiosyncrasies of the available data sets, we will also apply the techniques to simulated data. Using a complete data set, we can simulate a missing data mechanism by specifying, for instance, how likely stranger homicides are to have missing offender information. The imputation techniques can be applied to the resulting incomplete data sets, with the imputation compared to the complete data. Exploring a variety of missing data mechanisms this way will help show which techniques work best under which conditions. In addition, we will attempt to use these investigations of real and simulated missing data to extend existing imputation methods and develop effective new methods.
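A simulated missing data mechanism of this kind might look like the following Python sketch. Everything here is an assumption made for illustration: the field names `relationship` and `offender_sex`, the missingness probabilities, the toy records, and the helpers `apply_missingness` and `recovery_accuracy` are hypothetical and do not come from the award description.

```python
import random

def apply_missingness(complete, field, group_field, p_missing, seed=0):
    """Return a copy of the data with `field` blanked out at a rate
    that depends on each row's value of `group_field`."""
    rng = random.Random(seed)
    incomplete = []
    for row in complete:
        masked = dict(row)  # copy so the complete data stay untouched
        if rng.random() < p_missing[row[group_field]]:
            masked[field] = None
        incomplete.append(masked)
    return incomplete

def recovery_accuracy(complete, incomplete, imputed, field):
    """Share of simulated-missing cases where the imputed value
    equals the true value in the complete data."""
    hits = total = 0
    for truth, masked, value in zip(complete, incomplete, imputed):
        if masked[field] is None:
            total += 1
            hits += (value == truth[field])
    return hits / total if total else float("nan")

# Hypothetical complete data set.
complete = [
    {"relationship": "stranger", "offender_sex": "M"},
    {"relationship": "stranger", "offender_sex": "M"},
    {"relationship": "acquaintance", "offender_sex": "F"},
    {"relationship": "acquaintance", "offender_sex": "M"},
]

# Assumed mechanism: stranger homicides lose offender data far more often.
p_missing = {"stranger": 0.6, "acquaintance": 0.1}
incomplete = apply_missingness(
    complete, "offender_sex", "relationship", p_missing, seed=1
)
```

Varying `p_missing` (and the seed) produces different incomplete data sets from the same complete data, so each imputation technique can be scored with `recovery_accuracy` under several mechanisms.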

Date Created: June 27, 2005