Predicting Criminal Recidivism Using Specialized Feature Engineering and XGBoost

7 pages

This is one of the “Small Team” submissions for the National Institute of Justice’s (NIJ’s) 2021 “Recidivism Forecasting Challenge.” The Challenge has two goals: 1) to encourage “non-criminal justice” forecasting researchers to compete against more “traditional” criminal justice forecasting researchers, building on the current knowledge base while infusing innovative new perspectives; and 2) to compare available forecasting methods in order to improve person-based and place-based recidivism forecasting.


The current report is a submission in the Small Team category of the Challenge. Its goal is to apply state-of-the-art machine learning techniques to recidivism forecasting. The team believed that, because it was new to criminal justice forecasting, it would be able to view the problem from a unique, interdisciplinary perspective. The aggregated dataset provided by NIJ for the Challenge contains approximately 26,000 individuals released from Georgia prisons on discretionary parole to post-incarceration supervision between January 1, 2013, and December 31, 2013. NIJ split the dataset into training and test sets in a 70/30 proportion. In Round 1 of the competition, the team compared the performance of different models using 10-fold cross-validation and found that, on average, XGBoost performed best for this dataset and purpose. A further significant performance gain came from structuring features based on the dataset. This feature structuring, along with the other features highlighted in this report, should be examined further to determine whether it can further improve the ability to predict recidivism.
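The model comparison described above can be illustrated with a minimal sketch of 10-fold cross-validation. Everything in it is an assumption made for illustration and is not taken from the report: the data are synthetic, the single `prior_arrests` field is hypothetical, the Brier score is assumed as the evaluation metric, and the two toy "models" (a base-rate predictor and a predictor built on a bucketed feature) merely stand in for the models the team actually compared, such as XGBoost.

```python
import random
from statistics import mean


def make_synthetic_records(n, seed=0):
    """Build hypothetical (prior_arrests, recidivated) pairs; not NIJ data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        prior_arrests = rng.randint(0, 9)
        # Synthetic rule: more prior arrests means a higher chance of label 1.
        probability = 0.15 + 0.07 * prior_arrests
        records.append((prior_arrests, 1 if rng.random() < probability else 0))
    return records


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def brier(predictions, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return mean((p - y) ** 2 for p, y in zip(predictions, labels))


def fit_base_rate(train):
    """Toy model 1: always predict the overall rate seen in training."""
    rate = mean(y for _, y in train)
    return lambda x: rate


def fit_binned_rate(train):
    """Toy model 2: predict the training rate within an engineered bucket."""
    overall = mean(y for _, y in train)
    buckets = {}
    for x, y in train:
        # Engineered feature: prior arrests grouped as 0-2, 3-5, 6 or more.
        buckets.setdefault(min(x // 3, 2), []).append(y)
    rates = {bucket: mean(ys) for bucket, ys in buckets.items()}
    return lambda x: rates.get(min(x // 3, 2), overall)


def cross_validate(records, fit, k=10):
    """Return the mean held-out Brier score of `fit` across k folds."""
    scores = []
    for held_out in k_fold_indices(len(records), k):
        held = set(held_out)
        train = [r for i, r in enumerate(records) if i not in held]
        test = [records[i] for i in held_out]
        model = fit(train)
        scores.append(brier([model(x) for x, _ in test], [y for _, y in test]))
    return mean(scores)


records = make_synthetic_records(2000)
base_score = cross_validate(records, fit_base_rate)
binned_score = cross_validate(records, fit_binned_rate)
print(f"base rate: {base_score:.4f}  binned feature: {binned_score:.4f}")
```

A lower Brier score is better. Each candidate is fitted ten times, each time scored only on the fold it did not see, and the averaged scores are compared. In practice the `fit_*` functions would be replaced by real learners trained on the Challenge's training set.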

Date Published: January 1, 2021