Team MattMarifelSora: NIJ Recidivism Forecasting Challenge Report

NCJ Number

305044

Author(s)

Matthew Motoki; Marifel Barbasa; Sorapong Khongnawan

Date Published

2021

Length

22 pages

Annotation

This is Team MattMarifelSora’s submission to the National Institute of Justice’s (NIJ’s) Recidivism Forecasting Challenge.

Abstract

The team applied data processing and machine learning techniques to predict how likely individuals were to recidivate. This included applying hierarchical Bayesian target encoding and training models that are known to perform well on binary and multiclass classification problems involving tabular data. Following standard practice in machine learning competitions, the team combined predictions from many models into an ensemble to boost its score. The team used gradient boosted decision trees via the XGBoost and LightGBM libraries and built a custom multilayer perceptron (MLP) with skip connections using the PyTorch library. In addition, the team used the dreamquark implementation of TabNet, a modern neural network architecture that uses attention mechanisms to focus selectively on input features. The team also tried NODE and SVM models; however, these performed comparatively worse and were not included in the team's pipeline. Efforts to reduce racial bias in predicting recidivism were complicated by bias in initial arrests, since arrest data persist in the data used to predict recidivism.
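As a rough illustration of two ideas named in the abstract, the sketch below shows a simplified smoothed target encoding and a weighted average of model predictions. It is not the team's code: the report describes a hierarchical Bayesian encoding, whereas this sketch uses a single-level shrinkage toward the global mean, and the function names, the `prior_weight` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def target_encode(categories, targets, prior_weight=10.0):
    """Map each category to a shrunken mean of a binary target.

    Simplified, single-level smoothing (an assumption, not the team's
    hierarchical method): rare categories are pulled toward the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # prior_weight acts like a number of pseudo-observations at the global mean.
    return {
        category: (sums[category] + prior_weight * global_mean)
        / (counts[category] + prior_weight)
        for category in counts
    }


def average_ensemble(prediction_lists, weights=None):
    """Combine per-model probability lists into one weighted average."""
    if weights is None:
        weights = [1.0] * len(prediction_lists)
    total = sum(weights)
    n_rows = len(prediction_lists[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, prediction_lists)) / total
        for i in range(n_rows)
    ]


# Toy data (hypothetical): category "a" has three rows, "b" has one.
encoding = target_encode(["a", "a", "a", "b"], [1, 1, 0, 0], prior_weight=2.0)
blended = average_ensemble([[0.2, 0.8], [0.4, 0.6]])
```

In practice the encoding would be fit on training folds only, so that a row's own label does not leak into its encoded feature, and ensemble weights would be chosen on held-out data.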

Date Published: January 1, 2021
