This is an archive page that is no longer being updated. It may contain outdated information and links may no longer function as originally intended.
Challenge has closed
Thank you to everyone who submitted an entry. Winners will be notified by August 16, 2021, and posted online.
Winners are to submit a paper outlining the variables that were tested, indicating which were statistically significant and which were not, by September 17, 2021.
DARYL FOX: Good afternoon, everyone. Welcome to today's webinar. NIJ's Recidivism Forecasting Challenge, hosted by the National Institute of Justice. At this time, I'll go ahead and pass it over to our presenters for today. Angela Moore will be starting things off with the introduction and some background information. Angela?
ANGELA MOORE: Thank you, Daryl. My name is Angela Moore and I serve as a Senior Science Advisor at the National Institute of Justice. And again, I would like to welcome you all to this webinar on the Recidivism Forecasting Challenge. Thank you for taking the time out of your schedule to join us today. We hope that if you are interested and eligible, you will apply to do this important work with us. Today, I am joined by my colleagues Dr. Joel Hunt, a Senior Computer Scientist; Dr. Marie Garcia, a Senior Social Science Analyst; Mr. Eric Martin, a Social Science Analyst; as well as Mr. Michael Applegarth, one of our research assistants. Today, we will provide you with information about the Challenge, and hopefully we will be able to answer any questions that you have regarding it at the end of the presentation.
But, before we dive into the specifics of the Challenge, let me first tell you a little bit about NIJ. The National Institute of Justice is a research, development, and evaluation agency of the U.S. Department of Justice. We are dedicated to improving knowledge and understanding of crime and justice issues through science. In short, our goal is to strengthen science in order to advance justice. We invest in scientific research across disciplines to serve the needs of the criminal justice community. Our research regarding data science has focused on analysis and visualization of crime data, as well as facilitating the implementation of computer-based crime mapping and analysis in American policing. Our history as it relates to data science research is as follows. We gave our first grant award in 1986 to The University of Illinois at Chicago to explore crime mapping in the context of community policing. In 1990, we initiated the Drug Market Analysis Program. Then, in 1997, we funded the development of CrimeStat, as well as established the Crime Mapping Research Center, later renamed the Mapping and Analysis for Public Safety Program. In 2009, we began exploring the potential of crime prediction and forecasting. Then, in 2016, we issued the Real-Time Crime Forecasting Challenge. And today, here we are with the Recidivism Forecasting Challenge.
ERIC MARTIN: So I'm going to discuss the goals of the Challenge. Our main goal is to enhance recidivism forecasting using person- and place-based factors. Some of you are probably aware that our risk assessments have gone through a number of iterations over the years. We started out just using tacit, professional clinical judgment, then went to actuarial forecasts using administrative data and static factors. And then the next generation started to include dynamic factors and captured more real-time or current status of the individual on probation or parole, such as whether they completed a specific reentry program. Now, we really want to push the envelope of forecasting further and see what data is out there that can be incorporated into the forecast. We want to understand how supplemental data can be integrated with the official records to really just enhance the precision and increase the accuracy. And then the second goal is to really increase the accuracy of risk predictions for all individuals under supervision, not just the individuals we have the most data on, but everyone coming into Department of Corrections custody, so that they can have the most accurate risk and needs assessments available.
So why do we need the Challenge? One: practitioners, especially in community corrections, are facing increased caseloads. We need to understand who is most likely to recidivate. We have broad risk categories that come with every risk assessment employed, but we're thinking that it may be possible to go even further, so that practitioners can triage within these risk categories to really provide tailored services for those who need them most, those who really need to be addressed or are at high risk of recidivism. And also, we know based on previous research that when there is an incongruence between the predicted risk and the actual risk, it actually may harm those on probation and parole, as far as their likelihood of recidivating. For those who are low risk and are put in a high-intensity supervision program, research has shown it may actually increase recidivism. So, it's a very high-stakes situation that corrections departments face, and we need to have the best models possible to really assist them. And then as I said earlier, we need to meet the needs of all supervisees. Research has shown that conventional risk assessment tools tend to predict risk better for White men. And so, as a group, we structured the prizes in the Challenge to encourage robust predictions that are accurate across all racial groups. And we understand that there are unique needs and risks for females on probation and parole. So, we really want to help community supervision practitioners have accurate risk assessments for the female population. We want to be able to meet their unique criminogenic needs. So again, the Challenge encourages submissions of different models for males and females on parole.
MICHAEL APPLEGARTH: Before going over the Challenge data set, I just wanted to make a special thank you to Dr. Tammy Meredith who played a major role in helping us prepare the data for the Challenge. In the Challenge, the data is coming from the Georgia Department of Community Supervision and the Georgia Crime Information Center. The data contains over 25,000 individuals who were released from prison to parole in the State of Georgia and were under supervision starting January 1st, 2013 through December 31st, 2015. The data that is released, it's going to be released in three waves: Friday was the initial data release; a month from then, a year two update will occur; and then two weeks following that, a year three update will occur. Seventy percent of the data set was released as training data meaning you're able to see all of the independent variables, as well as the dependent variables. So, you're able to see if the individuals recidivated during the year one, year two, or year three. The remaining data was released as test data. And this is the data you're going to use for the forecast and submit for judging. And some information regarding the data, the initial released data does not contain the supervision activity information. We did this intentionally as when individuals are first released onto supervision, the community agencies aren't going to have this type of information until that individual has been under supervision for a particular time period.
So, during the year two update two things are going to occur. Those who recidivated during the first year are going to be removed from the data set, and the supervision activities are going to be included. Then, at the year three update, those who recidivated during year two are going to be removed, and you'll have the updated data set to work with for that final time period. To give you an idea of what's going to be included in the variables, I think you can see the key domains here. You can also view the Challenge codebook on the Challenge website. You're going to have access to information such as age and gender, how long they were incarcerated for, prior arrests and criminal history, as well as prior community supervision history. And then you'll see conditions of supervision as well as supervision activities such as employment, treatment, and drug testing. Recidivism is defined in this Challenge as an arrest for a new crime while they were under supervision. We are encouraging the utilization of additional data. As Eric described, one of the goals is to draw on as much information as possible to improve the accuracy of the forecasts.
There are several lists of potential supplemental data on the Challenge website, and I wanted to highlight a couple here. One resource that is just good to know about in general is the National Archive of Criminal Justice Data, which allows you to see various data sets related to criminal justice populations. Another important one that we wanted to ensure that we could incorporate for the Challenge was the American Community Survey. So, we've incorporated PUMAs, which are a geographic unit that the census uses during its collection of the survey, into the data set. You should note that, to help reduce the risk of deductive disclosure, some of the PUMA categories were combined together, especially in the more rural areas of Georgia. That has been documented on the Challenge website as well, so you can see which PUMA units were combined together.
To enter into the Challenge you must submit your entry into the appropriate category, and we have three separate categories. The student category is for high school students or full-time undergraduate students. The small teams and businesses category comprises one to ten individuals, and so these are going to be graduate-level students, professors, or any other individuals that wouldn't fall under the student category, and they are free to compete individually or to form small teams. Additionally, small businesses are going to be in this category, meaning businesses made up of fewer than 11 employees. And then, finally, the last category is large businesses with 11 or more employees. It should be noted that students, if desired, can submit their forecast into the small teams category or the large business category, and small teams, if desired, can submit their forecast into the large business category. However, large businesses are not allowed to submit their forecast in the small teams or student category, and small teams are not allowed to submit their forecast into the student category. When submitting you must also include a team roster. On that team roster there are going to be three pieces of information: each individual's first name, last name, and then the percentage of winnings if that forecast is found to receive an award. You may also lose or gain members throughout the Challenge, which is the reason the team roster needs to be submitted; you may gain a member from period one to two or may lose one. The only requirement is that an individual may be named on only a single roster for each period. So you're not able to be on multiple teams or have multiple entries.
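The category rules described above amount to a simple ordering: an entrant may compete in its own category or move up to a larger one, but never down, and it picks only one category. The following is a minimal Python sketch of that rule. The category labels and the function name `allowed_categories` are hypothetical, chosen for illustration; they are not part of any NIJ tooling.

```python
# Categories ordered from smallest to largest (labels are hypothetical).
CATEGORIES = ["student", "small_team", "large_business"]


def allowed_categories(own_category):
    """Return the categories an entrant may choose from.

    An entrant may enter its own category or any larger one,
    but must pick exactly one of them for its submission.
    """
    if own_category not in CATEGORIES:
        raise ValueError(f"unknown category: {own_category!r}")
    return CATEGORIES[CATEGORIES.index(own_category):]


def may_enter(own_category, chosen_category):
    """True if an entrant of own_category may submit under chosen_category."""
    return chosen_category in allowed_categories(own_category)
```

For example, `may_enter("student", "large_business")` is true under this sketch, while `may_enter("large_business", "small_team")` is false.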
When you're submitting your forecasts, they cannot be in a zipped folder. The forecast file must follow this naming convention: your team name and then the year of the forecast. Within the forecast file, you're only going to have two fields. The first is the ID field, which is the original ID field for all the individuals within the data set, and the second is the probability score of each individual recidivating during that Challenge period. You're only going to have those two fields, and it's important to label them ID and probability, as they are going to be read in via a program, and if those fields are not labeled that way then your submission is not going to be read in. And if you are running separate forecasts for male and female individuals in the data, make sure to combine them all into one file. When submitting the team roster you should follow a similar naming convention: your team name, then the year of the forecast, and then roster at the end. It's also important to know that you're not required to submit a forecast for each Challenge period, so you may choose to submit just for the first, or all three, or one of the latter ones as well. Entry is free; there's no registration or participation fee, and for further instructions on how to register or submit you can go to the Challenge webpage. On eligibility: all U.S. residents, including residents of U.S. territories, are eligible to participate. You must be 13 years or older at time of entry to participate. Those who are under the age of 18 are required to have consent from a parent or legal guardian, and for companies to participate they must have a U.S. business license. Employees of NIJ and individuals or entities listed on the federal excluded parties list are ineligible to participate. Employees of other federal agencies should consult with their Ethics Officer concerning their eligibility.
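The two-field layout described above can be sketched in Python as follows. This is an illustration only: the CSV format, the exact header capitalization (`ID`, `Probability`), and the function name `build_submission` are assumptions, since the transcript states only that the fields must be labeled ID and probability and that male and female forecasts go in one file.

```python
import csv
import io


def build_submission(male_forecasts, female_forecasts):
    """Combine separate male and female forecasts into one two-field table.

    Each argument maps an individual's ID to a forecast probability.
    Returns CSV text with exactly two fields (header spelling is assumed).
    """
    overlap = set(male_forecasts) & set(female_forecasts)
    if overlap:
        raise ValueError(f"IDs appear in both forecasts: {sorted(overlap)}")

    combined = {**male_forecasts, **female_forecasts}
    for person_id, probability in combined.items():
        if not 0.0 <= probability <= 1.0:
            raise ValueError(f"probability out of range for ID {person_id}")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ID", "Probability"])  # only these two fields
    for person_id in sorted(combined):
        writer.writerow([person_id, combined[person_id]])
    return buffer.getvalue()
```

The returned text would then be saved under the team-name-and-year naming convention given on the Challenge webpage; the sketch deliberately does not guess at that file name.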
JOEL HUNT: So before I jump directly into the Judging Criteria, I wanted to start out with a little bit of discussion, saying that we are not trying to endorse these metrics as necessarily the correct metrics. However, with any Challenge you need to have a transparent set of metrics so that it's clear to all of the contestants how they will be measured and how they will be scored. And with the way we're running this Challenge we felt as though the following judging criteria would be the most appropriate given the circumstances of how open this Challenge is. So, for the majority of the prizes, we'll use the Brier Score to measure the accuracy of someone's forecast. The Brier Score is essentially the mean squared error. Right? It's one over N, where in this case N is the number of individuals in the 30 percent test data set, times the sum of what you forecasted minus what the actual is, squared--the actual being zero or one, yes or no, did they recidivate. Since this is a measure of error, applicants should look to minimize their metric, right? The lowest score wins: whoever has the least amount of error will win, and that is what will determine the winners.
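As described above, the Brier Score is the mean of the squared differences between each forecast probability and the 0/1 outcome. A minimal Python sketch, assuming the forecasts and outcomes are given as equal-length lists; the function name `brier_score` is illustrative, and NIJ's actual scoring program is not shown in this webinar.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    forecasts: probabilities in [0, 1], one per individual.
    outcomes:  1 if the individual recidivated in the period, else 0.
    Lower is better; 0.0 is a perfect forecast.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

Forecasting 0.5 for everyone gives a score of 0.25 regardless of the outcomes, which is a useful baseline to beat.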
The second criterion we're going to use is a confusion matrix that we create based on the predicted probabilities that you provide. We will use a 0.5 cut point. So any forecast that is 0.5 or higher will be considered you forecasting that individual to recidivate; anything lower we will take as you saying they will not recidivate. So then we will create this confusion matrix and we will compare the false positive rates between the subset of the test data that are Black and the subset of the test data that are White. And we'll look at the absolute value of the difference between those rates and we'll take one minus that as a fairness penalty. If you go to the next slide please. So when we look at the Fair and Accurate, here we're trying to look at high scores. So instead of looking at the error, as we said, one minus the error is essentially how correct you are. So one minus your Brier Score times this fairness penalty is how we will score the Fair and Accurate for the set of six prizes for fairness and accuracy, which try to reduce racial bias. We could've looked at false negatives instead of false positives; it would've been just as correct. However, as a team we decided that false positives are probably more of what we're interested in trying to reduce, so we want to put more emphasis on that. And one of the things that we push for is that we're going to ask the winning submissions what alternative measures we should have considered using in this competition, things like that, because, again, we're not trying to say these are the correct ones but these are what we're going to use for this.
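The Fair and Accurate calculation described above can be sketched in Python as follows. Several details are assumptions made for illustration: the group labels `"Black"` and `"White"`, the function names, the choice to compute the Brier Score over all supplied individuals, and the handling of a group with no actual non-recidivists (its false positive rate is treated as 0.0 here).

```python
def false_positive_rate(forecasts, outcomes, cut_point=0.5):
    """Share of actual non-recidivists forecast at or above the cut point."""
    false_pos = true_neg = 0
    for forecast, outcome in zip(forecasts, outcomes):
        if outcome == 0:
            if forecast >= cut_point:
                false_pos += 1
            else:
                true_neg += 1
    negatives = false_pos + true_neg
    # Assumption: a group with no actual negatives gets a rate of 0.0.
    return false_pos / negatives if negatives else 0.0


def fairness_penalty(forecasts, outcomes, groups):
    """One minus the absolute difference in false positive rates by race."""
    rates = {}
    for label in ("Black", "White"):  # hypothetical group encodings
        rows = [(f, o) for f, o, g in zip(forecasts, outcomes, groups) if g == label]
        rates[label] = false_positive_rate([f for f, _ in rows], [o for _, o in rows])
    return 1.0 - abs(rates["Black"] - rates["White"])


def fair_and_accurate(forecasts, outcomes, groups):
    """(1 - Brier Score) times the fairness penalty; higher is better."""
    n = len(forecasts)
    brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n
    return (1.0 - brier) * fairness_penalty(forecasts, outcomes, groups)
```

Because the penalty depends only on which side of 0.5 each forecast falls, two submissions with the same Brier Score can receive different Fair and Accurate scores.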
So there are 114 prizes that will be awarded. Each contestant or team may win up to 15 prizes. That is, you could win six different prizes for how fair and accurate the forecast is with respect to race, and you can win nine prizes for overall accuracy: for males, for females, and then a combined score. So here's the overall prize structure. As you can see, the number of prizes grows as the categories go up. In the student category there are only prizes for the top three places. The small team, the top four, and I believe the large team is top five. And then the Fair and Accurate prize category is overall across the student, small, and large categories; it's whoever does the best overall for those prizes. So that's how we came up with the potential of 15. So in the student category, on just the accuracy, if you took first across the board you could win $45,000, plus potentially money in the Fair and Accurate. On the small team, you could potentially win $90,000. And the large, I think, is $225,000. Yeah. Plus then there's an additional $75,000 for racial bias.
MARIE GARCIA: Okay. Thank you, Joel. After the prizes are awarded, NIJ will make available all of the winners' papers on our website. The paper requirement is a comprehensive document describing the lessons learned, the variables that were found to be significant predictors of recidivism in the forecast, and the types of models that were found to outperform other models. Contestants are encouraged to provide measures that could be used in the future, and any additional intellectual property such as specific techniques, weighting, and other decisions, and if possible to upload their code to an open source platform. So there are several key dates to keep in mind. As you all know, NIJ released the training data on April 30th. This May 31st, at the end of the month, will be the end of the submission period for period one. June 1, we will release the updated test data, specifically year two data. June 15th, two weeks later, is the end of submission period two. The following day, on the 16th, is the release of the final test data, which is year three. And two weeks later will be the end of submission period three. Now, we plan to make our decisions over the summertime. So, on or before August 16th, NIJ will announce the Challenge winners. And before the 17th of September, all winners must submit their Challenge documentation, their paper, to NIJ so that we can make it public. Again, for substantive questions about the Challenge, you can e-mail NIJ at the following e-mail address, [email protected]. Remember, this is for substantive questions about what NIJ is looking for in this Challenge. Now, if you have technical questions, you should e-mail the OJP IT Service Desk at the following e-mail address. And again, these e-mail addresses are provided in the Challenge document. So now, we will move on to the question and answer session.
And just as a quick reminder, if we do not get to your questions today, you can absolutely send them to the Challenge e-mail address that was provided in the previous slide. So we will go ahead and get started with our questions and answers. And the first question for the team is "Can you provide a sample submission file? It was unclear if NIJ would like gender and race to be included in the submission file. Would you like it to be ID and probability, or ID, gender, race, and probability?"
JOEL HUNT: So, I'll quickly jump in on that. When we initially wrote the Challenge, we were trying to work with our chief information officer's team to provide a platform where you could submit two separate files, one for male and one for female. Unfortunately, they were unable to get that to work in time for the release. So, we ask that you only submit one single file with males and females all combined, and then we'll know by the ID which ones they are. So the file should only have ID and probability in it.
MARIE GARCIA: Okay. Thank you, Joel. And that question came in a couple of times. So hopefully, that will clarify for the others that also asked that question. The second question I have for the team is "Will NIJ be providing examples of what we mean by supplemental data?"
JOEL HUNT: So, on the Challenge document itself, and Michael referred to some of it earlier, you can use whatever additional data set you would like. If you want to combine census data based on the combined PUMA location, if you want to start trying to figure out different economic measures, any additional data that you can combine to that data set, you are welcome to do so. And a decent list is already on the Challenge website.
MARIE GARCIA: Okay. Thank you, Joel. The next question was answered. The question was, "Are data limited to the Georgia data set?" And Joel just explained that we will welcome any and all supplemental data that are appropriate, so thank you for that. And the next question is "Does this opportunity apply to a non-profit that supports first responders' mental health?"
JOEL HUNT: Our eligibility criteria are based purely on citizenship status or the location of the non-profit or organization or whatever type of entity it is, having a location in the United States to submit from. What kind of work these entities or individuals do does not make any kind of difference in terms of determination.
MARIE GARCIA: Okay. So there's a clarification question asked, "Specifically the data set was released or it will be released? Can NIJ clarify if the data for period one have been released?"
MARIE GARCIA: Great. There's a question about the third wave. "Specifically, is the outcome to recidivism in year three or is it recidivism within three years as data includes both variables?"
JOEL HUNT: So the training data set has a variable for whether they recidivated during the three years, plus variables for the three individual years: did they recidivate in year one, in year two, or in year three? In the test data, what we're asking for in the first submission period is not a yes or no, but the probability that they recidivated during year one. After the year one period is done, we will remove those individuals who did recidivate from the test data set and re-release the test data set with just those who did not. And we ask again, did they recidivate during year two? It's also at this year two point that, I believe, sixteen additional variables describing the supervision activities become available. And the reason those variables don't come until year two is that we wanted it to be a little bit more realistic to what Community Correction Officers would experience: they won't have those supervision activities when someone is very first released. So, that's why those variables don't come until test set two.
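The conditional structure described above can be mimicked on the training data, where the outcomes are visible: a year two model would be fit only on individuals who did not recidivate in year one. A minimal Python sketch follows; the field names `ID` and `recidivated_year1` and the function name are hypothetical stand-ins, so consult the Challenge codebook for the real variable names.

```python
def year_two_training_rows(rows):
    """Keep only individuals who had not recidivated by the end of year one.

    rows: list of dicts, one per individual, each with a truthy/falsy
    'recidivated_year1' value (hypothetical field name).
    """
    return [row for row in rows if not row["recidivated_year1"]]


# Toy records, invented purely to show the filtering step.
training = [
    {"ID": 101, "recidivated_year1": True},
    {"ID": 102, "recidivated_year1": False},
    {"ID": 103, "recidivated_year1": False},
]
year_two_rows = year_two_training_rows(training)
```

The same idea would apply again for year three, dropping those who recidivated in year two.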
MARIE GARCIA: Great. Thank you, Joel. One of our participants asked about social and emotional issues. Are these relevant variables for the model?
JOEL HUNT: We're not here to say what is or isn't relevant. We tried to provide as much information as we were able to get on these individuals. We think a lot more variables are probably highly important; however, we don't know which ones, and that's somewhat the purpose of this Challenge. So, what we provided is what we could provide. If you're able to find additional data to supplement your algorithm, you are welcome to use whatever additional data you can.
MARIE GARCIA: Great. Thank you. Okay. "Can NIJ please explain why arrest is used as the metric for this Challenge?"
JOEL HUNT: Well, I will start this answer and I'm sure someone else on the team will jump in, because corrections is not really my specialty. I'm much more on the policing side. I would just like to clarify that it is arrest for a new charge. These are not arrests due to violations, which also occur, but those are not what we use as our metric. Marie, Eric, or Angela may like to jump in to discuss that a little bit more.
ERIC MARTIN: Thanks, Joel. This is Eric. And yes, we acknowledge there are many ways to measure recidivism and there's currently a debate in the field on how to measure it and, you know, there's valid points on multiple sides. We, as Joel said, chose arrest for a new crime because we felt that's what many practitioners are trying to prevent. They want to, you know, help their supervisee have a successful reentry. And that was also a metric that we felt the data was reliable and accessible. So we made that decision. I hope that answers your questions.
MARIE GARCIA: Great. Thank you, Eric. Okay. So our next question is, "Will the Challenge dataset include any location information tied to the parolee, including their place of residence?"
JOEL HUNT: So when we very first received this dataset, we had the initial address that they proposed to be released to. And because of data privacy issues, we immediately knew that was something that could not be released. However, we wanted to allow some type of geographic variable to be present so that people could start to bring in other types of environmental or economic variables or opportunity variables, things like that, to their model if they would like. What we finally ended up doing was taking the individual's address and mapping it against PUMAs, which are a census unit used to aggregate at least 100,000 individuals together. So they're very similar to census blocks or census tracts, but they're larger, with at least a hundred thousand individuals. And even when we looked at that level of aggregation for the determination of place, we still felt that it could potentially lead to a risk of disclosure of who individuals were. So we went a step further, and we combined these PUMA units together. And that's what we finally used: a combined PUMA. And the information on which PUMAs are combined is an appendix to the Challenge, so that if someone is identified as living in unit three for their combined PUMA, people can go and look to see which PUMAs were combined to form unit three.
MARIE GARCIA: Great. Thank you. And this question was, I believe, answered, but I think it might be worth discussing quickly again. "Are parole violations used as a metric for recidivism?"
ERIC MARTIN: It took me a while to unmute myself. Good question. Again, we made the decision that recidivism will be measured by arrests for a new crime. That really made it clear and transparent, because we know there may be a number of ways people measure technical violations and, you know, what constitutes or triggers the technical violation. So we thought it was very clear and transparent just to use arrests for a new crime.
MARIE GARCIA: Great. Thank you, Eric. We have a question about eligibility. "Are medical students eligible for this challenge?"
JOEL HUNT: They are not eligible under the student category, though they can come in under the small team, small business, or if they would like the large team, large business.
MARIE GARCIA: Okay. Thank you. And we have another question about eligibility. "Can a small team submit entries for both the small team and large business categories? So two entries, one for each category?"
JOEL HUNT: No, they must select one. It is their choice which one. Either small or large, since they're a small organization. They can move up. But they cannot submit to both. It's one or the other.
MARIE GARCIA: Great. Okay. "After the competition, will successful awardees be able to use the analysis that they've provided to NIJ in a peer-reviewed manuscript?"
MARIE GARCIA: Great. There was another question about some clarity with what was requested in the submission file. "Just to clarify, do the applicants need to include male and female in the submission? Or just ID and probability?"
JOEL HUNT: So the only thing that--the only two variables that will be in your submission is ID and probability. And you should have every male and female in one single file like that.
MARIE GARCIA: Okay. So we have a question…
JOEL HUNT: Okay. And we will--sorry. Look, and we will update the--we will update the website to clarify that because like I said, it was a technical issue that prevented us from allowing the two separate submissions for them, which is why we're now saying to combine them into a single one with just the two fields.
MARIE GARCIA: Okay. And we have another question about eligibility. Specifically "Is citizenship required? For example, can people with permanent residency, a green card, or students with a student visa also participate in this challenge?"
JOEL HUNT: This is one where I'm going to have to point you to the actual eligibility requirement. This is somewhat lawyer speak, so I would look at it. If you still have additional questions after looking at the actual document, please email the NIJ Recidivism Challenge email address, and I will then direct it to the lawyers who are involved and let them make a final determination.
MARIE GARCIA: Okay. And next question. "Are the prediction results of an individual, whether or not they recidivate, used to determine whether or not the individual is granted community service or parole?"
JOEL HUNT: So these data are historic data. These are all individuals who were initially released under supervision between January 1st, 2013 and December 31st, 2015. And we followed them for three years. So these are individuals who, at the very latest, we finished following on December 31st of 2018. Whether or not the state continued any type of supervision or things like that is unknown and not part of it. So everything we're looking at is historic data and will have no impact on those individuals. Not to mention we used a random ID, so these forecasts cannot be tied back to these individuals after the fact.
MARIE GARCIA: Okay. The next question comes from a participant who has downloaded the initial release of both the training and the testing dataset and found that there were 16 variables under supervision activities missing in the test data. Will these variables be provided?
JOEL HUNT: Those are the ones that do not become available until test dataset two, as I mentioned earlier. Those are things like how often they're drug tested, what percentage of those tests are failures, and their employment status; things like that. And that's where I was talking about how it mirrors a little bit more of what community correctional officers go through, in that, when they are originally assigned someone, they're not going to know what percentage of times individuals are going to fail drug tests or things like that. So that's why we held off providing those data until test data two. You get them immediately in the training data, but realize you won't in the test until the second phase.
MARIE GARCIA: Okay. We have a question about the confidence scores. Specifically how the judgment--I'm sorry. The question is: "Why are confidence scores not being required here?"
JOEL HUNT: Yeah, it was something we talked about initially, and it was something I was somewhat pushing for originally. And it was just determined that we did not want to force it and make it a requirement of everyone in order to compete that they provided it. Yes, it would be nice information and, yes, we highly encourage people if they would like to supply that to us, they can. I'm not exactly sure how at the time of submission. But it's definitely something for the winners when they're writing their paper that would be a topic we would want them to discuss. It was just determined that overall, we did not want to make that an entry barrier and entry requirements for individuals to compute those scores.
MARIE GARCIA: Okay. "How are Hispanics and Latinx individuals categorized in the dataset?"
JOEL HUNT: So initially our dataset had everyone who was under supervision in the state, and we then decided, because of the small counts, to limit our dataset to only those individuals whom the state identified as Black or White. How the state made that determination is unclear, including whether or not Hispanic or Latinx or any kind of ethnicity or things like that played a role. We purely relied on the race category that the state provided.
MARIE GARCIA: Okay. Thank you. Another question about submission here is, "Can an applicant submit per year one, two, and three during submission period three, or can you only submit per year one in submission one, year two for submission two, et cetera?"
JOEL HUNT: You can only submit for the period that's open, so you can only submit year one during the year one phase. You can only submit two during the year two phase, and three during the year three phase. Otherwise, once we release the second dataset, it would become more than abundantly clear who did and did not recidivate because we removed those who did recidivate from the dataset. So you can only do it during the appropriate submission period.
MARIE GARCIA: And thank you, Joel. Another question about submission. "Does a team have to compete in all three periods, or can they compete in only one or two periods?"
JOEL HUNT: It is up to you how many you do or don't compete in. If you'd like to do just year one, do just year one. If you want to do one and two, or one and three, or two and three, or all of them, entirely up to you.
MARIE GARCIA: Okay. The next question, "Is there a way to view the target dataset variables and structure without seeing the actual data?"
JOEL HUNT: There's a codebook that's attached to the challenge that should give you the vast majority of that information before you get to the data.
MARIE GARCIA: Okay. Okay. So another question about the data and the supplemental datasets applicants can use. "Can any ID in the challenge dataset be linked to the term record in the National Corrections Reporting Program dataset?"
JOEL HUNT: The short answer is no. All ID variables that we initially used have been completely removed and then random IDs were generated for these individuals.
MARIE GARCIA: Okay. Okay. In terms of submission, we have a question about groupings here. "Can a group of graduate students be considered as a small team between one and ten individuals?"
JOEL HUNT: Yeah. You can--so a group of grad students can come in under small or large, their choice.
MARIE GARCIA: Okay. Great. The question about the papers. "Will the papers required of the winners be published anywhere?" And I believe the answer to that question is they will all be made publicly available on the NIJ website. So yes, they will all be made available at a later date. Okay. So next question. "Should applicants include predicted recidivism class membership as well as predicted probability?"
JOEL HUNT: They are only to provide us the ID and the predicted probability for that individual for that time period.
MARIE GARCIA: Okay. "Can NIJ please clarify what the racial bias category means? It was not clear in the document or the webpage."
JOEL HUNT: So the racial bias category, those six prizes are across all entries. It's not broken out by student, small, or large. And what it is, is we're going to take one minus your Brier Score, which should be essentially how correct your forecast is. And we're going to multiply it by a fairness penalty. And that penalty is one minus the absolute difference in your false positive rate between Blacks and Whites. And that's done by us using a .5 cutoff point to say, yes or no, would you have forecasted that individual to recidivate.
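The metric described above can be written as (1 - Brier score) x (1 - |FPR_Black - FPR_White|). The following is a minimal sketch of that calculation, not NIJ's scoring code. The function names, the race labels, and the toy data are assumptions for illustration, and whether a probability of exactly 0.5 counts as a "yes" forecast is also an assumption (the sketch treats it as yes).

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def false_positive_rate(probs, outcomes, threshold=0.5):
    """Share of people who did not recidivate but were forecast to (p >= threshold)."""
    negatives = [p for p, y in zip(probs, outcomes) if y == 0]
    if not negatives:
        return 0.0  # assumption: with no actual negatives, report a rate of zero
    return sum(1 for p in negatives if p >= threshold) / len(negatives)


def fair_and_accurate(probs, outcomes, races, threshold=0.5):
    """(1 - Brier score) multiplied by (1 - |FPR_Black - FPR_White|)."""
    def fpr_for(group):
        idx = [i for i, r in enumerate(races) if r == group]
        return false_positive_rate(
            [probs[i] for i in idx], [outcomes[i] for i in idx], threshold
        )

    penalty = 1 - abs(fpr_for("BLACK") - fpr_for("WHITE"))
    return (1 - brier_score(probs, outcomes)) * penalty


# Toy data, invented for illustration only.
probs = [0.8, 0.2, 0.6, 0.1]
outcomes = [1, 0, 0, 0]
races = ["BLACK", "BLACK", "WHITE", "WHITE"]
score = fair_and_accurate(probs, outcomes, races)
```

In this toy example the Brier score is 0.1125 and the false positive rates are 0 and 0.5, so the fairness penalty halves the accuracy term.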
MARIE GARCIA: Okay. Thank you, Joel. A couple--a two-part question here, multiple parts. "Are there limitations on the number of submissions that someone can submit per day? If so, what are they and how quickly will test…"
JOEL HUNT: Okay. You are only allowed one submission for that time period. It's not per day or anything like that. You can only submit one entry per time period: one for forecasting recidivism during year one, one for forecasting recidivism during year two, and then one for forecasting year three. And that--if you're an individual or you're a part of a team, you can't be part of three or four teams and try to do it that way. That's why we also ask for team rosters. You can only be part of one submission for each of these periods.
MARIE GARCIA: Great. Thank you for clarifying. We have someone who wanted to know if the webinar would be available later and yes, it will be available on the NIJ website for you to review at your leisure during the challenge timeframe. So thank you for that.
JOEL HUNT: I believe it's just the transcript is.
MARIE GARCIA: Just the--how about the slides? Yes, the slides will…
JOEL HUNT: So, yeah, the slides and transcripts will be posted in approximately five days on our website.
MARIE GARCIA: Okay. Great. And next question, this was sort of discussed but I think it's worth likely repeating. "What did NIJ do to maintain data privacy? For example anonymizing the data, differential privacy, et cetera."
JOEL HUNT: Yeah. So this was a really important topic for us and we spent probably six months on this process, and there was a team of five of us who were involved, and it included going before an IRB. We obviously started with highly identifiable information and the very first thing we did was we started to explore where some of our weak points were, where were some of our smallest categories? That's what led us to remove individuals who are not Black or White; there were just far too few of those racial counts in our dataset. We started looking at our age ranges, and we started then doing age ranges and things like that. We did a lot of analysis and we started looking at how many different unique combinations of individuals existed as you started to have more and more variables, basically running a disclosure risk analysis. We got to the point internally where we felt as though we had finally done enough to minimize the risk of disclosure. And at that point, we actually brought on the National Archive of Criminal Justice Data to conduct an independent disclosure risk analysis of all the data transformations we had done. And between our determination and their work, they agreed that we had done enough, and we went before the Office of Justice Programs IRB and presented them all the information on what was done and what our findings were, and the IRB, at that point, agreed that we had met the threshold in order to publicly release this data, that the risk of disclosure had been minimized to such a point that the benefits of this work greatly outweighed the minimized risk.
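One common way to look at "how many different unique combinations of individuals existed" is to count how many records share each combination of potentially identifying variables and flag the small cells. The sketch below illustrates that general idea only; it is not the procedure NIJ or the independent reviewers used. The field names, the records, and the minimum cell size `k` are all hypothetical.

```python
from collections import Counter


def small_cells(records, quasi_identifiers, k=5):
    """Return each combination of quasi-identifier values shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Invented records, for illustration only.
records = [
    {"age_group": "23-27", "gender": "M", "area": 1},
    {"age_group": "23-27", "gender": "M", "area": 1},
    {"age_group": "23-27", "gender": "M", "area": 1},
    {"age_group": "48+", "gender": "F", "area": 7},
]
risky = small_cells(records, ["age_group", "gender", "area"], k=3)
```

Coarsening a variable (wider age ranges, larger geographic areas) and rerunning the count shows whether the small cells disappear, which is the kind of trade-off described above.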
MARIE GARCIA: Okay. Thank you, Joel. The next question is a follow-up comment, you know, for NIJ to respond to, specifically that without location information the data may not mirror the real-world challenges of trying to predict recidivism in that they do not allow for a specification of place-based risk factors that could amplify or suppress risk. Any thought on the lack of location information and how that could be perhaps adjusted for by the applicant?
JOEL HUNT: Yeah. So we do provide one piece of location information and that's the combined PUMA. And obviously, it is not at the granularity most of us as data scientists would like. However, in order to maintain privacy and minimize the risk of disclosure of who these individuals are, we had to go to this level of granularity. We hope that it still provides you some level to start to provide context, such as urban environment versus rural environment, to kind of understand, maybe start to calculate like average distances to different types of things based on whether or not they're rural or urban, things like that. So we provided what we could, and I think we, as data scientists, always want more, and we always want better quality; however, the openness of this challenge really prevented us from doing that. We thought the safety and privacy of the individuals who are part of this dataset dictated that. And it was important to maintain that safety and privacy of those individuals.
MARIE GARCIA: Great. Thank you.
ERIC MARTIN: And, Joel, if I can just jump in and add more, and I think that's a very good comment. And one way to look at it is given, you know, the imprecision of that location data due to the need to protect privacy, which everyone agrees with, you know, if it does make a difference, this is all empirical work. You know, NIJ is interested in seeing what can enhance recidivism forecasts, so if it does make a difference then, you know, to your point, when new models actually get deployed in the field, then those could be further refined and further built in. You know, this is an iterative research process, you know, apart from being a fun challenge that we can all participate in and see the results. Thanks.
MARIE GARCIA: All right. Thank you, Eric. Next question. "Of the 25,000 individuals in the datasets, how many or what percent are female?"
JOEL HUNT: Something like 3,100 or 3,200 were female. So, about one in eight, so about 10-12%, I think.
MARIE GARCIA: Okay. There's a follow-up comment on the arrest metric question that was asked previously. In order to make time for the many questions, I'll just briefly say this, that "The use of arrest as a metric of recidivism assumes that an officer's judgment in arresting someone indicates that that person has committed another crime. Interactions with police that resulted in arrests do not equal crime. How do participants take into account that police officers may be biased in the way that they choose to arrest?"
JOEL HUNT: So being a policing person, this is finally a question that's a little bit more in my wheelhouse to an extent. I think research agrees with you, that there is that potential bias there. And that's why we do give the training dataset. That potential bias should be seen in the training dataset just as much as it is in the test dataset. So while that's nothing we can remove from the challenge, we hope that by providing that, the ability to train your models on individuals who would have experienced that same potential bias, that you should be able to account for it in your models. It's the best that we'll ever be able to do in that sense.
MARIE GARCIA: Okay.
ERIC MARTIN: And if we're looking to improve fairness, you know, of risk assessment, this is the data that's being used, you know. Like Joel said, we acknowledge these biases, and one of the stated goals of the challenge is to try to overcome them, you know, move with the data science, move the field forward. So it's a very good comment.
JOEL HUNT: And I think what's interesting is that even knowing that there is this bias, we also know that some models have still managed to create more false positive predictions for the racial category of Black. And we're trying to do better than that, so hopefully by giving you a training dataset that has that bias in it, and working on a test dataset that should have that same bias in it, hopefully we can get individuals to create algorithms that the algorithm itself is less biased.
MARIE GARCIA: Great. Thank you, Joel. Next question is about the data that's available. "Can NIJ confirm that the new charge is being observed during the first year while someone is on probation, or is this after they have completed probation?"
JOEL HUNT: So I couldn't hear the first part for some reason, sorry.
MARIE GARCIA: Okay. I'll repeat. Can you please confirm that the new charge is being looked at during the first year while someone is on probation, or is it after they have completed probation?
JOEL HUNT: So we know when they were released and for how long they were released under supervision. So we are able to say whether or not that charge was while they were under supervision. So we can say whether or not that it was a new charge while under supervision.
MARIE GARCIA: Okay. The next…
JOEL HUNT: I could point--so I should clarify that. It is still--so even if someone was only under supervision for a year, we still do follow them for the full three years. So I should clarify that. So even if they're under supervision for only one year and they still committed a crime in year two while they were no longer under supervision, that would still show up as someone who recidivated within our challenge. And we know that that is a slightly different metric in context than what a community correction officer would look at. However, it still talks about the overall discussion we have about individuals returning to the criminal justice system.
MARIE GARCIA: Okay, great. The next question is "The data contains supervision risk score, and it seems this score determines the supervision level. Can you provide information on how the supervision risk score was created?"
JOEL HUNT: I can give some level of comment and then someone else can jump in. I know that the State of Georgia does have a metric tool that they use when they're determining who is to receive that. And that metric is based on their actions while they were within the facility, their criminal history, their plans from when they return to reenter, things like that. I do not know the exact specifics of it. But as the person who posed the question said, it does help determine what level of supervision they're initially under but please note that this is only their initial score. If someone else wants to expand.
ERIC MARTIN: Uh-hmm. Yeah. There's information on the risk assessment used by the State of Georgia that is publicly available as far as the actual name of the risk assessment. And I am not familiar particularly with Georgia but just know that many agencies allow practitioners to do overrides of the risk assessment score. And again, I'm not saying that this applies to this dataset at all, but just, you know, keep that in mind. But there is information on the type of risk assessment used that was created in-house with a research partner that the State of Georgia joined with. So it's not, to my knowledge, one that you would purchase off the shelf, you know, one of the commonly known ones available, like [INDISTINCT] or COMPAS or whatnot. But again you can find that information available. I don't know how well it drills down into, if at all, how those metrics were derived but at least you can get some of the basics of it. I hope that answers your question.
MARIE GARCIA: Thank you, Eric. So just to let everyone know, we're past the two o’clock hour, but we have a few more questions, and NIJ is willing to stay on the phone and answer them for you, so we'll keep going. We have a question, "Are state agencies eligible to apply for the challenge?"
JOEL HUNT: So this is another one where I'm going to do the cop-out answer of please refer to the eligibility criteria in the challenge. If you still have questions, please email them to the challenge forecasting email address, and I will work with the lawyers to get a specific answer for you.
MARIE GARCIA: Okay. "For the submission data, can applicants include recidivism class membership instead of recidivism class probability?"
ERIC MARTIN: As was said earlier, the probability score and the ID are what's required for the submission, but Joel can correct me if I'm wrong. In the winner's paper, if, you know, a participant was selected, that may be something to discuss, recidivism class memberships and how those were derived.
MARIE GARCIA: Okay. Thank you, Eric. Next question. "Do the data delivered from Georgia include all White and Black [INDISTINCT] during that time period or only a portion of the total releases?"
JOEL HUNT: So we received from them all individuals who were released from prison under the Department of Community Supervision. We reduced the dataset down to just individuals that were Black and White, but it is all individuals who were under supervision at the time of their release.
MARIE GARCIA: Okay. Next question. This may have already been answered, but perhaps it is being asked again. "How can individuals be linked to other datasets such as educational attainment, participation in programs, et cetera?"
JOEL HUNT: So we do provide information about the individual, about their education, and some of the--a few other variables about the individual. And the only other way to really link data is by looking at where they are proposed to be released. And we realize that just because that's where they're proposed to be released does not mean that that is where they were when they initially went into prison. So we realize that there are some issues with this. However, we are still hopeful that individuals will find variables to geographically match up to these individuals that will help their models.
MARIE GARCIA: Okay. Thank you. There's two questions that are--I'm going to ask just the one about missing data. "There are some measures in the training data, for example gang affiliation, that had missing information. Do you encourage us to use imputation techniques? Or would you prefer us to omit those, as community corrections would typically not be able to impute data?" And, again, this was asked more than a few times so I'm just going to ask you this one.
JOEL HUNT: So, yeah, so data imputation on missing data, we leave that open to you. We can tell you that currently, the community corrections officers that we know of do not currently impute that data, that they work with what they have. However, that might be something interesting for us to learn. Does it help to impute it? Does it not help us to impute it? We don't know the answer, and so it's up to you how you would like to handle it.
MARIE GARCIA: Joel, to follow up on that question, "For the fields that have missing data, should the research report include how missing data were handled?"
JOEL HUNT: I would say yes, please.
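Since the handling of missing values is left to entrants, here is a minimal sketch of two simple options that could be compared: filling gaps with the most common observed value, or keeping "missing" as its own category. Neither is recommended by NIJ; the helper names, the `"MISSING"` label, and the example values are assumptions for illustration.

```python
from collections import Counter


def impute_mode(values):
    """Replace None with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: every value is missing")
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def missing_as_category(values, label="MISSING"):
    """Keep missingness visible to the model as a separate category."""
    return [label if v is None else v for v in values]


# Invented values for a hypothetical gang-affiliation field.
gang_affiliated = ["false", None, "true", "false", None]
filled = impute_mode(gang_affiliated)
flagged = missing_as_category(gang_affiliated)
```

Whichever approach is used, the winners' paper is the place to describe it, as noted above.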
MARIE GARCIA: Okay. Okay. The next question is about the submissions file label, "Should it have four columns? Specifically the ID, the team name, forecast, the team name, the second year, and the team name in the third year? How would you like the submission files saved and provided to NIJ?"
JOEL HUNT: So where there's talk about team name underscore one-year forecast, that should be the name of the file. So it should be Joel's_one-year-forecast and then when I click on that, for me, it would probably be Excel. When I click on that, I should open it up and there should be only two variables inside, ID and probability. Then when the year two one rolls around, I'll create a new file called Joel's_two-year-forecast and inside of that just two variables, ID and probability, same thing with year three if I decide to do it all three.
MARIE GARCIA: Okay. The next question is about the submission. "If a team is to submit a forecast for overall three-year period, they must--do they have to submit forecast fully for year one, two, and three? Is that correct? For example if a team skipped the year one submission, would they not be considered for the three-year overall because those results were already released? So what is the timing here for the submission?"
JOEL HUNT: Yeah. So let me clarify this a little. When--for this first period, you're just providing a probability that that individual recidivated during their first year. The second dataset will remove those individuals who really did, who actually did recidivate. So then you're only providing a probability for those who did not recidivate in year one, what's the probability that they recidivated in year two. After year two, we will remove the individuals who recidivated during year two. So now you only have individuals left that did not recidivate in year one or year two, and we're asking you what's the probability that they did in year three. So you can enter in at any time but really you have to be using the correct datasets. So when you go to apply for the year three forecast, you need to make sure you're looking at just those individuals who did not recidivate in year one or year two, right? Look at the test dataset three that will come out in about a month and a half from now.
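The shrinking cohorts described above mean each period's forecast is conditional on not having recidivated earlier. The sketch below illustrates that structure with invented IDs. `remaining_cohort` and `cumulative_probability` are hypothetical helpers, and the second one is included only to show how conditional yearly probabilities relate to a three-year probability; it is not part of what entrants submit.

```python
def remaining_cohort(cohort_ids, recidivated_ids):
    """IDs still in the test data after removing those who recidivated."""
    recidivated = set(recidivated_ids)
    return [i for i in cohort_ids if i not in recidivated]


def cumulative_probability(conditional_probs):
    """Probability of recidivating in any period, given per-period
    probabilities that are each conditional on reaching that period."""
    survive = 1.0
    for p in conditional_probs:
        survive *= 1.0 - p
    return 1.0 - survive


# Invented IDs, for illustration only.
year1_ids = [1, 2, 3, 4, 5]
year2_ids = remaining_cohort(year1_ids, recidivated_ids=[2])  # ID 2 recidivated in year one
year3_ids = remaining_cohort(year2_ids, recidivated_ids=[5])  # ID 5 recidivated in year two

# One person's conditional forecasts for years one, two, and three.
three_year = cumulative_probability([0.2, 0.1, 0.1])
```

A team entering only in the third period would forecast for `year3_ids` alone, which is why using the matching test dataset matters.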
MARIE GARCIA: Okay. And we have one question about the datasets, specifically "Will the dataset include information about the records of the arresting officer? For example complaints against the officer."
JOEL HUNT: No. We don't even have information as to who the arresting officer was. We use the Georgia--the GCIC just purely to look up arrest history of these individuals. We did not get any other data, just what was the arrest history prior to this incarceration and subsequent to their release. So we know what the charge or charges were, but not anything about where the charges were or who the arresting officer was or anything like that.
MARIE GARCIA: Great. Thank you, Joel. And one last statement. And a participant on the webinar suggests that given all of the questions that have been asked about the submission format, perhaps NIJ should provide a submission example file for each period so that it's clear to the applicant when they provide this information to us. So, just a question for NIJ to consider.
JOEL HUNT: So I can put one on there but really the file is just two variables, ID and probability. The rest is just the naming convention and making sure you're using the right dataset, right? Use test dataset one for period one, use test dataset two for period two, test dataset three for period three. And all that's inside is ID and probability.
MARIE GARCIA: Okay. So that is the final comment on the Q&A section here. Again for any--I believe we answered all of the questions, but if you have any questions that have come from this webinar and this Q&A section, please send your [INDISTINCT] questions to the NIJ Challenge email address. And, please, again contact us if you have any questions or concerns about the challenge and we'll be happy to provide responses to you as quickly as we're able to. So I will, unless there's any more questions, which I do not see in the box, I will turn it back over to Joel.
JOEL HUNT: So I would just like to thank all of you for attending today. I think at one point I saw over 230 people in attendance and that's after this was only marketed for a few days. I mean, the challenge was only released last Friday so we think it's great to see so much interest in such an important topic. And we hope that we receive a whole lot of submissions from you all. And hopefully we can get some great entries and continue this discussion and try to keep moving forward on trying to do better in this area. So thank you, everyone. I'd like to thank my colleagues for taking the time and effort over the last two years to put this together, and all of you for considering participating. Thank you, everyone.
DARYL FOX: Okay. So on behalf of the National Institute of Justice and our panelists, thank you for joining today's webinar. This will end today's presentation.