Assessing the Effectiveness of Programs To Prevent and Counter Violent Extremism

Three NIJ-supported evaluation studies offer key insights and recommended practices to examine the effectiveness of initiatives to prevent and counter violent extremism.
National Institute of Justice Journal
Date Published: February 12, 2024

Initiatives designed to prevent and counter violent extremists’ efforts to recruit, radicalize, and mobilize followers and commit acts of violence in the name of a group or ideology are critical.

It is equally important to assess the effectiveness of these initiatives. Initiatives may apply different theoretical and methodological approaches when designing their objectives, activities, and measures of success. Scientifically rigorous evaluation studies make it possible to assess whether these initiatives have been implemented according to their design (i.e., program fidelity)[1] and have met success benchmarks, which helps decision-makers determine whether to sustain, modify, limit, or scale up such efforts.[2]

This is no simple task. It may be difficult to assess an initiative because of a lack of available data, suitable control and comparison groups, and validated metrics against which to assess programs.[3] In general, it is difficult to determine whether a particular strategy or program is the cause or reason an act or event has been prevented. Evaluation efforts also demand human and financial resources.[4] Researchers should address these challenges during program design and implementation to evaluate whether the initiative achieved its intended goals.

Over the past decade, the National Institute of Justice (NIJ) has solicited cross-disciplinary studies to help understand the effectiveness of policies, programs, and initiatives to prevent and counter violent extremism. This article discusses NIJ-supported evaluations of three programs: the World Organization for Resource Development and Education’s countering violent extremism program,[5] the Muslim Public Affairs Council’s Safe Spaces Initiative,[6] and the Peer to Peer Challenging Extremism Initiative. Our goal is to help address long-standing challenges and inform future, locally led and community-based program design and evaluation practices.

WORDE’s Countering Violent Extremism Program

The World Organization for Resource Development and Education (WORDE) was a U.S. community-based, Muslim-led organization in Montgomery County, Maryland.[7] Its countering violent extremism program aimed to create and maintain networks of civically engaged individuals who were sensitized to issues of violent extremism and had proactive, cooperative relationships with local social services and law enforcement agencies.

The program — as evaluated — was composed of primary prevention activities, although it evolved to also include secondary prevention activities.[8] WORDE’s flagship program, Youth Against Hunger, brought together youth and adults from diverse faith and ethnic groups to prepare and deliver food to individuals who were homeless, which fostered inclusivity and honored volunteer community service. The program offered student service learning credits and thus attracted high school students who were required to earn such credits.

WORDE also had a multicultural program called the JustART series, which brought together a culturally diverse group of youth to produce digital artistic works (e.g., short films) on themes of social change. JustART was designed to be creatively empowering, interactive, and collaborative.

Key Findings

The evaluation initially employed a grounded theory approach and discovered (via focus groups) that peers might notice early signs that individuals were considering acts of violent extremism.[9] The researchers recognized the importance of further understanding peers’ (“bystanders”) willingness or reluctance to intervene. This led to the “Theory of Vicarious Help-Seeking,”[10] which asserts that the fear of damaging peer relationships tends to reduce individuals’ willingness to intervene in matters concerning violent extremism. However, when fear is greatest or most desperate, peers will tend to intervene. The researchers tested the theory via a survey embedded within the overall data collection; the theory was supported and subsequently replicated within a sample representative of the U.S. population.[11]

This NIJ-supported evaluation[12] also produced an evidence-based inventory of participants’ reasons for taking part in WORDE’s programming, covering their motivations and what they felt they gained by participating (the “Brief Volunteer Program Outcome Assessment” scale, included in appendix 3 of the evaluation report). For example, participants reported, “I feel a part of something bigger than myself,” “I feel a sense of purpose,” “I feel accepted,” and “I learn about cultures other than my own.”[13] Furthermore, it tested a theoretical model (the “Investment Model”)[14] that accounted for 77% of the variance in participants’ commitment to continued involvement in the program, an unusually large share.[15]

The evaluation found that, across the two programs referenced above, WORDE had the intended effects on 12 of the 14 outcomes believed to be relevant to countering violent extremism that comprised the “Brief Volunteer Program Outcome Assessment” scale. Exhibit 1 lists the 14 outcomes, along with example references of peer-reviewed literature pertinent to each outcome.

As shown in exhibit 1, on all but two of the 14 outcome measures, participants’ mean level of agreement was greater than or equal to “somewhat agree” (i.e., ≥5 on 7-point Likert-type scales ranging from “strongly disagree” to “strongly agree”), to an extent that exceeded the standard deviations for each outcome. Therefore, such responses were reliably above the midpoint of those items (i.e., above “neither agree nor disagree”). Indeed, among those 12 outcomes, only one did not have a median rating of “agree.”

Exhibit 1. Participants’ Self-Reported Outcomes of Participation in WORDE’s Programming

Item | Mean | Median | Standard Deviation
---- | ---- | ------ | ------------------
I feel welcome[1] | 5.71 | 6 (Agree) | 1.05
I feel a part of something bigger than myself[2] | 5.84 | 6 | 1.01
I feel a sense of teamwork[3] | 5.56 | 5 (Somewhat agree) | 0.95
I make friendships that are active beyond the event[4] | 5.73 | 6 | 1.19
I make friends with people from other races[5]* | 5.44 | 6 | 1.08
I feel useful[6] | 5.83 | 6 | 1.06
I have responsibilities[7] | 5.65 | 6 | 1.12
I have leadership responsibilities[8]* | 5.35 | 5 | 1.20
I feel a sense of purpose[9] | 5.69 | 6 | 0.97
I feel free of peer pressure[10] | 5.67 | 6 | 1.02
I feel accepted[11] | 5.78 | 6 | 0.96
I wouldn’t feel lonely[12] | 5.73 | 6 | 1.17
I wouldn’t feel afraid to talk to others[13] | 5.61 | 6 | 1.01
I learn about cultures other than my own[14] | 5.72 | 6 | 1.18

4 = scale midpoint (“neither agree nor disagree”)
5 = “somewhat agree”
6 = “agree”
* = These items did not reliably exceed the threshold for “neither agree nor disagree.”

These findings made the WORDE program the first research-based countering violent extremism program in the United States.[16] However, the outcomes for participants were not significantly better than the comparison group (those living in the same county who engaged in volunteer service but not with WORDE).[17] In fairness, WORDE represented that its programming was oriented toward enhancing communication and understanding between communities to mitigate social and political conflict.[18] Therefore, for this evaluation, the fair test was whether WORDE’s programmatic outcomes, relevant to those objectives, were reliably produced — not whether they were produced in a superior way relative to other, perhaps similar, types of programming.

This highlights the concept of equifinality: there can be more than one means (here, more than one type of programming) of achieving a given programmatic outcome relevant to preventing and countering violent extremism. Finally, the researchers recommended testing the generalizability of the outcomes by implementing the program in other municipalities.

Recommended Practices for Evaluation

We can draw several recommended practices for future evaluation from studying the WORDE program:

  • Articulate — and test — underlying program-relevant theory. Such theoretical advancements might be as useful to the field as the findings from the program evaluation.
  • Quasi-experimental[19] methods can — and should — be used in evaluations of initiatives to prevent and counter violent extremism. Previously validated measures should be used whenever possible.
  • Incorporate a mixed-methods approach (i.e., use both quantitative and qualitative methods) whenever possible.
  • Researchers should publish all measurement instruments (e.g., as annexes). This is a key component of building an evidence-based approach. Such open access to measurement instruments — and their subsequent use — allows researchers to perform meta-analyses of programs.[20] The present project produced a set of 12 freely licensed survey measures (totaling 99 items) that demonstrated excellent measurement reliability and consistency and are available to aid future efforts.[21]

Safe Spaces Initiative

Another NIJ-supported evaluation examined the Muslim Public Affairs Council’s (MPAC)[22] Safe Spaces Initiative, which helps Muslims in the United States implement programming to prevent and counter violent extremism in their communities. The program trained community stakeholders (e.g., religious leaders, community organizers, social workers) on the Safe Spaces toolkit and helped them create community response teams that would perform prevention and intervention activities after initial training. MPAC also provided post-implementation technical assistance.

The original program and model received widespread criticism from the U.S. Muslim community for its policy-driven, top-down, national security framing, which communities associated with self-policing and law enforcement surveillance.[23] Thus, many communities declined to take part in the program. Out of nine recruited sites, only four received the training and executed the prevention activities, and three of those four did not continue prevention programming as prescribed by the training.[24]

Safe Spaces was revised following a three-year, multisite program evaluation conducted with support from an NIJ grant.[25] The newer version of the program that resulted from this evaluation promoted a grassroots approach to building healthy and resilient communities as an effective way of preventing violent extremism through primary, secondary, and tertiary prevention.[26] Despite the shift in framing from violent extremism to public health, concerns over the political climate, stigmatization, and sole focus on Muslim communities prevented successful implementation and, consequently, evaluation.

Key Findings

Safe Spaces failed to address the actual concerns and priorities of the communities, such as mental health, substance use, youth leaving religion, and domestic violence. This failure was largely due to the top-down approach that did not resonate with the local communities. Additionally, the lack of buy-in from community leaders, confusion regarding long-term commitment, and insufficient financial and human resources and partnerships with local organizations caused participation to decline. Finally, Safe Spaces replicated services, structures, and programs that already existed in some communities.[27]

Due to implementation challenges, the researchers modified the evaluation study. Rather than using data-driven methods as planned, researchers instead used follow-up interviews to focus on implementation barriers and recommendations.[28] Still, some elements showed potential and could inform future program design in key areas.

Recommended Practices for Programming

We can draw the following recommended practices for future programming from the implementation and evaluation experiences of the Safe Spaces Initiative:

  • Engaging with community members and developing partnerships from the start is key to success. Designers should consider the community’s top priorities and concerns as they define program objectives, and they should tailor the content for different communities. Public health framing should permeate every facet of the program, including the language used.[29] Through primary and secondary prevention activities, programs should adopt community-level strategies to mitigate risk and leverage protective factors as well as decrease risky behavior. Instead of ejecting individuals who have committed to extremist causes, programs should rehabilitate and reintegrate them.
  • A true public health framing emphasizes a whole-of-community approach, which may boost the longevity of program outcomes and overall community health through inter-community dialogue with external partners, institutions, and networks. Programming across a range of communities, rather than a focus on a single faith or ethnic community, should be built into the design, which could also alleviate concerns about profiling and stigmatization.
  • An outside trainer, who is not only intimately familiar with the subject matter and the target community but also adequately trained to effectively address participants’ concerns and questions, should deliver the program.[30]
  • The teams designing and implementing the program should emphasize the value and benefits of a public-health-focused violence prevention program to earn the trust of the communities, increase community support and participation, and mobilize community resources. Additionally, buy-in from local political leadership would give communities greater agency and boost the success of implementing the program.[31]
  • After initial delivery, researchers and program implementation teams should monitor fidelity measures and provide post-implementation support on an ongoing basis to ensure that the program continues to be implemented as designed.
  • Program success relies, in part, on engaging community leaders and volunteers, being transparent about expectations, and familiarizing the target communities with the training materials in advance, or co-developing them in collaboration with the community.[32] In communities that lack the personnel and capacity to execute and sustain their own violence prevention programming, stakeholders should collaboratively work toward capacity building (e.g., providing guidance for potential external funding and staffing search, considering services needed such as translation). Researchers and program design and implementation teams should prioritize finding expertise within the community, but they should also engage partners outside the immediate community as necessary.
  • Communities should develop an understanding of the different types of resources and commitments needed for primary, secondary, and tertiary prevention activities. It may not be possible to simultaneously implement distinct stages. Communities should not be expected to assess their own needs and reach out, given the resource and personnel challenges involved. Researchers and initial program design and implementation teams should include post-implementation technical assistance in their designs from the outset.

Peer to Peer Challenging Extremism Initiative

The Peer to Peer (P2P) Challenging Extremism Initiative — renamed Invent2Prevent (I2P) in 2021 — encourages college and high school students to develop social campaigns and educational interventions to counter violent extremist rhetoric while emphasizing positive messages about ethnic and cultural diversity. The initiative developed 150 U.S.-based campaigns from 2015 to 2017. The initiative was interrupted for several years and later restarted in 2021. Since 2021, 77 collegiate programs and 27 high school programs have been engaged in more than 100 violent extremism prevention projects.[33]

The NIJ-supported evaluation of the P2P Initiative began with a review of 150 domestic campaigns produced by P2P students. The majority (121 of 150) of these campaigns focused on promoting unity, peace, acceptance, and similar values; the remaining 29 addressed specific extremist ideologies.[34] The researchers chose to evaluate two campaigns in real time: Kombat with Kindness (Utah) and Operation 250 (Massachusetts).

Key Findings

Prior to conducting the evaluation study, the researchers distributed surveys to a sample of students in the high schools where the two campaigns were taking place to determine the students’ attitudes toward diversity and exposure to online hate, violence, and grooming. Responses from 1,087 students showed that 6 in 10 students had been exposed to hateful messages[35] when online, and 1 in 10 had come across a hateful group on the internet and had someone from that group try to convince them of their views.[36]

Survey data also showed that girls were twice as likely to experience bullying, harassment, insults, exposure to sexual content, and violence online compared with boys. Girls were also more likely than boys to have their photos used inappropriately, receive sexual content, and have a stranger online ask them to meet in person.[37] Finally, the surveys found that students who spent more than three hours online per day had twice the risk of being exposed to hateful messages compared to those who spent less time online.[38]

Kombat with Kindness

The Kombat with Kindness campaign aimed to combat hatred with kindness by promoting acceptance of diversity through video presentations, t-shirts, banners, and social events in middle and high schools. To evaluate the campaign’s effectiveness, the researchers surveyed 143 students at participating schools before and after implementing the initiative. They compared the survey results to a control group of 183 students who attended schools with similar demographic characteristics in the same state that were not involved in the initiative. The findings showed that students who attended the schools where Kombat with Kindness was implemented saw and heard fewer hateful messages on school grounds after the start of the campaign compared to the students in the control group, which suggests that the campaign achieved its intended results. Furthermore, in both the control and experimental groups, students who acquired awareness about institutional and cultural racism over the course of the school year became more accepting of diversity.[39]
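A before-and-after design with a comparison group, as described above, is often summarized with a difference-in-differences calculation. The sketch below is only an illustration of that arithmetic, not the evaluators' actual analysis. The article reports no group averages, so every number is a hypothetical placeholder for the average number of hateful messages a student reports, and the function name is illustrative.

```python
def difference_in_differences(treat_pre, treat_post, control_pre, control_post):
    """Change in the campaign group minus change in the comparison group."""
    return (treat_post - treat_pre) - (control_post - control_pre)


# HYPOTHETICAL average numbers of hateful messages reported per student.
did = difference_in_differences(
    treat_pre=4.0, treat_post=3.0,      # schools running the campaign
    control_pre=4.0, control_post=4.5,  # comparison schools
)
print(f"difference-in-differences estimate: {did:+.1f}")
```

A negative estimate would indicate that reported hateful messages fell in the campaign schools relative to the comparison schools; a value near zero would indicate no detectable difference in change between the groups.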

Operation 250

The Operation 250 initiative educates youth about the risk of being recruited and exploited by violent extremist groups online. It teaches youth about their own psychological vulnerabilities when interacting with others in the online space and about in-group versus out-group bias. It also raises awareness of preconceived notions about people of a different race, ethnicity, gender, or other identity-shaping characteristic.[40]

To evaluate the impact of the initiative, the researchers randomized high school students into two groups: a group of 67 students who received the Operation 250 training and a control group of 61 students who received training on how to prepare for a snow emergency. The findings of the evaluation showed that students who received the Operation 250 training were 9.6 times more likely than those in the control group to have gained awareness about in-group versus out-group bias. The evaluation showed that the initiative achieved promising but only marginally statistically significant results for its impact on students’ awareness of online risky behaviors.[41]
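As a rough illustration of how a "times more likely" figure can be derived from a two-group randomized design like this one, the sketch below computes both a risk ratio and an odds ratio from a two-by-two table. The article does not report the underlying counts or say which measure the evaluators used, so the numbers of students who gained awareness (40 and 10) are hypothetical placeholders that do not reproduce the reported 9.6. Only the group sizes (67 and 61) come from the text, and the function names are illustrative.

```python
def risk_ratio(events_treated, n_treated, events_control, n_control):
    """Ratio of the proportion with the outcome in each group."""
    return (events_treated / n_treated) / (events_control / n_control)


def odds_ratio(events_treated, n_treated, events_control, n_control):
    """Ratio of the odds of the outcome in each group."""
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    return odds_treated / odds_control


# Group sizes are taken from the article; the event counts are HYPOTHETICAL.
N_TRAINED, N_CONTROL = 67, 61
gained_trained, gained_control = 40, 10

rr = risk_ratio(gained_trained, N_TRAINED, gained_control, N_CONTROL)
orr = odds_ratio(gained_trained, N_TRAINED, gained_control, N_CONTROL)
print(f"risk ratio: {rr:.2f}")
print(f"odds ratio: {orr:.2f}")
```

The two measures can differ substantially when the outcome is common, which is why it helps for an evaluation report to state which one a "times more likely" figure refers to.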

Recommended Practices for Evaluation and Programming

We can draw two key recommended practices for future programming and evaluation from the P2P Initiative study:

  • Anonymized school surveys are an important tool to assess students’ experiences (such as bullying and exposure to hate messages and groups), their attitudes toward diversity and racism, and the segments of the student population most at risk for becoming victims of bullying, harassment, and online grooming. These data can help better target school-based programs to prevent and counter violent extremism.
  • Initiatives that both enhance youth’s knowledge about cultural and institutional racism and online risks and promote an overall school environment focused on acceptance of diversity are promising practices that create resilience toward violent extremism.


All three programs discussed in this article highlight the importance of evaluation studies. WORDE’s countering violent extremism program and the P2P Initiative show how successful implementation coupled with a well-designed evaluation study that has clear definitions and outcome measures can help inform future efforts. In comparison, Safe Spaces provides a good example of an evaluation study flexible enough to be modified as implementation challenges arise and still produce insight to inform better program design. It also highlights the importance of initial and ongoing community engagement. Researchers still need to conduct systematic and meta-analytic reviews and to improve their understanding of the measures of effectiveness through comparative and multisite evaluations.

Future programs to prevent and counter violent extremism should include rigorous evaluations. As evidenced by the various methods discussed in this article, researchers can choose among several types of evaluation methodologies (for example, pre- and post-intervention designs, quasi-experimental designs, post-mortem reviews, and qualitative interviews) to conduct an effective assessment. Researchers and practitioners should also identify core components of the programs and intended outcome measures early in the process. These measures should go beyond those commonly employed in present studies, namely an individual’s sense of self, level of support for violent extremist groups or activity, or level of support for the use of violence generally.[42] Most importantly, researchers should create a feedback loop that allows practitioners and program implementers to use the evaluation results to improve the program.

About This Article

This article was published as part of NIJ Journal issue number 285. This article discusses the following awards:

Opinions or points of view expressed in this document represent a consensus of the authors and do not necessarily represent the official position, policies, terminology, or posture of the U.S. Department of Justice on domestic violent extremism. The content is not intended to create, does not create, and may not be relied upon to create any rights, substantive or procedural, enforceable at law by any party in any matter civil or criminal.
