AI is a rapidly advancing field of computer science. In the mid-1950s, John McCarthy, who has been credited as the father of AI, defined it as “the science and engineering of making intelligent machines.”[1] Conceptually, AI is the ability of a machine to perceive and respond to its environment independently and perform tasks that would typically require human intelligence and decision-making processes, but without direct human intervention. One facet of human intelligence is the ability to learn from experience. Machine learning is an application of AI that mimics this ability and enables machines and their software to learn from experience.[2] Particularly important from the criminal justice perspective is pattern recognition. Humans are efficient at recognizing patterns and, through experience, we learn to differentiate objects, people, complex human emotions, information, and conditions on a daily basis. AI seeks to replicate this human capability in software algorithms and computer hardware. For example, self-learning algorithms use data sets to understand how to identify people based on their images, complete intricate computational and robotics tasks, understand purchasing habits and patterns online, detect medical conditions from complex radiological scans, and make stock market predictions.
[note 1] “What is Artificial Intelligence,” The Society for the Study of Artificial Intelligence and Simulation of Behaviour.
[note 2] Bernard Marr, “What Is the Difference Between Deep Learning, Machine Learning and AI?” Forbes (December 8, 2016).
As with humans, machine learning is a matter of classification and patterns. AI is said to learn through supervised, unsupervised, semi-supervised, and reinforcement learning. In supervised learning, AI algorithms are trained on large numbers of labeled examples. Unsupervised AI algorithms strive to identify patterns in data, looking for similarities that can be used to categorize the data without the aid of labels. Semi-supervised learning uses a small amount of labeled data to learn to classify a larger set of unlabeled data. This approach is useful when extracting features from data is difficult and labeling examples is a time-intensive task for experts. Reinforcement learning trains an algorithm with a reward system, providing feedback when an artificial intelligence agent performs the best action in a particular situation. In reinforcement learning, the system goes through a process of trial and error until it finds the optimal way to complete a particular goal or improve performance on a specific task.
Adapted from What is AI? Everything you need to know about Artificial Intelligence and SuperVize Me: What’s the Difference Between Supervised, Unsupervised, Semi-Supervised and Reinforcement Learning?
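The difference between supervised and unsupervised learning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-dimensional data, the labels "small" and "large", and the function names are all hypothetical, and the techniques chosen (a nearest-centroid classifier and a two-group k-means) are simple stand-ins rather than methods named in the sources. Semi-supervised and reinforcement learning are not covered.

```python
def fit_centroids(points, labels):
    """Supervised: average the labeled examples of each class."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled points into two groups (1-D k-means, k=2).

    Assumes the points contain at least two distinct values.
    """
    low, high = min(points), max(points)
    for _ in range(iterations):
        group_low = [x for x in points if abs(x - low) <= abs(x - high)]
        group_high = [x for x in points if abs(x - low) > abs(x - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


# Hypothetical measurements; the labels are used only by the supervised path.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = fit_centroids(points, labels)   # learns from labeled examples
groups = two_means(points)                  # finds structure without labels

print(predict(centroids, 1.1), predict(centroids, 4.0))
print(groups)
```

The supervised path needs someone to supply the labels up front; the unsupervised path recovers the same two groups from similarity alone but cannot name them.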
AI is all about patterns and thus classification. A tool used in statistical classification, the confusion matrix (also known as an error matrix), has become widely used to tune and assess performance.
A confusion matrix describes the performance of a classification algorithm (or “classifier”) on a set of test data for which the actual values are known. It helps visualize the performance of an algorithm. By thinking of the following values as “dials” on a virtual system, we can adjust the AI algorithm to provide the measurements required to accomplish our end result. These measurements include: “true positive” for correctly predicted event values; “false positive” for incorrectly predicted event values; “true negative” for correctly predicted no-event values; “false negative” for incorrectly predicted no-event values; “recall,” which tells us how often the AI algorithm predicts a positive when the answer is actually positive (high recall means few false negatives); and “precision,” which tells us how often the algorithm’s positive predictions are actually true positives (high precision means few false positives). So when we think of accuracy, we can think in terms of recall and precision. With high recall and low precision, most positive examples are correctly recognized (low FN), but there are many false positives. With low recall and high precision, many positive examples are missed (high FN), but those predicted as positive are indeed positive (low FP).
Adapted from Confusion Matrix in Machine Learning and What is a Confusion Matrix in Machine Learning.
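The confusion matrix measurements described above can be computed directly from a list of actual values and a list of predictions. The following is a minimal sketch, assuming a binary classifier that outputs a score between 0 and 1; the scores, the thresholds, and the function names are hypothetical and chosen only to show how turning one “dial” (the decision threshold) trades recall against precision.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FP, TN, FN) for two equal-length lists of 0/1 values."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1          # predicted event, was an event
        elif p == 1 and a == 0:
            fp += 1          # predicted event, was not an event
        elif p == 0 and a == 0:
            tn += 1          # predicted no event, was no event
        else:
            fn += 1          # predicted no event, was an event
    return tp, fp, tn, fn


def precision(tp, fp):
    """Share of positive predictions that were actually positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Share of actual positives that the classifier found."""
    return tp / (tp + fn) if tp + fn else 0.0


def evaluate(actual, scores, threshold):
    """Apply a decision threshold to the scores and summarize the result."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision(tp, fp), "recall": recall(tp, fn)}


# Hypothetical test set: known answers and the classifier's scores.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

for threshold in (0.25, 0.5, 0.75):
    print(threshold, evaluate(actual, scores, threshold))
```

On this made-up data, the lowest threshold finds every positive but admits two false positives, while the highest threshold produces no false positives but misses half of the positives.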
Machine learning grew out of the early days of algorithmic approaches, including decision tree learning, inductive logic programming, classification bounding, random forests, Bayesian networks, and ensembles, to name a few. All rely on human-predefined features. They are extremely useful but do not meet the real definition of what AI wants to achieve, because their features are defined by people and not self-discovered.
Adapted from What’s the Difference Between Artificial Intelligence, Machine Learning and Deep Learning?
Deep learning consists of layers of software-simulated neural network algorithms that perform pattern classification and recognition. A single-layer neural network can represent only linearly separable functions, such as two classes in a classification problem that can be separated by a line. Most problems are complex, and non-linear functions are needed. Imagine a deep learning face detection algorithm. The network would consist of multiple layers, each containing hundreds to thousands of software neurons. In the first layer, each neuron learns to detect one basic shape, such as a curve or a line. In the next layer, each neuron examines the output of the first layer and learns to see whether the shapes the first layer detected combine into more advanced shapes, such as a corner or a circle. The neurons in the next layer then detect whether more advanced patterns are present, like one that could indicate a human eye. Then, in the final layer, each neuron learns still more advanced shapes, including two eyes, a nose, and a mouth. The final layer then estimates the probability that the image contains a face. During each guess, or pass through the data, the deep learning algorithm estimates the type of information (in this case, the shape) each neuron in each layer should be looking for. It then backpropagates updates for each guess based on how well it guessed. This process is repeated until the algorithm eventually “learns” what types of information to look for and the performance you are looking for (say, confusion matrix values) is met, in our case for detecting faces.
Deep learning is flexible and, as opposed to machine learning, can be applied to a plethora of domains. Deep learning is moving toward the true definition of AI.
Adapted from What Is Machine Learning?, What’s the Difference Between Artificial Intelligence, Machine Learning and Deep Learning?, and Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations.
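The claim above that a single-layer network handles only linearly separable problems can be checked with a small experiment. The sketch below is an assumption-laden illustration, not the face detection network described in the text: it trains a single-layer perceptron on the logical AND function (separable by a line) and on XOR (not separable), then shows a two-layer network that computes XOR. To keep the example short and deterministic, the two-layer weights are set by hand rather than learned by backpropagation, and all function names are hypothetical.

```python
def step(value):
    """Threshold activation: the neuron fires only for positive input."""
    return 1 if value > 0 else 0


def train_single_layer(samples, epochs=25, lr=1.0):
    """Train one neuron with the classic perceptron update rule."""
    w1 = w2 = bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - step(w1 * x1 + w2 * x2 + bias)
            w1 += lr * error * x1
            w2 += lr * error * x2
            bias += lr * error
    return w1, w2, bias


def accuracy(weights, samples):
    """Fraction of samples the single neuron classifies correctly."""
    w1, w2, bias = weights
    hits = sum(step(w1 * x1 + w2 * x2 + bias) == target
               for (x1, x2), target in samples)
    return hits / len(samples)


def two_layer_xor(x1, x2):
    """Two hidden neurons (OR and AND) feed one output neuron.

    The weights are hand-picked for illustration, not learned.
    """
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(train_single_layer(AND), AND)
xor_accuracy = accuracy(train_single_layer(XOR), XOR)
xor_outputs = [two_layer_xor(x1, x2) for (x1, x2), _ in XOR]

print(and_accuracy, xor_accuracy, xor_outputs)
```

The single neuron learns AND perfectly but can never get all four XOR cases right, because no single line separates them. Adding one hidden layer is enough to represent XOR, which is the motivation for stacking layers.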
Machine learning enables machines to learn by themselves from the provided data and to make accurate predictions and decisions. Training in machine learning entails giving large amounts of data to the machine learning algorithm, allowing it to refine its understanding of the processed information. Deep learning is inspired by the information processing patterns of neural networks in the human brain. Machine learning identifies patterns and classifies. While deep learning automatically discovers the features to be used for classification, machine learning requires these features to be provided by people. Deep learning also requires multiple graphics processing units (GPUs), high-end computing power, and typically large amounts of training data to “learn” and then deliver accurate results. Note that many systems combine machine learning and deep learning with traditional mathematics to get the best of all worlds.
Adapted from Clearing the Confusion: AI vs Machine Learning vs Deep Learning Differences and What deep learning is and isn’t.
Neural networks suffer from what can be termed “continuous amnesia”: every time they retrain on another data set, they lose the knowledge they gained in previous iterations on previous data sets. Put simply, neural networks effectively solve scores of tasks when they are trained from scratch on stationary data sets and continually sample from all tasks many times until training has converged. However, neural networks struggle when trained incrementally or dynamically. When the data arrive over time, as they do in the real world (incremental, continuous learning), the networks “forget” from increment to increment. This is termed “catastrophic forgetting.” Neural networks quickly unlearn prior knowledge without constant repetition to reinforce the training.[1]
A second problem is termed “the stability-plasticity dilemma.” There is friction between incremental, parallel learning and plasticity. With too much plasticity, encoded data is constantly forgotten; too much stability impedes the efficient coding of this data in the connections between neurons. Neural networks require just the right balance of forgetting and stability.[2]
[note 1] IBM’s Quest to Solve the Continual Learning Problem and Build Neural Networks Without Amnesia
[note 2] A Berkeley mash-up of AI approaches promises continuous learning
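The forgetting effect described above can be reproduced on a very small scale. The sketch below is an illustration only and rests on several assumptions: it uses a two-weight linear model trained by gradient steps instead of a neural network, and the two “tasks” (TASK_A, TASK_B) are hypothetical single-example data sets chosen so that one set of weights can satisfy both. Trained on both tasks together, the model fits both; trained on task A and then only on task B, it loses part of what it learned about A.

```python
# Each task is a list of (inputs, target) pairs; the values are hypothetical.
TASK_A = [((1.0, 0.0), 1.0)]
TASK_B = [((1.0, 1.0), 0.0)]


def predict(weights, inputs):
    """Linear model: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def train(weights, samples, epochs=200, lr=0.5):
    """Repeatedly nudge the weights to reduce the error on each sample."""
    weights = list(weights)
    for _ in range(epochs):
        for inputs, target in samples:
            error = predict(weights, inputs) - target
            for i, x in enumerate(inputs):
                weights[i] -= lr * error * x
    return weights


def loss(weights, samples):
    """Mean squared error of the model on a task."""
    return sum((predict(weights, inputs) - target) ** 2
               for inputs, target in samples) / len(samples)


# Incremental training: task A first, then task B with no repetition of A.
after_a = train([0.0, 0.0], TASK_A)
after_b = train(after_a, TASK_B)

# Joint training: keep sampling from both tasks until training converges.
joint = train([0.0, 0.0], TASK_A + TASK_B)

print("task A loss after learning A:  ", loss(after_a, TASK_A))
print("task A loss after learning B:  ", loss(after_b, TASK_A))
print("task A loss with joint training:", loss(joint, TASK_A))
```

The weights that task B overwrites are the same ones task A depended on, so the error on A rises once A is no longer rehearsed. That is the stability-plasticity tension in miniature.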
Deep learning is better suited when it is difficult to engineer new features, because neural nets discover their own features. “In other words, it discovers new features itself.” However, make sure you have the data to train it. Having enough data reduces the overall expected out-of-sample error, making the increased number of parameters manageable … “you just have to feed the beast.” There is still no assurance that deep learning will outperform some machine learning algorithms on the same task, or the other way around. You must test multiple solutions.[1]
[note 1] Save a Neural Net, Use a Linear Model
Following are a number of ways artificial intelligence may improve public safety:
- Video and Image Analysis. Research supported by NIJ is helping to lead the way in applying artificial intelligence to criminal justice needs in public safety video and image analysis: identifying individuals through obscured-face detection and recognition, and identifying objects of interest and actions in videos and images relating to criminal activity or public safety.
- Forensic DNA. Novel machine learning-based methods of mixture deconvolution are being developed to separate individuals’ DNA in evidence samples from violent crimes such as sexual assaults and homicide cold cases.
- Detecting Gunshots. Gunshot detection algorithms are being researched that can differentiate muzzle blasts from shock waves, determine shot-to-shot timings, determine the number of firearms present, assign specific shots to firearms, and estimate probabilities of class and caliber.
- Crime Forecasting. Crime forecasting approaches are being explored to increase the speed and quality of statutory interpretation for courts, develop warrant service triage tools, identify financial exploitation and other forms of elder abuse, classify neurocognitive and socio-emotional developmental factors that lead to violence, and determine potential high-risk individuals for perpetration and victimization related to violent crimes.
- Detecting Contraband. Approaches are being developed to detect and track contraband or unauthorized wireless devices in correctional facilities; these approaches localize and fingerprint each device, provide an understanding of the device’s activities, and gather intelligence.
- Infrastructure Protection. Game-theoretic models are being explored that capture the strategic interaction among different players in critical infrastructures, with the aim of developing efficient protection strategies and enhancing the security of such infrastructures against terrorism and other intentional attacks.
- Human Trafficking. Tools to combat child trafficking are being advanced that analyze dark net review sites through machine learning and network science to identify posts concerning child victims and use the social network of reviewers to uncover the customers who exploit those children.
- Drug Trafficking. Frameworks are being explored to target opioid traffickers and purchasers on the dark net and to link participants on the dark net to the surface net, providing timely investigative leads to law enforcement in the U.S.
- Improving Community Supervision. Technologies are being researched to support community supervision agencies and persons convicted of an offense in ensuring successful reentry into the community. Situationally dependent, real-time updates to an individual’s risk-need-responsivity (RNR) assessment, intelligent tracking, and mobile service delivery offer both corrections practitioners and convicted persons increased access (potentially through mobile devices) to interventions, personalized resources, and opportunities as part of a wrap-around intervention strategy.