2.6: Algorithms in the Criminal Justice System

This section is excerpted from Anna Maria Barry-Jester, Ben Casselman, and Dana Goldstein (2015), “The New Science of Sentencing,” The Marshall Project. The full article is available on The Marshall Project’s website.

Criminal sentencing has long been based on the present crime and, sometimes, the defendant’s past criminal record. In Pennsylvania, judges could soon consider a new dimension: the future.

Pennsylvania is on the verge of becoming one of the first states in the country to base criminal sentences not only on what crimes people have been convicted of, but also on whether they are deemed likely to commit additional crimes. As early as next year, judges there could receive statistically derived tools known as risk assessments to help them decide how much prison time — if any — to assign.

Risk assessments have existed in various forms for a century, but over the past two decades, they have spread through the American justice system, driven by advances in social science. The tools try to predict recidivism — repeat offending or breaking the rules of probation or parole — using statistical probabilities based on factors such as age, employment history and prior criminal record. They are now used at some stage of the criminal justice process in nearly every state. Many court systems use the tools to guide decisions about which prisoners to release on parole, for example, and risk assessments are becoming increasingly popular as a way to help set bail for inmates awaiting trial.

But Pennsylvania is about to take a step most states have until now resisted for adult defendants: using risk assessment in sentencing itself. A state commission is putting the finishing touches on a plan that, if implemented as expected, could allow some offenders considered low risk to get shorter prison sentences than they would otherwise or avoid incarceration entirely. Those deemed high risk could spend more time behind bars.

Pennsylvania, which already uses risk assessment in other phases of its criminal justice system, is considering the approach in sentencing because it is struggling with an unwieldy and expensive corrections system. Pennsylvania has roughly 50,000 people in state custody, 2,000 more than it has permanent beds for. Thousands more are in local jails, and hundreds of thousands are on probation or parole. The state spends $2 billion a year on its corrections system — more than 7 percent of the total state budget, up from less than 2 percent 30 years ago. Yet recidivism rates remain high: 1 in 3 inmates is arrested again or reincarcerated within a year of being released.

States across the country are facing similar problems — Pennsylvania’s incarceration rate is almost exactly the national average — and many policymakers see risk assessment as an attractive solution. Moreover, the approach has bipartisan appeal: Among some conservatives, risk assessment appeals to the desire to spend tax dollars on locking up only those criminals who are truly dangerous to society. And some liberals hope a data-driven justice system will be less punitive overall and correct for the personal, often subconscious biases of police, judges and probation officers. In theory, using risk assessment tools could lead to both less incarceration and less crime.

There are more than 60 risk assessment tools in use across the U.S., and they vary widely. But in their simplest form, they are questionnaires — typically filled out by a jail staff member, probation officer or psychologist — that assign points to offenders based on anything from demographic factors to family background to criminal history. The resulting scores are based on statistical probabilities derived from previous offenders’ behavior. A low score designates an offender as “low risk” and could result in lower bail, less prison time or less restrictive probation or parole terms; a high score can lead to tougher sentences or tighter monitoring.
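The points-based questionnaire described above can be sketched in a few lines of Python. This is only an illustration of the mechanics, not any real instrument: the items, point values, and cutoff are hypothetical.

```python
# Hypothetical questionnaire: every item, point value, and cutoff here is
# invented for illustration and does not come from any real risk tool.
POINTS = {
    "under_25": 2,
    "unemployed": 1,
    "prior_convictions_3_plus": 3,
}
HIGH_RISK_CUTOFF = 4  # assumed policy choice, not a statistical fact


def risk_score(answers: dict[str, bool]) -> int:
    """Sum the points for every item answered 'yes'."""
    return sum(points for item, points in POINTS.items() if answers.get(item, False))


def risk_label(score: int) -> str:
    """Map a numeric score to the label a decision-maker would see."""
    return "high risk" if score >= HIGH_RISK_CUTOFF else "low risk"


example = {"under_25": True, "unemployed": True, "prior_convictions_3_plus": False}
score = risk_score(example)
print(score, risk_label(score))
```

In a real tool the point values would be derived from data on previous offenders, but the cutoff that separates "low" from "high" would still be a choice made by whoever adopts the tool.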

The risk assessment trend is controversial. Critics have raised numerous questions: Is it fair to make decisions in an individual case based on what similar offenders have done in the past? Is it acceptable to use characteristics that might be associated with race or socioeconomic status, such as the criminal record of a person’s parents? And even if states can resolve such philosophical questions, there are also practical ones: What to do about unreliable data? Which of the many available tools — some of them licensed by for-profit companies — should policymakers choose?

Even some supporters of risk assessment in bail and parole worry that using the tools for sentencing carries echoes of “Minority Report”: locking people up for crimes they might commit in the future. In a speech to the National Association of Criminal Defense Lawyers last August, then-Attorney General Eric Holder said risk assessment tools can be useful in directing offenders toward rehabilitative programs, allowing them to shorten their prison sentences. But he criticized the use of such tools at the sentencing phase. “By basing sentencing decisions on static factors and immutable characteristics — like the defendant’s education level, socioeconomic background, or neighborhood — they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society,” he said.


Sonja Starr, a University of Michigan law professor who has been a leading opponent of risk assessment, says it isn’t fair. “These instruments aren’t about getting judges to individually analyze life circumstances of a defendant and their particular risk,” she said. “It’s entirely based on statistical generalizations.”

Supporters of the tools counter that judges, parole boards and other decision-makers already make their own risk assessments, whether or not they call them that. The difference is that people aren’t as good as statistics at predicting who is most likely to commit crimes in the future. In the 1960s, before the current burst of research on risk, one common misconception among correctional experts was that people with mental illness were more likely to be repeat, violent offenders. They aren’t, research shows. Formal risk assessments offer greater transparency and, according to numerous studies, greater accuracy than the ad hoc systems they are replacing. Yet in most cases, the tools’ recommendations are only advisory. Judges can — and do — choose to disregard their suggestions for many reasons, including because they prefer exercising professional discretion or because they feel the tool fails to account for an important aspect of the defendant or his or her crime.

Using a questionnaire “doesn’t guarantee a probation officer won’t give a kid a higher risk score because he thinks the kid wears his pants too low,” said Adam Gelb, director of the public safety performance project at the Pew Charitable Trusts. But, he said, risk assessment creates a record of how officials are making decisions. “A supervisor can question, ‘Why are we recommending that this kid with a minor record get locked up?’ Anything that’s on paper is more transparent than the system we had in the past. In many cases, you had no idea from probation officer to probation officer, let alone from judge to judge, what was in people’s heads. There was no transparency, and decisions could be based on just about any bias or prejudice.”


At their most basic, risk assessment tools are all built in essentially the same way: Social scientists look at a large population of former prisoners, examine hundreds of facts about their lives, and then follow the individuals over several years to see which traits are associated with further criminal activity. Criminologists have identified various factors that appear linked to continued criminal activity, such as feeling proud of breaking the law or having marital or substance abuse problems. But from a raw statistical standpoint, three factors are far and away the most predictive: sex, age and prior criminal history.
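As a rough sketch of that process, the code below tallies how often people with a given combination of traits reoffended during a follow-up period. The records are made up, the two traits are picked only for illustration, and real instruments rely on far larger samples and more careful statistical modeling than a simple tally.

```python
from collections import defaultdict

# Made-up follow-up records: (age_group, has_prior_record, reoffended).
# Real studies track far larger populations and many more traits.
RECORDS = [
    ("under_25", True, True),
    ("under_25", True, True),
    ("under_25", False, False),
    ("under_25", False, True),
    ("25_plus", True, True),
    ("25_plus", True, False),
    ("25_plus", False, False),
    ("25_plus", False, False),
]


def reoffense_rates(records):
    """Return the observed reoffense rate for each combination of traits."""
    counts = defaultdict(lambda: [0, 0])  # traits -> [reoffended, total]
    for age_group, has_prior, reoffended in records:
        tally = counts[(age_group, has_prior)]
        tally[0] += int(reoffended)
        tally[1] += 1
    return {traits: hits / total for traits, (hits, total) in counts.items()}


rates = reoffense_rates(RECORDS)
for traits, rate in sorted(rates.items()):
    print(traits, f"{rate:.0%}")
```

The resulting rates describe what happened to groups of past offenders. Applying them to a new defendant assumes that person will behave like the group, which is the generalization critics object to.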

There is little question that well-designed risk assessment tools “work,” in that they predict behavior better than unaided expert opinion. Over the past several decades, dozens of social scientific studies have been published comparing professional predictions of risk to predictions made by statistics. When implemented correctly, whether in the fields of medicine, finance or criminal justice, statistical actuarial tools are accurate at predicting human behavior — about 10 percent more accurate than experts assessing without the assistance of such a tool, according to a 2000 paper by a team of psychologists at the University of Minnesota.

But to critics, just because a trait predicts crime doesn’t mean it’s fair to use it in sentencing decisions. Pennsylvania’s proposed tool will take into account factors like sex and age that are beyond an individual’s control. It will also include a question on where offenders live and, in some cases, penalize residents of urban areas, who are far more likely to be black.

Perhaps most controversially, the Sentencing Commission’s draft assessment tool will factor in an individual’s history of arrests, not just convictions. Even using convictions is potentially problematic; blacks are more likely than whites to be convicted of marijuana possession, for example, even though they use the drug at rates equivalent to whites. But arrests are even more racially skewed than convictions, and public defender groups in Pennsylvania think their use to determine sentencing may be unconstitutional.

Bradley Bridge, an attorney with the Defender Association of Philadelphia, points to differences in policing around the state, which he says can have a dramatic effect on arrests. Heavy policing in some neighborhoods in Philadelphia makes low-income and nonwhite residents more likely to be arrested, whether or not they’ve committed more or worse crimes.

“This is a compounding problem,” Bridge said. “Once they’ve been arrested once, they are more likely to be arrested a second or a third time — not because they’ve necessarily done anything more than anyone else has, but because they’ve been arrested once or twice beforehand.”

Even many people who defend risk assessment in theory say it can be problematic in practice. Official records can contain mistakes. Tools intended for one purpose can be used for another. Many tools include questions that are subjective, requiring that the person filling out the questionnaire characterize the offender’s feelings and attitudes. That process can introduce error.


[Virginia’s Criminal Sentencing] commission has released aggregate figures showing that incarceration and recidivism rates in the state have both fallen since it began using risk assessment in sentencing. But in a long fight with the Daily Press, a newspaper in southeastern Virginia, the commission has refused to release more detailed data that could reveal the policy’s impact on racial and other disparities. Rob Poggenklass, an ACLU attorney, said the lack of data makes it difficult to evaluate the true impact of the state’s risk assessment policy. “It’s sort of tricky to make any big pronouncements about whether it’s working,” he said.

Indeed, it has proved remarkably difficult to evaluate the real-world impact of risk assessment, positive or negative. As in Virginia, states have often released only limited data, and even where they have been more forthcoming, the latest generation of risk assessment tools is still too new for conclusions to be drawn about their long-term effects. Randomized experiments like the one conducted in Philadelphia’s probation system are all but impossible in sentencing and are generally rare in criminal justice. And, as in Pennsylvania, risk assessment tools are often adopted as part of larger criminal justice reforms, making it hard to isolate their effects.

The studies that have tried to overcome such challenges have shown mixed, though generally positive, results. In the juvenile justice system, the nonprofit Annie E. Casey Foundation has encouraged hundreds of jurisdictions to adopt risk assessment tools as part of a broader package of reforms intended to reduce the number of incarcerated youth. Across all its partner sites, Casey reports a 46 percent reduction in the detention of youth of color and a 44 percent reduction in the detention of white youth, although those results cannot be attributed to risk assessment alone.

In the adult system, the results have generally been more modest. Kentucky adopted a risk assessment tool developed by the nonprofit Arnold Foundation for use in its bail decisions. It has released more defendants and re-arrested fewer of them, but the change has been far from dramatic. The percentage of defendants the state released pretrial went up 2 points in the six months after the tool was introduced, Arnold reports, while the rate of new arrests for defendants awaiting trial declined to 8.5 percent from 10 percent. (The Arnold Foundation is a funder of The Marshall Project.)
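To put the Kentucky re-arrest figures in perspective, it helps to separate the change in percentage points from the change relative to the starting rate. The short calculation below uses only the two rates quoted above; the variable names are illustrative.

```python
# Figures from the paragraph above: pretrial re-arrest rate before and after.
before = 10.0  # percent
after = 8.5    # percent

point_change = before - after                  # absolute change, in percentage points
relative_change = point_change / before * 100  # change relative to the starting rate

print(f"{point_change:.1f} percentage points")
print(f"{relative_change:.0f} percent relative reduction")
```

A drop of 1.5 percentage points works out to a relative reduction of about 15 percent, which is why the same result can be described as modest or as meaningful depending on which figure is emphasized.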

Determining other impacts of risk assessment is even harder. Jennifer Skeem, a University of California, Berkeley, psychologist who has written extensively on risk assessment, said there simply isn’t enough data available to say with certainty whether it reduces racial disparities in the justice system. But she said better data alone won’t be enough to resolve the questions the tools raise.

“I’m not convinced that when we do have the evidence, that it’s going to shut down the debate” because there will still be more fundamental questions, she said.

The core questions around risk assessment aren’t about data. They are about what the goals of criminal justice reforms should be. Some supporters see reducing incarceration as the primary goal; others want to focus on reducing recidivism; still others want to eliminate racial disparities. Risk assessments have drawn widespread support in part because, as long as they remain in the realm of the theoretical, they can accomplish all those goals. But once they enter the real world, there are usually trade-offs.

Risk assessment tools can determine that a person like Milton Fosque has a 49 percent chance of committing another crime. What they can’t decide is what to do with that information. Should 49 percent be considered high risk or low? Should Fosque be in prison? On probation? In treatment? Richard Berk, a University of Pennsylvania statistician, said those decisions have to be made by policymakers and the public, not researchers.
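The point that the label depends on a policy choice rather than on the statistics can be shown with a small sketch. The 49 percent estimate comes from the paragraph above; the three thresholds are hypothetical and stand in for choices a jurisdiction might make.

```python
def classify(probability: float, high_risk_threshold: float) -> str:
    """Label an estimated reoffense probability against a chosen threshold."""
    return "high risk" if probability >= high_risk_threshold else "low risk"


estimate = 0.49  # the 49 percent figure from the text

# Hypothetical thresholds: the same estimate gets a different label
# depending on where policymakers decide to draw the line.
for threshold in (0.30, 0.50, 0.70):
    print(f"threshold {threshold:.2f}: {classify(estimate, threshold)}")
```

The estimate never changes, yet the person is "high risk" under the first threshold and "low risk" under the other two.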

“I’m not trying to design interventions that turn bad guys into good guys,” Berk said. “My job is to provide, I hope, better information to inform whatever decisions are being made.”

In Pennsylvania, at least, such policy discussions have drawn little public attention despite the best efforts of the Sentencing Commission, which in addition to publishing its detailed reports has held public hearings across the state. Those hearings drew so few people that Mark Bergstrom, the commission’s executive director, extended the public comment period through the end of the year.

Bergstrom, who has run the commission for nearly two decades, is walking a delicate line. He said he wants to create a tool that accurately predicts behavior while avoiding endless lawsuits. The commission’s research has found that prior arrests are a better predictor of recidivism than prior convictions. But using arrests would almost certainly draw a constitutional challenge from the state’s public defenders. They point to the racial disparities in arrest rates and say it’s illegal to presume someone is guilty just because he was arrested.

Based on the work the commission has done so far, Bergstrom says he’s leaning toward using the tool to identify outliers: low-risk individuals to divert from prison altogether and high-risk individuals to flag for extra time or treatment. That would be a fairly limited approach, but it wouldn’t avoid the central question of whether offenders should spend more time behind bars simply because of how statistical tools say they will behave in the future.



The Primacy of the Public by Marcus Schultz-Bergin is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.