The algorithms behind artificial intelligence applied to justice are not as reliable as one might think
Artificial Intelligence (AI) applied to the judicial system, that is, the algorithms that predict a defendant's chances of reoffending and set the bail required to remain free, is not a reliable technique, according to a report prepared by the Partnership on AI.
This is a technology industry consortium focused on establishing best practices for AI systems and on informing and educating the public about a technology that is entering our lives by leaps and bounds.
U.S. attempts to reduce the prison population through computer predictions of criminal behavior rest on a misunderstanding and could reinforce existing prejudices, concluded the artificial intelligence experts who produced the Partnership on AI report.
As a result, algorithm-based tools “should not be used alone to make stop or go decisions.”
Although such programs are so far deployed only in some states in the United States, UK police forces also have a strong interest in computerized risk assessment tools based on artificial intelligence.
Although the final objective is positive, namely eliminating prejudice from assessments of whether a defendant will abscond while free pending trial, it is nevertheless questionable whether algorithms are the right instrument for the job.
“It is a serious misunderstanding to think that tools are objective or neutral simply because they are data-driven,” the report says.
“While statistical formulas and models provide some degree of consistency and replicability, they still share or amplify many weaknesses of human decision making.”
The “ecological fallacy”
A basic problem is the “ecological fallacy,” also known as the “fallacy of division,” a misinterpretation of statistical data that occurs when data collected about a group is applied to predict the behavior of a specific individual: what we commonly know as stereotypes affecting race, religion, class, and so on.
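As a purely illustrative sketch with invented data, the snippet below shows how a pattern that holds across group averages can be the opposite of the pattern inside each group, which is why applying group-level statistics to an individual can mislead.

```python
# Minimal synthetic illustration of the ecological fallacy.
# All numbers are invented for demonstration purposes only.
import numpy as np

rng = np.random.default_rng(0)

groups = []
for group_mean in (0.0, 5.0, 10.0):
    x = group_mean + rng.normal(size=500)
    # Inside each group, the individual-level relationship is NEGATIVE.
    y = group_mean - 0.8 * (x - group_mean) + rng.normal(size=500)
    groups.append((x, y))

# Correlation between the three GROUP AVERAGES: strongly positive.
mean_x = [x.mean() for x, _ in groups]
mean_y = [y.mean() for _, y in groups]
between = np.corrcoef(mean_x, mean_y)[0, 1]

# Correlation among INDIVIDUALS within a single group: negative.
within = np.corrcoef(groups[0][0], groups[0][1])[0, 1]

print(f"correlation of group averages: {between:+.2f}")  # close to +1
print(f"correlation within one group:  {within:+.2f}")   # around -0.6
```

A conclusion drawn from the aggregate pattern would be exactly wrong for any individual inside a group, which is the risk the report flags when group data is used to judge a single defendant.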
The report also establishes a series of requirements that jurisdictions must evaluate before adopting an algorithmic decision-making tool, including the application of methods to measure and mitigate errors.
However, even if all the requirements are met, experts still question whether it is acceptable to make decisions about a person’s release based on data about other individuals.
“Using risk assessment tools to make fair decisions about human freedom would require resolving profound ethical, technical, and statistical challenges, including ensuring tools are designed and built to mitigate errors and find appropriate protocols.”
Currently, available tools must first meet the basic requirements recommended by the report and should not be used otherwise.
The report’s requirements “represent a minimum standard for policy developers and makers trying to align their risk assessment tools, and how they are used in practice, with well-founded policy goals.”
The organization that authored the study, the Partnership on AI, is made up of more than 80 North American groups and companies, including Accenture, the ACLU, Harvard University's Berkman Klein Center, Google, IBM, Samsung, and Sony, among others.
“Minority Report” come true
In “Minority Report”, the film directed by Steven Spielberg and based on a science fiction story by Philip K. Dick, crime is almost completely eliminated thanks to the visions of three mutants who predict the crimes that are going to be committed, which allows criminals to be arrested before committing a crime.
But in reality, letting a machine decide on a human life always gives rise to controversy.
Researchers and civil rights advocates have questioned the reliability of these systems, fearing that their results are unfair.
There is a paradigmatic case, that of a Wisconsin man, Eric L. Loomis, who was sentenced to six years in prison based in part on a report generated by the secret algorithm of a private software product, which the defense could not inspect or question.
It was the COMPAS program, sold by Northpointe Inc., which flagged Loomis as “a high risk of violence, a high risk of recidivism, a high risk before trial,” according to the prosecutor, conclusions with which the judge agreed.
But it didn’t take long for dissenting voices to rise up, uncomfortable with the use of a secret algorithm that had sent a man to jail.
A report then came to light from a team of researchers led by Julia Dressel, a computer science student at Dartmouth College, who, after analyzing the validity of the COMPAS results, found that the algorithm it uses is no more reliable than any untrained human.
And there were also suspicions that the system favored whites over blacks.
Dressel relied on the ProPublica database, a collection of COMPAS scores for 10,000 defendants awaiting trial in Broward County, Florida, as well as their arrest records for the next two years.
The researcher randomly selected 1,000 of the defendants and recorded seven pieces of information on each, including their age, gender, and the number of previous arrests.
She then recruited 400 untrained volunteers of varying ages and educational levels, each of whom was given profiles of 50 defendants and asked to predict whether they would be re-arrested within two years, the same standard used by COMPAS.
Humans got it right almost as often as the algorithm, between 63% and 67% of the time, compared to 65% for COMPAS, according to the journal “Science Advances.”
It turns out that humans score about as well as the algorithm when they correctly predict someone's re-arrest, but when they are wrong, they reveal a similar racial bias. Both humans and the machine incorrectly predicted more re-arrests of black defendants than actually occurred (false positives) and fewer re-arrests of white defendants than actually occurred (false negatives).
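To see what such a comparison involves, the sketch below computes overall accuracy together with false-positive and false-negative rates per racial group, the metrics at the heart of this bias finding. The file name and column names are hypothetical, not the actual ProPublica schema.

```python
# Sketch: compare accuracy and per-group error rates of re-arrest
# predictions. Input file and columns are assumed for illustration.
import pandas as pd

df = pd.read_csv("broward_predictions.csv")  # hypothetical file
# expected columns: 'predicted'  (1 = re-arrest predicted),
#                   'rearrested' (1 = re-arrested within two years),
#                   'race'

def rates(group: pd.DataFrame) -> pd.Series:
    tp = ((group.predicted == 1) & (group.rearrested == 1)).sum()
    fp = ((group.predicted == 1) & (group.rearrested == 0)).sum()
    tn = ((group.predicted == 0) & (group.rearrested == 0)).sum()
    fn = ((group.predicted == 0) & (group.rearrested == 1)).sum()
    return pd.Series({
        "accuracy": (tp + tn) / len(group),
        "false_positive_rate": fp / (fp + tn),  # wrongly flagged as risky
        "false_negative_rate": fn / (fn + tp),  # wrongly cleared
    })

# Similar overall accuracy can hide very different error rates per group.
print(df.groupby("race").apply(rates))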
It is difficult to mount a defense against these algorithms without access to them, since the company that markets COMPAS says its formula is a trade secret.
This and other products with similar algorithms are used to set bail and even assist in determinations of guilt or innocence, but the inner workings of these tools are inaccessible to the defendant, attorneys, prosecutors, and judges.
Although Loomis challenged his conviction because he was not allowed to evaluate the algorithm, the State Supreme Court ruled against him, reasoning that knowledge of the algorithm’s results was a sufficient level of transparency.
But jurists and experts believe that without adequate guarantees, these tools run the risk of eroding the rule of law and diminishing individual rights.
Echoing Kranzberg’s first law of technology, these algorithms are neither good nor bad, but they are certainly not neutral.
Experts warn that accepting AI into the legal system without a prior implementation and evaluation plan is blindly allowing the technology to advance.
For this reason, the most appropriate step would be a moratorium on the use of opaque AI in criminal justice risk assessment, at least until there are processes and procedures in place that allow a meaningful examination of these tools and prevent faulty algorithms from bypassing the controls of the criminal justice system.
Starting by ensuring that the data on which the algorithms are trained is representative of the entire population, it is important that the government establish regulatory standards to guarantee the impartiality and transparency of the algorithms, through the periodic publication of information that can be audited by organizations such as civil rights groups.
The European predictions
Meanwhile, Europe has not gone that far, but a group of scientists from the University of Sheffield (United Kingdom) and the University of Pennsylvania (United States) has developed a system that predicts the outcome of judicial decisions with considerable accuracy: 79% of European Court of Human Rights (ECHR) verdicts were correctly anticipated through text analysis using a machine learning algorithm.
In developing this system, the team of researchers, led by Dr. Aletras, found that the judgments of the European Court of Human Rights correlate more strongly with non-legal facts than with strictly legal arguments.
This means that they are more realistic than formalistic, with results that are consistent with those of other studies carried out in high-level courts, such as the Supreme Court of the United States.
The investigation applied its machine learning algorithm to 584 cases related to Articles 3, 6, and 8 of the European Convention on Human Rights, searching for patterns in the text.
They found that the most reliable factors in predicting the court’s decision are the language used and the issues and circumstances mentioned in the case.
With this information, the system was able to achieve an accuracy of 79%.
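As an illustration only, the sketch below shows the kind of text-classification pipeline such a study describes: n-gram features extracted from the case text feeding a linear classifier evaluated by cross-validation. The file name, column names, and model choice are assumptions, not the researchers' actual code.

```python
# Sketch of an ECHR outcome classifier: n-gram text features plus a
# linear classifier, scored by cross-validation. Dataset and columns
# are hypothetical placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# hypothetical dataset: one row per ECHR case
# 'text'      = sections of the judgment describing facts and circumstances
# 'violation' = 1 if the court found a violation, 0 otherwise
cases = pd.read_csv("echr_cases.csv")

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3), min_df=2),  # word n-grams from the text
    LinearSVC(C=1.0),                               # linear classifier
)

# 10-fold cross-validated accuracy; the study reports roughly 79%.
scores = cross_val_score(model, cases["text"], cases["violation"], cv=10)
print(f"mean accuracy: {scores.mean():.2f}")
```

The design mirrors the article's point: the model never reads legal doctrine, it only picks up the language, issues, and circumstances mentioned in the case text.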