Spain has become reliant on an algorithm to score how likely a domestic violence victim may be abused again and what protection to provide — sometimes leading to fatal consequences.
The article mentions that one woman (Stefany González Escarraman) went for a restraining order the day after the system deemed her at “low risk” and the judge denied it referring to the VioGen score.
One was Stefany González Escarraman, a 26-year-old living near Seville. In 2016, she went to the police after her husband punched her in the face and choked her. He threw objects at her, including a kitchen ladle that hit their 3-year-old child. After police interviewed Ms. Escarraman for about five hours, VioGén determined she had a negligible risk of being abused again.
The next day, Ms. Escarraman, who had a swollen black eye, went to court for a restraining order against her husband. Judges can serve as a check on the VioGén system, with the ability to intervene in cases and provide protective measures. In Ms. Escarraman’s case, the judge denied a restraining order, citing VioGén’s risk score and her husband’s lack of criminal history.
About a month later, Ms. Escarraman was stabbed by her husband multiple times in the heart in front of their children.
It also says:
Spanish police are trained to overrule VioGén’s recommendations depending on the evidence, but accept the risk scores about 95 percent of the time, officials said. Judges can also use the results when considering requests for restraining orders and other protective measures.
You could argue that the problem isn’t so much the algorithm itself as it is the level of reliance upon it. The algorithm isn’t unproblematic though. The fact that it just spits out a simple score: “negligible”, “low”, “medium”, “high”, “extreme” is, IMO, an indicator that someone’s trying to conflate far too many factors into a single dimension. I have a really hard time believing that anyone knowledgeable in criminal psychology and/or domestic abuse would agree that 35 yes or no questions would be anywhere near sufficient to evaluate the risk of repeated abuse. (I know nothing about domestic abuse or criminal psychology, so I could be completely wrong.)
Apart from that, I also find this highly problematic:
[The] victims interviewed by The Times rarely knew about the role the algorithm played in their cases. The government also has not released comprehensive data about the system’s effectiveness and has refused to make the algorithm available for outside audit.
The article mentions that one woman (Stefany González Escarraman) went for a restraining order the day after the system deemed her at “low risk” and the judge denied it referring to the VioGen score.
The judge should be in jail for that and If the judge thinks the “system” can do his job then he should quit as he is clearly useless.
i could say a lot in response to your comment about the benefits and shortcomings of algorithms (or put another way, screening tools or assessments), but i’m tired.
i am exceedingly troubled that something which is commonly regarded as indicating very high risk when working with victims of domestic violence was ignored in the cited case (disclaimer - i haven’t read the article). if the algorithm fails to consider history of strangulation, it’s garbage. if the user of the algorithm did not include that information (and it was disclosed to them), or keyed it incorrectly, they made an egregious error or omission.
i suppose, without getting into it, i would add - 35 questions (ie established statistical risk factors) is a good amount. large categories are fine. no screening tool is totally accurate, because we can’t predict the future or have total and complete understanding of complex situations. tools are only useful to people trained to use them and with accurate data and inputs. screening tools and algorithms must find a balance between accurate capture and avoiding false positives.
The article mentions that one woman (Stefany González Escarraman) went for a restraining order the day after the system deemed her at “low risk” and the judge denied it referring to the VioGen score.
It also says:
You could argue that the problem isn’t so much the algorithm itself as it is the level of reliance upon it. The algorithm isn’t unproblematic though. The fact that it just spits out a simple score: “negligible”, “low”, “medium”, “high”, “extreme” is, IMO, an indicator that someone’s trying to conflate far too many factors into a single dimension. I have a really hard time believing that anyone knowledgeable in criminal psychology and/or domestic abuse would agree that 35 yes or no questions would be anywhere near sufficient to evaluate the risk of repeated abuse. (I know nothing about domestic abuse or criminal psychology, so I could be completely wrong.)
Apart from that, I also find this highly problematic:
The judge should be in jail for that and If the judge thinks the “system” can do his job then he should quit as he is clearly useless.
From those quotes looks like Idiocracy.
i could say a lot in response to your comment about the benefits and shortcomings of algorithms (or put another way, screening tools or assessments), but i’m tired.
i will just point out this, for anyone reading.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2573025/
i am exceedingly troubled that something which is commonly regarded as indicating very high risk when working with victims of domestic violence was ignored in the cited case (disclaimer - i haven’t read the article). if the algorithm fails to consider history of strangulation, it’s garbage. if the user of the algorithm did not include that information (and it was disclosed to them), or keyed it incorrectly, they made an egregious error or omission.
i suppose, without getting into it, i would add - 35 questions (ie established statistical risk factors) is a good amount. large categories are fine. no screening tool is totally accurate, because we can’t predict the future or have total and complete understanding of complex situations. tools are only useful to people trained to use them and with accurate data and inputs. screening tools and algorithms must find a balance between accurate capture and avoiding false positives.