Weapons of Math Destruction, Cathy O'Neil
“The first question: Even if the participant is aware of being modeled, or what the model is used for, is the model opaque, or even invisible?... A key component of this suffering is the pernicious feedback loop. As we’ve seen, sentencing models that profile a person by his or her circumstances help to create the environment that justifies their assumptions. This destructive loop goes round and round, and in the process the model becomes more and more unfair. The third question is whether a model has the capacity to grow exponentially. As a statistician would put it, can it scale? This might sound like the nerdy quibble of a mathematician. But scale is what turns WMDs from local nuisances into tsunami forces, ones that define and delimit our lives. As we’ll see, the developing WMDs in human resources, health, and banking, just to name a few, are quickly establishing broad norms that exert upon us something very close to the power of law....
So to sum up, these are the three elements of a WMD: Opacity, Scale, and Damage”
“Shell Shocked: My Journey of Disillusionment
My challenge was to design an algorithm that would distinguish window shoppers from buyers. There were a few obvious signals. Were they logged into the service? Had they bought there before? But I also scoured for other hints. What time of day was it, and what day of the year? Certain weeks are hot for buyers. The Memorial Day “bump,” for example, occurs in mid-spring, when large numbers of people make summer plans almost in unison. My algorithm would place a higher value on shoppers during these periods, since they were more likely to buy. The statistical work, as it turned out, was highly transferable from the hedge fund to e-commerce—the biggest difference was that, rather than the movement of markets, I was now predicting people’s clicks. In fact, I saw all kinds of parallels between finance and Big Data. Both industries gobble up the same pool of talent, much of it from elite universities like MIT, Princeton, or Stanford. These new hires are ravenous for success and have been focused on external metrics—like SAT scores and college admissions—their entire lives. Whether in finance or tech, the message they’ve received is that they will be rich, that they will run the world. Their productivity indicates that they’re on the right track, and it translates into dollars. This leads to the fallacious conclusion that whatever they’re doing to bring in more money is good. It “adds value.” Otherwise, why would the market reward it? In both cultures, wealth is no longer a means to get by. It becomes directly tied to personal worth. A young suburbanite with every advantage—the prep school education, the exhaustive coaching for college admissions tests, the overseas semester in Paris or Shanghai—still flatters himself that it is his skill, hard work, and prodigious problem-solving abilities that have lifted him into a world of privilege. Money vindicates all doubts. And the rest of his circle plays along, forming a mutual admiration society. 
They’re eager to convince us all that Darwinism is at work, when it looks very much to the outside like a combination of gaming a system and dumb luck. In both of these industries, the real world, with all of its messiness, sits apart. The inclination is to replace people with data trails, turning them into more effective shoppers, voters, or workers to optimize some objective. This is easy to do, and to justify, when success comes back as an anonymous score and when the people affected remain every bit as abstract as the numbers dancing across the screen. I was already blogging as I worked in data science, and I was also getting more involved with the Occupy movement. More and more, I worried about the separation between technical models and real people, and about the moral repercussions of that separation. In fact, I saw the same pattern emerging that I’d witnessed in finance: a false sense of security was leading to widespread use of imperfect models, self-serving definitions of success, and growing feedback loops. Those who objected were regarded as nostalgic Luddites. I wondered what the analogue to the credit crisis might be in Big Data. Instead of a bust, I saw a growing dystopia, with inequality rising. The algorithms would make sure that those deemed losers would remain that way. A lucky minority would gain ever more control over the data economy, raking in outrageous fortunes and convincing themselves all the while that they deserved it. After a couple of years working and learning in the Big Data space, my journey to disillusionment was more or less complete, and the misuse of mathematics was accelerating. In spite of blogging almost daily, I could barely keep up with all the ways I was hearing of people being manipulated, controlled, and intimidated by algorithms. It started with teachers I knew struggling under the yoke of the value-added model, but it didn’t end there. Truly alarmed, I quit my job to investigate the issue in earnest.”
On perverse incentives caused by WMDs.
“Students in the Chinese city of Zhongxiang had a reputation for acing the national standardized test, or gaokao, and winning places in China’s top universities. They did so well, in fact, that authorities began to suspect they were cheating. Suspicions grew in 2012, according to a report in Britain’s Telegraph, when provincial authorities found ninety-nine identical copies of a single test. The next year, as students in Zhongxiang arrived to take the exam, they were dismayed to be funneled through metal detectors and forced to relinquish their mobile phones. Some surrendered tiny transmitters disguised as pencil erasers. Once inside, the students found themselves accompanied by fifty-four investigators from different school districts. A few of these investigators crossed the street to a hotel, where they found groups positioned to communicate with the students through their transmitters. The response to this crackdown on cheating was volcanic. Some two thousand stone-throwing protesters gathered in the street outside the school. They chanted, “We want fairness. There is no fairness if you don’t let us cheat.” It sounds like a joke, but they were absolutely serious. The stakes for the students were sky high. As they saw it, they faced a chance either to pursue an elite education and a prosperous career or to stay stuck in their provincial city, a relative backwater. And whether or not it was the case, they had the perception that others were cheating. So preventing the students in Zhongxiang from cheating was unfair. In a system in which cheating is the norm, following the rules amounts to a handicap...
Each college’s admissions model is derived, at least in part, from the U.S. News model, and each one is a mini-WMD. These models lead students and their parents to run in frantic circles and spend obscene amounts of money. And they’re opaque. This leaves most of the participants (or victims) in the dark. But it creates a big business for consultants, like Steven Ma, who manage to learn their secrets, either by cultivating sources at the universities or by reverse-engineering their algorithms. The victims, of course, are the vast majority of Americans, the poor and middle-class families who don’t have thousands of dollars to spend on courses and consultants. They miss out on precious insider knowledge. The result is an education system that favors the privileged. It tilts against needy students, locking out the great majority of them—and pushing them down a path toward poverty. It deepens the social divide. But even those who claw their way into a top college lose out. If you think about it, the college admissions game, while lucrative for some, has virtually no educational value. The complex and fraught production simply re-sorts and reranks the very same pool of eighteen-year-old kids in newfangled ways. They don’t master important skills by jumping through many more hoops or writing meticulously targeted college essays under the watchful eye of professional tutors. Others scrounge online for cut-rate versions of those tutors. All of them, from the rich to the working class, are simply being trained to fit into an enormous machine—to satisfy a WMD. And at the end of the ordeal, many of them will be saddled with debt that will take decades to pay off. They’re pawns in an arms race, and it’s a particularly nasty one.”
On opaque ranking systems that boil universities down to ordinal rankings without explicitly describing the variables used to compare them.
“Perhaps it was just as well that the Obama administration failed to come up with a rejiggered ranking system. The pushback by college presidents was fierce. After all, they had spent decades optimizing themselves to satisfy the U.S. News WMD. A new formula based on graduation rates, class size, alumni employment and income, and other metrics could wreak havoc with their ranking and reputation. No doubt they also made good points about the vulnerabilities of any new model and the new feedback loops it would generate. So the government capitulated. And the result might be better. Instead of a ranking, the Education Department released loads of data on a website. The result is that students can ask their own questions about the things that matter to them—including class size, graduation rates, and the average debt held by graduating students. They don’t need to know anything about statistics or the weighting of variables. The software itself, much like an online travel site, creates individual models for each person. Think of it: transparent, controlled by the user, and personal. You might call it the opposite of a WMD.”
“Defenders of the tests note that they feature lots of questions and that no single answer can disqualify an applicant. Certain patterns of answers, however, can and do disqualify them. And we do not know what those patterns are. We’re not told what the tests are looking for. The process is entirely opaque. What’s worse, after the model is calibrated by technical experts, it receives precious little feedback. Again, sports provide a good contrast here. Most professional basketball teams employ data geeks, who run models that analyze players by a series of metrics, including foot speed, vertical leap, free-throw percentage, and a host of other variables. When the draft comes, the Los Angeles Lakers might pass on a hotshot point guard from Duke because his assist statistics are low. Point guards have to be good passers. Yet in the following season they’re dismayed to see that the rejected player goes on to win Rookie of the Year for the Utah Jazz and leads the league in assists. In such a case, the Lakers can return to their model to see what they got wrong. Maybe his college team was relying on him to score, which punished his assist numbers. Or perhaps he learned something important about passing in Utah. Whatever the case, they can work to improve their model. Now imagine that Kyle Behm, after getting red-lighted at Kroger, goes on to land a job at McDonald’s. He turns into a stellar employee. He’s managing the kitchen within four months and the entire franchise a year later. Will anyone at Kroger go back to the personality test and investigate how they could have gotten it so wrong? Not a chance, I’d say. The difference is this: Basketball teams are managing individuals, each one potentially worth millions of dollars. Their analytics engines are crucial to their competitive advantage, and they are hungry for data. Without constant feedback, their systems grow outdated and dumb. The companies hiring minimum-wage workers, by contrast, are managing herds. 
They slash expenses by replacing human resources professionals with machines, and those machines filter large populations into more manageable groups. Unless something goes haywire in the workforce—an outbreak of kleptomania, say, or plummeting productivity—the company has little reason to tweak the filtering model. It’s doing its job—even if it misses out on potential stars. The company may be satisfied with the status quo, but the victims of its automatic systems suffer. And as you might expect, I consider personality tests in hiring departments to be WMDs. They check all the boxes. First, they are in widespread use and have enormous impact. The Kronos exam, with all of its flaws, is scaled across much of the hiring economy. Under the previous status quo, employers no doubt had biases. But those biases varied from company to company, which might have cracked open a door somewhere for people like Kyle Behm. That’s increasingly untrue. And Kyle was, in some sense, lucky. Job candidates, especially those applying for minimum-wage work, get rejected all the time and rarely find out why. It was just chance that Kyle’s friend happened to hear about the reason for his rejection and told him about it. Even then, the case against the big Kronos users would likely have gone nowhere if Kyle’s father hadn’t been a lawyer, one with enough time and money to mount a broad legal challenge. This is rarely the case for low-level job applicants. Finally, consider the feedback loop that the Kronos personality test engenders. Red-lighting people with certain mental health issues prevents them from having a normal job and leading a normal life, further isolating them. This is exactly what the Americans with Disabilities Act is supposed to prevent.
The majority of job applicants, thankfully, are not blackballed by automatic systems. But they still face the challenge of moving their application to the top of the pile and landing an interview...The hiring market, clearly, was still poisoned by prejudice...As you might expect, human resources departments rely on automatic systems to winnow down piles of résumés. In fact, some 72 percent of résumés are never seen by human eyes. Computer programs flip through them, pulling out the skills and experiences that the employer is looking for. Then they score each résumé as a match for the job opening. It’s up to the people in the human resources department to decide where the cutoff is, but the more candidates they can eliminate with this first screening, the fewer human-hours they’ll have to spend processing the top matches. So job applicants must craft their résumés with that automatic reader in mind. It’s important, for example, to sprinkle the résumé liberally with words the specific job opening is looking for. This could include positions (sales manager, chief financial officer, software architect), languages (Mandarin, Java), or honors (summa cum laude, Eagle Scout). Those with the latest information learn what machines appreciate and what tangles them up... The result of these programs, much as with college admissions, is that those with the money and resources to prepare their résumés come out on top. Those who don’t take these steps may never know that they’re sending their résumés into a black hole. It’s one more example in which the wealthy and informed get the edge and the poor are more likely to lose out.”
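A note on the screening step described in the last passage: the keyword-match-then-cutoff procedure can be sketched in a few lines. This is only an illustrative toy, not how any real applicant-tracking product works; the function names `score_resume` and `screen`, the sample résumés, and the 0.5 cutoff are all assumptions made up for the example. It shows why an applicant who echoes the posting's exact words passes while an equally qualified one who paraphrases does not.

```python
import re

def score_resume(resume_text, keywords):
    """Return the fraction of job keywords found in the résumé (case-insensitive).

    Hypothetical scoring rule: a plain whole-phrase match, nothing smarter.
    """
    text = resume_text.lower()
    hits = [
        kw for kw in keywords
        if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text)
    ]
    return len(hits) / len(keywords) if keywords else 0.0

def screen(resumes, keywords, cutoff):
    """Keep only the résumés whose score is at or above the cutoff."""
    scored = {name: score_resume(text, keywords) for name, text in resumes.items()}
    return {name: s for name, s in scored.items() if s >= cutoff}

# Keywords of the kind the passage mentions (positions, languages).
job_keywords = ["sales manager", "Mandarin", "Java"]

# Made-up sample résumés: one echoes the posting, one paraphrases it.
resumes = {
    "tailored": "Sales manager fluent in Mandarin with Java experience.",
    "untailored": "Led a regional team selling software; speaks Chinese.",
}

shortlist = screen(resumes, job_keywords, cutoff=0.5)
print(shortlist)
```

In this sketch the untailored résumé never reaches a human reader, and nothing tells its author why, which is the "black hole" the quote describes.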