Based on the opening, I’m expecting some great examples many which are going to be as heavily biased as things like redlining seen in lending practices in the last century. They’ll come about as the result of missing data, missing assumptions, and even incorrect assumptions.
I’m aware that one of the biggest problems in so-called Big Data is that one needs to spend an inordinate amount of time cleaning up the data (often by hand) to get something even remotely usable. Even with this done I’ve heard about people not testing out their data and then relying on the results only to later find ridiculous error rates (sometimes over 100%!)
Of course there is some space here for the intelligent mathematician, scientist, or quant to create alternate models to take advantage of overlays in such areas, and particularly markets. By overlay here, I mean the gambling definition of the word in which the odds of a particular wager are higher than they should be, thus tending to favor an individual player (who typically has more knowledge or information about the game) rather than the house, which usually relies on a statistically biased game or by taking a rake off of the top of a parimutuel financial structure, or the bulk of other players who aren’t aware of the inequity. The mathematical models based on big data (aka Weapons of Math Destruction or WMDs) described here, particularly in financial markets, are going to often create such large inequities that users of alternate means can take tremendous advantage of the differences for their own benefits. Perhaps it’s the evolutionary competition that will more actively drive these differences to zero? If this is the case, it’s likely that it’s going to be a long time before they equilibrate based on current usage, especially when these algorithms are so opaque.
I suspect that some of this book will highlight uses of statistical errors and logical fallacies like cherry picking data, but which are hidden behind much more opaque mathematical algorithms thereby making them even harder to detect than simple policy decisions which use the simpler form. It’s this type of opacity that has caused major market shifts like the 2008 economic crash, which is still heavily unregulated to protect the masses.
I suspect that folks within Bryan Alexander’s book club will find that the example of Sarah Wysocki to be very compelling and damning evidence of how these big data algorithms work (or don’t work, as the case may be.) In this particular example, there are so many signals which are not only difficult to measure, if at all, that the thing they’re attempting to measure is so swamped with noise as to be unusable. Equally interesting, but not presented here, would be the alternate case of someone tremendously incompetent (perhaps who is cheating as indicated in the example) who actually scored tremendously high on the scale who was kept in their job.
Highlights, Quotes, & Marginalia
Do you see the paradox? An algorithm processes a slew of statistics and comes up with a probability that a certain person might be a bad hire, a risky borrower, a terrorist, or a miserable teacher. That probability is distilled into a score, which can turn someone’s life upside down. And yet when the person fights back, “suggestive” countervailing evidence simply won’t cut it. The case must be ironclad. The human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.
Highlight (yellow) – Introduction > Location xxxx
Added on Sunday, October 9, 2017
[WMDs are] opaque, unquestioned, and unaccountable, and they operate at a scale to sort, target or “optimize” millions of people. By confusing their findings with on-the-ground reality, most of them create pernicious WMD feedback loops.
Highlight (yellow) – Introduction > Location xxxx
Added on Sunday, October 9, 2017
The software is doing it’s job. The trouble is that profits end up serving as a stand-in, or proxy, for truth. We’ll see this dangerous confusion crop up again and again.
Highlight (yellow) – Introduction > Location xxxx
Added on Sunday, October 9, 2017
I’m reading this as part of Bryan Alexander’s online book club.
Chris, this is excellent. Your data science background gives you a solid professional background to apply to Weapons. I’m looking forward to learning more.