22.5.17

HILLARY CLINTON'S FAT TAILS.

Calm down, I'm not referring to her entourage.


Rather, I'm referring to a post-mortem on the failure of the Smartest Kids in the Room to see the wave election that hit them.
The core of Clinton campaign strategy was their analytics system, developed by dozens of researchers who were led by Clinton’s director of analytics, Elan Kriegel, in close consultation with campaign manager Robby Mook. In the Washington Post, John Wagner wrote, “the algorithm was said to play a role in virtually every strategic decision Clinton aides made, including where and when to deploy the candidate and her battalion of surrogates and where to air television ads—as well as when it was safe to stay dark.” The oracle of the system was “Ada,” a big-data simulator that issued up-to-the-minute probabilities on Clinton’s chances by state and county. Throughout the general election, Ada backed her arguments for a decisive Clinton win in the Electoral College with a ton of stats. But Ada, and all her numbers, turned out to be wrong.
We're not talking about garbage in, garbage out here. Rather, we're talking about a black swan event that ought at least to have been on the analysts' minds.
Ada ran “400,000 simulations a day of what the race against Trump might look like.” This is a very “big data” sort of claim. 400,000 is rather large—no human could look through the results of that many simulations. Ada’s “intelligence” lay in how she boiled down the results of those 400,000 simulations into a campaign strategy. Each of Ada’s electoral simulations was premised on variations in turnout based around expected margins of error—for example, one simulation might posit that Hispanics would break for Clinton 2 or 3 points higher (or lower) than the data predicted. By sampling a representative subset of all possible variations—the so-called Monte Carlo method of quantitative analysis—Ada would produce a set of outcomes. After such simulations, Ada showed that Michigan and Wisconsin went for Trump only a small percentage of the time, compared to Florida and Pennsylvania, which went for Trump a larger percentage of the time.

Yet what must have seemed like a foolproof, detailed prescription for victory based on data and computation was mostly a confirmation of preexisting biases—particularly the campaign’s faith in the firewall. In another election year, those biases might have turned out to be right, and Ada would have been mistakenly vindicated. Here, though, the oracle was revealed to be little more than a parrot. Once the initial analysis showed that Clinton was favored to win in certain states, Ada helped prevent the campaign from questioning her conclusions.
That's the same error the people writing credit default swaps made. Then came one day in which a twenty-five-sigma event (under their priors) occurred, followed by another day with another twenty-five-sigma event. Your Monte Carlo method is only as good as your priors about the variations. Is your world Gaussian, with about two-thirds of the expected events within one standard deviation of the mean and about one-twentieth beyond two? Or might you be in messy reality, with more than two-thirds of the expected events within one standard deviation, and more than one-twentieth of the expected events beyond two standard deviations?
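
To make the point about priors concrete, here is a minimal sketch, not anything Ada actually ran. Every number in it is my own assumption: a hypothetical state where the candidate leads by 8 points, a polling error with a standard deviation of 2 points, and a Student-t distribution with 3 degrees of freedom standing in for "messy reality." Both error models are scaled to the same standard deviation, so the only thing that changes is the shape of the tails.

```python
import math
import random


def gaussian_error(rng, sd):
    """Polling error under a Gaussian prior with standard deviation sd."""
    return rng.gauss(0.0, sd)


def fat_tailed_error(rng, sd, df=3):
    """Polling error under a Student-t prior, rescaled to standard deviation sd.

    A t draw is a standard normal divided by sqrt(chi-square / df).
    The t variance is df / (df - 2), so df must be greater than 2.
    """
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    t = z / math.sqrt(chi2 / df)
    return sd * t / math.sqrt(df / (df - 2.0))


def upset_probability(draw_error, lead, sd, n_sims, seed):
    """Share of simulations in which the error wipes out the lead."""
    rng = random.Random(seed)
    upsets = sum(1 for _ in range(n_sims) if lead + draw_error(rng, sd) < 0)
    return upsets / n_sims


if __name__ == "__main__":
    LEAD, SD, N_SIMS = 8.0, 2.0, 400_000  # hypothetical values, a four-sigma upset
    p_gauss = upset_probability(gaussian_error, LEAD, SD, N_SIMS, seed=1)
    p_fat = upset_probability(fat_tailed_error, LEAD, SD, N_SIMS, seed=1)
    print(f"Gaussian prior:   upset in {p_gauss:.5%} of simulations")
    print(f"Fat-tailed prior: upset in {p_fat:.5%} of simulations")
```

Under these assumptions the Gaussian run should call the state lost only a few times in every 100,000 simulations, while the fat-tailed run should lose it roughly a few times in every 1,000. Same lead, same standard deviation, same 400,000 simulations; the "safe" state is about a hundred times less safe once the tails are allowed to be fat.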

Dizziness due to success did in the hedgies.  Dizziness due to success did in the hillaries.

1 comment:

David Foster said...

The "Ada" system is no doubt named after Ada, Countess of Lovelace, who despite her creativity and genius apparently lost a lot of money by using mathematical techniques for betting on the horses.

I wonder if the developers & users of the system were aware of that aspect of Ada's biography...