The Peculiar Blindness of Experts
Credentialed authorities are comically bad at predicting
the future. But reliable forecasting is possible.
DAVID EPSTEIN
JUNE 2019 ISSUE
The bet was on, and it was over the fate of humanity. On
one side was the Stanford biologist Paul R. Ehrlich. In his 1968 best seller,
The Population Bomb, Ehrlich insisted that it was too late to prevent a
doomsday apocalypse resulting from overpopulation. Resource shortages would
cause hundreds of millions of starvation deaths within a decade. It was cold,
hard math: The human population was growing exponentially; the food supply was
not. Ehrlich was an accomplished butterfly specialist. He knew that nature did
not regulate animal populations delicately. Populations exploded, blowing past
the available resources, and then crashed.
In his book, Ehrlich played out hypothetical scenarios
that represented “the kinds of disasters that will occur.” In the worst-case
scenario, famine rages across the planet. Russia, China, and the United States
are dragged into nuclear war, and the resulting environmental degradation soon
extinguishes the human race. In the “cheerful” scenario, population controls
begin. Famine spreads, and countries teeter, but the major death wave ends in
the mid-1980s. Only half a billion or so people die of starvation. “I challenge
you to create one more optimistic,” Ehrlich wrote, adding that he would not
count scenarios involving benevolent aliens bearing care packages.
The economist Julian Simon took up Ehrlich’s challenge.
Technology—water-control techniques, hybridized seeds, management
strategies—had revolutionized agriculture, and global crop yields were
increasing. To Simon, more people meant more good ideas about how to achieve a
sustainable future. So he proposed a wager. Ehrlich could choose five metals
that he expected to become more expensive as resources were depleted and chaos
ensued over the next decade. Both men agreed that commodity prices were a fine
proxy for the effects of population growth, and they set the stakes at $1,000
worth of Ehrlich’s five metals. If, 10 years hence, prices had gone down,
Ehrlich would have to pay the difference in value to Simon. If prices went up,
Simon would be on the hook for the difference. The bet was made official in
1980.
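The settlement arithmetic is simple to picture. A minimal sketch in Python, assuming the stake was split evenly across the five metals (the basket was chromium, copper, nickel, tin, and tungsten; the prices below are invented placeholders, not the historical figures):

```python
# Illustrative sketch of the wager's settlement arithmetic.
# The prices below are invented placeholders, not the historical figures.

STAKE = 1000.0  # total stake, split evenly across five metals

# (price in 1980, price in 1990) per unit -- hypothetical values
prices = {
    "chromium": (3.90, 3.70),
    "copper":   (1.00, 0.90),
    "nickel":   (3.20, 2.80),
    "tin":      (8.70, 3.90),
    "tungsten": (14.00, 9.00),
}

share = STAKE / len(prices)  # $200 of each metal at 1980 prices

# What each $200 tranche is worth in 1990, scaled by the price change
value_1990 = sum(share * (end / start) for start, end in prices.values())

settlement = STAKE - value_1990
if settlement > 0:
    print(f"Ehrlich pays Simon ${settlement:.2f}")  # prices fell
else:
    print(f"Simon pays Ehrlich ${-settlement:.2f}")  # prices rose
```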
In October 1990, Simon found a check for $576.07 in his
mailbox. Ehrlich got smoked. The price of every one of the metals had declined.
In the 1960s, 50 out of every 100,000 global citizens died annually from
famine; by the 1990s, that number was 2.6.
Ehrlich’s starvation predictions were almost comically
bad. And yet, the very same year he conceded the bet, Ehrlich doubled down in
another book, with another prediction that would prove untrue: Sure, his
timeline had been a little off, he wrote, but “now the population bomb has
detonated.” Despite one erroneous prediction after another, Ehrlich amassed an
enormous following and received prestigious awards. Simon, meanwhile, became a
standard-bearer for scholars who felt that Ehrlich had ignored economic
principles. The kind of excessive regulations Ehrlich advocated, the Simon camp
argued, would quell the very innovation that had delivered humanity from
catastrophe. Both men became luminaries in their respective domains. Both were
mistaken.
When economists later examined metal prices for every
10-year window from 1900 to 2008, during which time the world population
quadrupled, they saw that Ehrlich would have won the bet 62 percent of the
time. The catch: Commodity prices are a poor gauge of population effects, particularly
over a single decade. The variable that both men were certain would vindicate
their worldviews actually had little to do with those views. Prices waxed and
waned with macroeconomic cycles.
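The economists' exercise is easy to replicate in outline: slide a 10-year window along a price series and count the windows in which prices ended higher. A minimal sketch, using a synthetic random-walk series in place of the study's real commodity data:

```python
import random

random.seed(42)

# Synthetic yearly basket prices for 1900-2008 (a placeholder random
# walk; the actual study used real commodity-price data)
years = range(1900, 2009)
prices = [100.0]
for _ in years:
    prices.append(prices[-1] * random.uniform(0.90, 1.12))

WINDOW = 10
wins = 0
windows = 0
for start in range(len(prices) - WINDOW):
    windows += 1
    # Ehrlich "wins" a window if the price is higher 10 years later
    if prices[start + WINDOW] > prices[start]:
        wins += 1

print(f"Ehrlich wins {wins}/{windows} windows ({100 * wins / windows:.0f}%)")
```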
Yet both men dug in. Each declared his faith in science
and the undisputed primacy of facts. And each continued to miss the value of
the other’s ideas. Ehrlich was wrong about the apocalypse, but right on aspects
of environmental degradation. Simon was right about the influence of human
ingenuity on food and energy supplies, but wrong in claiming that improvements
in air and water quality validated his theories. Ironically, those improvements
were bolstered through regulations pressed by Ehrlich and others.
Ideally, intellectual sparring partners “hone each
other’s arguments so that they are sharper and better,” the Yale historian Paul
Sabin wrote in The Bet. “The opposite happened with Paul Ehrlich and Julian
Simon.” As each man amassed more information for his own view, each became more
dogmatic, and the inadequacies in his model of the world grew ever more stark.
The pattern is by now familiar. In the 30 years since
Ehrlich sent Simon a check, the track record of expert forecasters—in science,
in economics, in politics—is as dismal as ever. In business, esteemed (and
lavishly compensated) forecasters routinely are wildly wrong in their
predictions of everything from the next stock-market correction to the next
housing boom. Reliable insight into the future is possible, however. It just
requires a style of thinking that’s uncommon among experts who are certain that
their deep knowledge has granted them a special grasp of what is to come.
The idea for the most important study ever conducted of
expert predictions was sparked in 1984, at a meeting of a National Research
Council committee on American-Soviet relations. The psychologist and political
scientist Philip E. Tetlock was 30 years old, by far the most junior committee
member. He listened intently as other members discussed Soviet intentions and
American policies. Renowned experts delivered authoritative predictions, and
Tetlock was struck by how many perfectly contradicted one another and were
impervious to counterarguments.
Tetlock decided to put expert political and economic
predictions to the test. With the Cold War in full swing, he collected
forecasts from 284 highly educated experts who averaged more than 12 years of
experience in their specialties. To ensure that the predictions were concrete,
experts had to give specific probabilities of future events. Tetlock had to
collect enough predictions that he could separate lucky and unlucky streaks
from true skill. The project lasted 20 years, and comprised 82,361 probability
estimates about the future.
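Forecasts stated as probabilities can be scored objectively, which is what made the project possible. One standard measure, the Brier score, averages the squared gap between each stated probability and what actually happened; a minimal sketch with invented forecasts:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and outcomes.

    forecasts: probabilities assigned to the event occurring (0.0-1.0)
    outcomes:  1 if the event occurred, 0 if it did not
    Lower is better: 0.0 is perfect; always saying 50% scores 0.25.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A hypothetical forecaster who is confident and often wrong...
print(brier_score([0.95, 0.90, 0.05], [0, 0, 1]))  # ~0.87
# ...scores far worse than one who hedges honestly.
print(brier_score([0.60, 0.40, 0.50], [1, 0, 1]))  # ~0.19
```

Averaged over thousands of questions, a score like this separates lucky streaks from genuine skill.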
The result: The experts were, by and large, horrific
forecasters. Their areas of specialty, years of experience, and (for some)
access to classified information made no difference. They were bad at
short-term forecasting and bad at long-term forecasting. They were bad at
forecasting in every domain. When experts declared that future events were
impossible or nearly impossible, 15 percent of them occurred nonetheless. When
they declared events to be a sure thing, more than one-quarter of them failed
to transpire. As the Danish proverb warns, “It is difficult to make
predictions, especially about the future.”
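Those "impossible" and "sure thing" failures are failures of calibration: group forecasts by the probability stated, and events should occur about that often in each group. A minimal sketch of the check, with made-up records:

```python
from collections import defaultdict

# (stated probability, did the event occur?) -- made-up forecasts
records = [
    (0.0, True), (0.0, False), (0.0, False),  # "impossible" calls
    (1.0, True), (1.0, False), (1.0, True),   # "sure thing" calls
    (0.5, True), (0.5, False),
]

buckets = defaultdict(list)
for stated, occurred in records:
    buckets[stated].append(occurred)

for stated in sorted(buckets):
    outcomes = buckets[stated]
    rate = sum(outcomes) / len(outcomes)
    print(f"stated {stated:.0%} -> occurred {rate:.0%} ({len(outcomes)} forecasts)")
```

A well-calibrated forecaster's "impossible" bucket should be nearly empty of occurrences; Tetlock's experts came in at 15 percent.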
Even faced with their results, many experts never
admitted systematic flaws in their judgment. When they missed wildly, it was a
near miss; if just one little thing had gone differently, they would have
nailed it. “There is often a curiously inverse relationship,” Tetlock
concluded, “between how well forecasters thought they were doing and how well
they did.”
Early predictions in Tetlock’s research pertained to the
future of the Soviet Union. Some experts (usually liberals) saw Mikhail
Gorbachev as an earnest reformer who would be able to change the Soviet Union
and keep it intact for a while, and other experts (usually conservatives) felt
that the Soviet Union was immune to reform and losing legitimacy. Both sides
were partly right and partly wrong. Gorbachev did bring real reform, opening
the Soviet Union to the world and empowering citizens. But those reforms
unleashed pent-up forces in the republics outside Russia, where the system had
lost legitimacy. The forces blew the Soviet Union apart. Both camps of experts
were blindsided by the swift demise of the U.S.S.R.
One subgroup of scholars, however, did manage to see more
of what was coming. Unlike Ehrlich and Simon, they were not vested in a single
discipline. They took from each argument and integrated apparently
contradictory worldviews. They agreed that Gorbachev was a real reformer and
that the Soviet Union had lost legitimacy outside Russia. A few of those
integrators saw that the end of the Soviet Union was close at hand and that
real reforms would be the catalyst.
The integrators outperformed their colleagues in pretty
much every way, but especially trounced them on long-term predictions.
Eventually, Tetlock bestowed nicknames (borrowed from the philosopher Isaiah
Berlin) on the experts he’d observed: The highly specialized hedgehogs knew
“one big thing,” while the integrator foxes knew “many little things.”
Hedgehogs are deeply and tightly focused. Some have spent
their career studying one problem. Like Ehrlich and Simon, they fashion tidy
theories of how the world works based on observations through the single lens
of their specialty. Foxes, meanwhile, “draw from an eclectic array of
traditions, and accept ambiguity and contradiction,” Tetlock wrote. Where
hedgehogs represent narrowness, foxes embody breadth.
Incredibly, the hedgehogs performed especially poorly on
long-term predictions within their specialty. They got worse as they
accumulated experience and credentials in their field. The more information
they had to work with, the more easily they could fit any story into their
worldview.
Unfortunately, the world’s most prominent specialists are
rarely held accountable for their predictions, so we continue to rely on them
even when their track records make clear that we should not. One study compiled
a decade of annual dollar-to-euro exchange-rate predictions made by 22
international banks: Barclays, Citigroup, JPMorgan Chase, and others. Each
year, every bank predicted the end-of-year exchange rate. The banks missed
every single change of direction in the exchange rate. In six of the 10 years,
the true exchange rate fell outside the entire range of all 22 bank forecasts.
In 2005, Tetlock published his results, and they caught
the attention of the Intelligence Advanced Research Projects Activity, or
IARPA, a government organization that supports research on the U.S.
intelligence community’s most difficult challenges. In 2011, IARPA launched a
four-year prediction tournament in which five researcher-led teams competed.
Each team could recruit, train, and experiment however it saw fit. Predictions
were due at 9 a.m. every day. The questions were hard: Will a European Union
member withdraw by a target date? Will the Nikkei close above 9,500?
Tetlock, along with his wife and collaborator, the
psychologist Barbara Mellers, ran a team named the Good Judgment Project.
Rather than recruit decorated experts, they issued an open call for volunteers.
After a simple screening, they invited 3,200 people to start forecasting. Among
those, they identified a small group of the foxiest forecasters—bright people
with extremely wide-ranging interests and unusually expansive reading habits,
but no particular relevant background—and weighted team forecasts toward their
predictions. They destroyed the competition.
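The aggregation step can be pictured simply: pool everyone's probabilities, but give extra weight to forecasters with the best track records. A minimal sketch; the Good Judgment Project's actual algorithm was more elaborate, and the weights here are invented:

```python
def weighted_pool(forecasts, weights):
    """Weighted average of probability forecasts for one event.

    forecasts: each forecaster's probability for the event
    weights:   higher for forecasters with better track records
    """
    total = sum(weights)
    return sum(p * w for p, w in zip(forecasts, weights)) / total

# Three ordinary forecasters and one proven top forecaster
# (the 3x weight is an invented illustration)
probs   = [0.40, 0.55, 0.50, 0.80]
weights = [1.0,  1.0,  1.0,  3.0]
print(f"team forecast: {weighted_pool(probs, weights):.2f}")  # 0.64, pulled toward 0.80
```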
Tetlock and Mellers found that not only were the best
forecasters foxy as individuals, but they tended to have qualities that made
them particularly effective collaborators. They were “curious about, well,
really everything,” as one of the top forecasters told me. They crossed
disciplines, and viewed their teammates as sources for learning, rather than
peers to be convinced. When those foxes were later grouped into much smaller
teams—12 members each—they became even more accurate. They outperformed—by a
lot—a group of experienced intelligence analysts with access to classified
data.
One forecast discussion involved a team trying to predict
the highest single-day close for the exchange rate between the Ukrainian
hryvnia and the U.S. dollar during an extremely volatile stretch in 2014. Would
the rate be less than 10 hryvnia to a dollar, between 10 and 13, or more than
13? The discussion started with a team member offering percentages for each
possibility, and sharing an Economist article. Another team member chimed in
with historical data he’d found online, a Bloomberg link, and a bet that the
rate would land between 10 and 13. A third teammate was convinced by the
second’s argument. A fourth shared information about the dire state of
Ukrainian finances, which he feared would devalue the hryvnia. A fifth noted
that the United Nations Security Council was considering sending peacekeepers
to the region, which he believed would buoy the currency.
Two days later, a team member with experience in finance
saw that the hryvnia was strengthening amid events he’d thought would surely
weaken it. He informed his teammates that this was exactly the opposite of what
he’d expected, and that they should take it as a sign of something wrong in his
understanding. (Tetlock told me that, when making an argument, foxes often use
the word however, while hedgehogs favor moreover.) The team members finally
homed in on “between 10 and 13” as the heavy favorite, and they were correct.
In Tetlock’s 20-year study, both the broad foxes and the
narrow hedgehogs were quick to let a successful prediction reinforce their
beliefs. But when an outcome took them by surprise, foxes were much more likely
to adjust their ideas. Hedgehogs barely budged. Some made authoritative
predictions that turned out to be wildly wrong—then updated their theories in
the wrong direction. They became even more convinced of the original beliefs
that had led them astray. The best forecasters, by contrast, view their own
ideas as hypotheses in need of testing. If they make a bet and lose, they
embrace the logic of a loss just as they would the reinforcement of a win. This
is called, in a word, learning.
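Treating your own view as a hypothesis has a textbook formalization: Bayes' rule, which dictates how much a surprising observation should shift your confidence. A toy illustration of the hryvnia forecaster's update, with invented numbers:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after seeing evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# Hypothesis: "the hryvnia will weaken." The forecaster starts at 70%,
# then watches the currency strengthen -- unlikely if his model is right.
prior = 0.70
posterior = bayes_update(prior, p_evidence_if_true=0.2, p_evidence_if_false=0.6)
print(f"{prior:.0%} -> {posterior:.0%}")  # confidence drops to ~44%
```

The fox's instinct to say "however" and revise is, in effect, this calculation done informally; the hedgehog's instinct to say "moreover" skips it.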