How the NSA Can Use Metadata to Predict Your Personality
By Patrick Tucker March 28, 2014
The president and congressional leaders want to end NSA
bulk metadata collection, but not the use of metadata, which may even be
expanded. From a technical perspective, the question of what your metadata can
reveal about you, or about potential enemies, remains as important as it has
been since the Edward Snowden scandal broke. The answer is more than you might
think.
First, the background. On Thursday, the Obama
administration released a brief statement on ending the collection of metadata
and limiting, slightly, the circumstances under which metadata could be accessed.
The timing was in keeping with a self-imposed deadline to create legislation to
address NSA bulk collection. The statement said “the government will not
collect these telephone records in bulk; rather, the records would remain at
the telephone companies for the length of time they currently do today.”
Two leaders of the House Intelligence Committee, Reps.
Michael Rogers, R-Mich., and Dutch Ruppersberger, D-Md., are also putting
forward a proposal, called the “End Bulk Collection Act,” which would likewise
shift the collection of bulk metadata from the NSA to phone companies.
The companies would be required to keep the data no
longer than 18 months, as opposed to the five years for which the NSA currently
holds it. But the House bill would also broaden the circumstances under which
the government could access metadata, from probable cause to the far more
nebulous “reasonable articulable suspicion.”
In a USA Today op-ed from last July, Ruppersberger argued
that the practice of collecting metadata was benign.
“The phone-records tool is not some wildly intrusive
surveillance program. In reality, what we are talking about is collection of
‘metadata,’ not content. No names, no addresses and absolutely no
conversations,” he wrote.
But is it?
Recent research shows that the sort of metadata the NSA
uses in its investigations is actually highly personal.
A group of researchers from the MIT Media Lab found that
your metadata — including, but not limited to, the way in which you use your
phone, how you make calls, to whom, for how long, etc. — can serve as an
indicator of your personality.
Here’s how they figured it out. The researchers,
Yves-Alexandre de Montjoye, Jordi Quoidbach, Florent Robic and Sandy Pentland,
had 100 students fill out surveys to determine their personality along five
distinct personality types:
Neurotic: Defined roughly as a higher than normal
tendency to experience unpleasant emotions
Open: Defined as broadly curious and creative
Extroverted: As in, looks toward others for stimulation
Agreeable: As in warm, compassionate, and cooperative
Conscientious: Self-disciplined, organized and eager
for success
These types are in keeping with the so-called Five Factor
Model of Personality, a widely used method for describing personality traits.
Once the researchers had the survey data to show how each of the subjects fell
along the spectrum, they examined the subjects’ phone records between March
2010 and June 2011, well within the new 18-month window. Specifically they
looked at these metadata elements:
Basic phone use including the number of calls
Active user behaviors, as in the number of calls
initiated, and the time it took the subject to answer a text
Location, or how far the subject moved, the number of
places from which calls have been made, and other indicators of so-called
radius of gyration
Regularity of calling routine
Diversity, defined as the ratio between the subject’s
total number of contacts and the relative frequency at which he or she
interacts with them
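To make these elements concrete, here is a minimal sketch of how a few of them could be computed from a list of call records. It is an illustration only, not the researchers' code: the CallRecord fields, the flat x/y tower coordinates in kilometers and the sample calls are all hypothetical, and normalized entropy is swapped in as a common stand-in for "diversity," which may differ from the exact definition used in the study.

```python
from collections import Counter
from dataclasses import dataclass
from math import log, sqrt


@dataclass
class CallRecord:
    contact: str    # hypothetical identifier for the other party
    outgoing: bool  # True if the subject initiated the call
    x_km: float     # hypothetical flat-map position of the cell tower
    y_km: float


def radius_of_gyration(records):
    """Root-mean-square distance of call locations from their centroid."""
    n = len(records)
    cx = sum(r.x_km for r in records) / n
    cy = sum(r.y_km for r in records) / n
    return sqrt(sum((r.x_km - cx) ** 2 + (r.y_km - cy) ** 2 for r in records) / n)


def contact_diversity(records):
    """Normalized entropy of contacts: 1.0 is evenly spread, 0.0 is one contact."""
    counts = Counter(r.contact for r in records)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))


def extract_features(records):
    """Collect a few of the metadata elements listed above into one dict."""
    return {
        "num_calls": len(records),
        "num_initiated": sum(r.outgoing for r in records),
        "num_places": len({(r.x_km, r.y_km) for r in records}),
        "radius_of_gyration_km": radius_of_gyration(records),
        "contact_diversity": contact_diversity(records),
    }


# Made-up sample: four calls split between two contacts and two towers.
sample_calls = [
    CallRecord("a", True, 0.0, 0.0),
    CallRecord("b", False, 3.0, 4.0),
    CallRecord("a", True, 0.0, 0.0),
    CallRecord("b", True, 3.0, 4.0),
]
sample_features = extract_features(sample_calls)
```

Note that nothing in the sketch touches the content of a call. Every value comes from who was called, when a call was initiated and which tower handled it.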
Once the researchers had values for these behaviors they
ran the results through a machine-learning algorithm to determine how each one
relates to personality type. De Montjoye is careful to point out that there
isn’t a one-to-one matchup between a specific observed behavior and a specific
personality. So if your radius of gyration, for instance, is particularly
large, that doesn’t serve as a clear indicator of neuroticism. Rather it’s the
combination of behaviors and the strength of the data available that allows the
model to come up with predictions.
“We let the algorithm determine the right mix,” he said.
“Each indicator is useful but is conditional on all the other indicators. That
doesn’t mean each one is causal or that
people who travel more are neurotic. Let’s say that the relationships between A
and B are not linear, if you do a linear progression you see no relationship;
you do a quadratic progression, you do see how A can predict B.”
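De Montjoye's point about nonlinear relationships can be shown with a toy example. The sketch below uses made-up numbers in which B is exactly the square of A, then measures how well a straight-line fit explains B when it is given A, and when it is given A squared. The data and the r_squared helper are assumptions for illustration and have nothing to do with the study's actual model.

```python
def r_squared(xs, ys):
    """Share of the variance in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy ** 2) / (sxx * syy)


# Made-up data: B depends on A, but only through A squared.
a = [-3, -2, -1, 0, 1, 2, 3]
b = [x ** 2 for x in a]

linear_fit = r_squared(a, b)                       # B regressed on A
quadratic_fit = r_squared([x ** 2 for x in a], b)  # B regressed on A squared
```

In this toy case the straight-line fit on A explains none of the variation in B, while the fit on A squared explains all of it, even though A fully determines B in both cases.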
The model, in other words, can’t tell you which behavior
to change to make your personality less predictable.
Here’s what it can do: predict personality type much
better than random guessing. When the researchers compared the model’s guesses
with each subject’s actual personality type (as revealed by the survey), they
found that the model beat random guessing on all of the personality types, by
about 42 percent on average and by as much as 63 percent.
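One plausible way to read a "better than random" figure is as a relative improvement over a chance baseline. The short sketch below works through that arithmetic. The three-class baseline of one in three is an assumption made purely for illustration, since the article does not say how many categories each trait was split into, and both function names are hypothetical.

```python
def relative_improvement(accuracy, baseline):
    """How much better than the baseline an accuracy is, as a fraction."""
    return (accuracy - baseline) / baseline


def accuracy_for_improvement(improvement, baseline):
    """Accuracy implied by a given relative improvement over the baseline."""
    return baseline * (1 + improvement)


# ASSUMPTION: three equally likely classes per trait, so chance is 1 in 3.
chance = 1 / 3

average_accuracy = accuracy_for_improvement(0.42, chance)  # roughly 0.47
best_accuracy = accuracy_for_improvement(0.63, chance)     # roughly 0.54
```

Under that assumed baseline, a model that is 42 percent better than random would be right a little under half the time, which is well short of certainty but far from a coin toss among three options.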
The paper was published in the Proceedings of the 6th
International Conference on Social Computing, Behavioral-Cultural Modeling and
Prediction.
“We see a lot of comments along the lines of ‘It’s only
metadata. It’s not personal. And it only gets personal when a human looks into
it,’” said de Montjoye. “We wanted to show an example at a small scale of what
you might be able to do” with that data on how long calls last, when they are
made, and where.
“At the end of the day, the vast majority of the use of
this data is extremely positive,” said de Montjoye, citing the utility of
metadata in city planning, emergency response and other areas. He said he
wanted to help researchers and the public develop a better “understanding of
what can be done as well as the limits of privacy. This is really why we do
this.”
From a national security perspective, the use of metadata
remains a powerful tool for finding links between people, including potential
enemies. However, despite the reassurances of Ruppersberger, President Barack
Obama and others that the data isn’t “personal,” it lends itself easily to
creating windows into private lives.