A study published Monday says your Facebook "likes" can reveal
information about you, including how old you are, how you vote,
whether you have a high or low IQ, whether you are an introvert,
whether you are gay, and even whether you use drugs.
The study was a collaboration of the Psychometrics Lab at the
University of Cambridge and Microsoft Research Cambridge.
The research shows that intimate personal attributes can be
predicted with high levels of accuracy from 'traces' left by
seemingly innocuous digital behaviour, in this case Facebook Likes.
Study raises important questions about personalised marketing and
In the study, researchers describe Facebook Likes as a "generic
class" of digital record - similar to web search queries and
browsing histories - and suggest that such techniques could be used
to extract sensitive information for almost anyone regularly
Researchers at Cambridge's Psychometrics Centre, in collaboration
with Microsoft Research Cambridge, analysed a dataset of over
58,000 US Facebook users, who volunteered their Likes, demographic
profiles and psychometric testing results through the myPersonality
Users opted in to provide data and gave consent to have profile
information recorded for analysis. Facebook Likes were fed into
algorithms and corroborated with information from profiles and
Researchers created statistical models able to predict personal
details using Facebook Likes alone. Models proved 88% accurate in
determining male sexuality, 95% accurate in distinguishing
African-American from Caucasian American users and 85% accurate in
differentiating Republicans from Democrats. Christians and Muslims
were correctly classified in 82% of cases, and good prediction
accuracy was achieved for relationship status and substance abuse -
between 65 and 73%.
But few users clicked Likes explicitly revealing these attributes.
For example, fewer than 5% of gay users clicked obvious Likes such
as Gay Marriage. Accurate predictions relied on 'inference' -
aggregating huge amounts of less informative but more popular Likes
such as music and TV shows to produce incisive personal profiles.
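
The idea of aggregating many weakly informative Likes can be
illustrated with a minimal sketch. This is not the researchers'
model: it is a simplified naive-Bayes-style score that only counts
the Likes a user has, and the Like names, labels and function names
(train_like_weights, predict_probability) are hypothetical toy
values invented for illustration.

```python
import math
from collections import defaultdict


def train_like_weights(users, smoothing=1.0):
    """Estimate one log-odds weight per Like from labelled users.

    users: list of (set_of_likes, label) pairs, where label is a bool.
    Returns (weights, prior), both on the log-odds scale.
    """
    pos_total = sum(1 for _, label in users if label)
    neg_total = len(users) - pos_total
    pos_counts = defaultdict(int)
    neg_counts = defaultdict(int)
    for likes, label in users:
        for like in likes:
            if label:
                pos_counts[like] += 1
            else:
                neg_counts[like] += 1

    weights = {}
    for like in set(pos_counts) | set(neg_counts):
        # Smoothed share of each group that clicked this Like.
        p_pos = (pos_counts[like] + smoothing) / (pos_total + 2 * smoothing)
        p_neg = (neg_counts[like] + smoothing) / (neg_total + 2 * smoothing)
        weights[like] = math.log(p_pos / p_neg)

    prior = math.log((pos_total + smoothing) / (neg_total + smoothing))
    return weights, prior


def predict_probability(likes, weights, prior):
    """Sum the weights of a user's Likes and map the score to a probability."""
    score = prior + sum(weights.get(like, 0.0) for like in likes)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical toy data: no single Like states the attribute outright.
TOY_USERS = [
    ({"show_a", "band_x"}, True),
    ({"show_a", "band_y"}, True),
    ({"band_x", "show_b"}, True),
    ({"show_b", "band_z"}, False),
    ({"band_z", "show_c"}, False),
    ({"show_c", "band_y"}, False),
]

weights, prior = train_like_weights(TOY_USERS)
print(round(predict_probability({"show_a"}, weights, prior), 2))            # 0.75
print(round(predict_probability({"show_a", "band_x"}, weights, prior), 2))  # 0.9
```

In this toy setting one Like moves the estimate only modestly, while
two Likes pointing the same way move it further. That is the sense
in which many popular but individually uninformative Likes can add
up to a confident prediction.
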
Even seemingly opaque personal details, such as whether users'
parents separated before the user reached the age of 21, were
predicted with 60% accuracy, enough to make the information
"worthwhile for advertisers", the researchers suggest.
While they highlight the potential for personalised marketing to
improve online services using predictive models, the researchers
also warn of the threats posed to users' privacy. They argue that
many online consumers might feel such levels of digital exposure
exceed acceptable limits - as corporations, governments, and even
individuals could use predictive software to accurately infer
highly sensitive information from Facebook Likes and other digital
The researchers also tested for personality traits including
intelligence, emotional stability, openness and extraversion. While
such latent traits are far more difficult to gauge, the accuracy of
the analysis was striking. Study of the openness trait - the
spectrum of those who dislike change to those who welcome it -
revealed that observation of Likes alone is roughly as informative
as using an individual's actual personality test score.
Some Likes had a strong but seemingly incongruous or random link
with a personal attribute, such as 'Curly Fries' with high IQ, or
'That Spider is More Scared Than U Are' with non-smokers.
When taken as a whole, researchers believe that the varying
estimations of personal attributes and personality traits gleaned
from Facebook Like analysis alone can form surprisingly accurate
personal portraits of potentially millions of users worldwide.
They say the results suggest a possible revolution in psychological
assessment which - based on this research - could be carried out on
an unprecedented scale without costly assessment centres and
"We believe that our results, while based on Facebook Likes, apply
to a wider range of online behaviours," said Michal Kosinski,
Operations Director at the Psychometrics Centre, who conducted the
research with his Cambridge colleague David Stillwell and Thore
Graepel from Microsoft Research.
"Similar predictions could be made from all manner of digital data,
with this kind of secondary 'inference' made with remarkable
accuracy - statistically predicting sensitive information people
might not want revealed. Given the variety of digital traces people
leave behind, it's becoming increasingly difficult for individuals
"I am a great fan and active user of new amazing technologies,
including Facebook. I appreciate automated book recommendations, or
Facebook selecting the most relevant stories for my newsfeed," said
Kosinski. "However, I can imagine situations in which the same data
and technology is used to predict political views or sexual
orientation, posing threats to freedom or even life."
Thore Graepel from Microsoft Research said he hoped the research
would contribute to the ongoing discussions about user privacy:
"Consumers rightly expect strong privacy protection to be built
into the products and services they use and this research may well
serve as a reminder for consumers to take a careful approach to
sharing information online, utilising privacy controls and never
sharing content with unfamiliar parties."