In the 2013 sci-fi romance Her, a writer falls in love with his computer’s operating system. Theodore Twombly, played by Joaquin Phoenix, begins to feel as if his Samantha—the female identity of his OS, voiced by Scarlett Johansson—knows him better than his closest friends do.
After Wu Youyou and Michal Kosinski saw Her, they started discussing whether a computer in the real world could judge someone’s personality better than other humans can.
Wu, a Ph.D. candidate in social psychology at the University of Cambridge, and Kosinski, a postdoctoral fellow in computer science at Stanford, are co-lead authors of a study, published online Monday in the journal PNAS (the Proceedings of the National Academy of Sciences), that asks how well someone’s digital footprint can predict his or her personality—and how that compares with the judgment of friends, family, roommates or a spouse.
In his previous research, Kosinski found that Facebook likes can be used to accurately predict a range of attributes, including sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age and gender.
He says that one common comment he and his colleagues received about that paper, published in PNAS in March 2013, was “OK, this is really cool that [Facebook likes] can predict all of those things, but how does it compare with how humans can predict?” In other words, he says, how impressive is it that the computer can make such accurate predictions?
Wu and Kosinski, along with David Stillwell, a researcher at Cambridge, found that the computer’s average accuracy predicting personality was higher than the average accuracy of all human judges, except for a spouse. But a spouse’s judgment, too, could be beaten with a larger quantity of data.
The researchers used a sample of more than 80,000 volunteers, who answered 100 questions on an app called myPersonality. The questionnaire, based on the commonly used OCEAN model, evaluates respondents for five major personality traits: openness, conscientiousness, extraversion, agreeableness and neuroticism. The participants also gave researchers access to their Facebook likes, which served as the digital footprint used to compare computer judgment with human judgment of personality in the study.
The researchers had the computer look for patterns linking the personality traits measured by the OCEAN survey with Facebook likes. They built linear regression models on a portion of the sample data they had collected, generating a formula for each trait based on likes.
For example, in evaluating openness, likes such as Buddhism, The Daily Show, Salvador Dali and William Shakespeare were linked with a liberal and artistic personality on the survey, whereas How to Lose a Guy in 10 Days, George W. Bush, Rush Limbaugh, and rap and hip-hop were linked with a conservative and conventional personality. On the extraversion scale, Snooki, partying, Gucci and beer pong indicated an outgoing and active personality, while The Matrix, programming, Doctor Who and thinking indicated a shy and reserved personality.
The researchers then fed in the remaining sample data so the computer could predict a user’s traits based on the patterns it had previously established. They repeated the process to generate predictions for each participant. The more Facebook likes a study participant had, the more accurate the computer’s prediction (at least up to a few hundred likes—Kosinski expects there would be diminishing returns at a certain point).
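The train-then-predict procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' code: the data is synthetic, the sizes (2,000 users, 50 pages) are made up, the function names fit_trait_model and predict_trait are hypothetical, and plain least squares stands in for whatever regression variant the study actually used. Accuracy is measured here as the correlation between predicted and survey scores on users the model never saw.

```python
import numpy as np

def fit_trait_model(likes, scores):
    """Fit ordinary least squares: trait score ~ intercept + weights . likes."""
    X = np.column_stack([np.ones(len(likes)), likes])
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coef

def predict_trait(coef, likes):
    """Apply the fitted formula to a user-by-page matrix of likes."""
    X = np.column_stack([np.ones(len(likes)), likes])
    return X @ coef

# Synthetic stand-in data (an assumption, not the study's data):
# each row is a user, each column a page, 1.0 means the user liked it.
rng = np.random.default_rng(0)
n_users, n_pages = 2000, 50
likes = (rng.random((n_users, n_pages)) < 0.2).astype(float)

# Pretend survey scores for one trait (say, openness): a hidden
# weight per page plus noise that likes cannot explain.
true_weights = rng.normal(0.0, 1.0, n_pages)
openness = likes @ true_weights + rng.normal(0.0, 1.0, n_users)

# Build the formula on one portion of the sample...
train, test = slice(0, 1500), slice(1500, None)
coef = fit_trait_model(likes[train], openness[train])

# ...then predict the trait for the remaining users and score the result.
predicted = predict_trait(coef, likes[test])
accuracy = np.corrcoef(predicted, openness[test])[0, 1]
print(f"held-out correlation: {accuracy:.2f}")
```

Scoring the model only on held-out users matters: a formula that has already seen a user's survey answers would look more accurate than it really is. Rotating which portion is held out yields a prediction for every participant.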
They then had participants’ friends and family members fill out a short survey about the participants’ personalities, and compared the results with the computer’s judgments.
With just 10 likes, the computer did a better job predicting someone’s personality than a co-worker did; with 70 likes, it beat friends’ and roommates’ judgments; with 150 likes, it surpassed the judgment of family members; and with 300 likes, it did even better than a spouse.
It’s important to keep in mind that the study focused on one type of personality assessment related to five specific traits. There are other observations and judgments a computer might not handle as well, says Wu, such as emotional intelligence.
“Humans do have an advantage given their ability to capture cues that…might not be as visible in a digital environment,” she says. “Maybe likes are not indicative of how socially skilled someone is,” whereas humans can determine whether someone is socially awkward or doesn’t have empathy from observations of things like facial expression and body language.
The researchers chose to use likes as their digital footprint “because this is a very generic type of digital signal,” says Kosinski, “curated by people and maintained in a public environment.” He and his colleagues predict that the “results should generalize to other environments like Spotify playlists, web browsing logs, Amazon Kindle logs.”
However, Kosinski says that in previous tests, he has found that other footprints, like Web browsing logs, are even more accurate than Facebook likes. People are less aware of their browsing history, he says, and don’t censor it in the same way they do with public Facebook likes. In the future, Facebook likes could be mixed in with other information, such as status updates, songs listened to, browsing logs and data that can be collected by a mobile phone (tone of voice, how often you talk, heartbeat) to predict personality traits more accurately, as well as to track (and even predict) current and changing states of mind.
This, Wu says, has lots of practical applications. “Recruiters could better match candidates with jobs based on their personality; products and services could adjust their behavior to best match their users’ characters and changing moods,” she says. “People may choose to augment their own intuitions and judgments with this kind of data analysis when making important life decisions, such as choosing activities, career paths or even romantic partners.”
But there are also dangers to having machines that can judge people’s personalities and emotional states, says Kosinski. “Like any other technology, this technology is morally neutral, but it can be used for a bad purpose,” he says. “For example, knowledge of psychological traits can help me exert influence over you.” The risk, he says, is that people will lose trust in cellphones and online environments, which is why he believes people should be given control over their own data and the authority to decide whether it will be shared with certain companies.
Nevertheless, says Kosinski, there is a type of “magic” in the paper: “a very vanilla, very standard, very generic statistical model can predict something that was so far considered to be kind of a deeply human skill.”