By this point, Facebook has aggregated gobs of data from its 1.3 billion users. And it’s potent data, suggests a new study in Proceedings of the National Academy of Sciences: Computer models developed by the paper’s authors can use Facebook “likes” to predict a user’s personality, and in some cases can do so more accurately than that user’s friends or family members.
The short version of how it worked: The researchers had 86,220 volunteers take 100-item personality inventories, and then gave computer models access to 90 percent of those users’ inventories, as well as their Facebook likes. The models were then instructed to go wild on that data, figuring out which correlations could be drawn between certain types of likes and certain types of personality characteristics, and then using those correlations to predict the personalities of the remaining 10 percent of the sample. These predictions were compared with judgments from each user’s human acquaintances (sometimes a distant work colleague, sometimes a spouse), who filled out ten-item surveys describing that user’s personality.
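For readers who want to see the shape of that train-then-predict setup, here is a minimal Python sketch. It is not the authors' actual model: the function names, the toy data, and the "average the trait score per like" approach are all assumptions made for illustration. It only shows the general idea of learning from 90 percent of users and predicting for the held-out 10 percent.

```python
import random
from collections import defaultdict


def split_users(user_ids, holdout_fraction=0.1, seed=0):
    """Shuffle users and hold out a fraction for testing (hypothetical helper)."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    n_test = int(round(len(ids) * holdout_fraction))
    return ids[n_test:], ids[:n_test]  # (train, test)


def fit_like_averages(train_ids, likes, scores):
    """For each like, average the trait score of the training users who have it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for uid in train_ids:
        for like in likes[uid]:
            totals[like] += scores[uid]
            counts[like] += 1
    return {like: totals[like] / counts[like] for like in totals}


def predict(user_likes, like_avg, fallback):
    """Predict a trait score as the mean of the per-like averages.

    Likes never seen in training are ignored; with no known likes,
    return the fallback value.
    """
    known = [like_avg[like] for like in user_likes if like in like_avg]
    return sum(known) / len(known) if known else fallback


# Toy data (made up): each user's likes and a self-reported trait score.
likes = {"a": {"swift"}, "b": {"swift", "chess"}, "c": {"chess"}}
scores = {"a": 4.0, "b": 2.0, "c": 1.0}

model = fit_like_averages(["a", "b", "c"], likes, scores)
guess = predict({"swift", "chess"}, model, fallback=2.5)
```

In a real run, `split_users` would produce the training and held-out groups, and each held-out user's prediction would be compared with that user's own inventory score, the same yardstick applied to the human judges.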
This demonstration of Facebook’s power has naturally elicited some apprehension on the part of those of us who are human beings rather than computer programs — some outlets have covered this study by claiming it proves scary-sounding things about the social network. All sorts of headlines, for example, are stating that Facebook now “knows you better than” your friends or family.
To which the only response is: Eeeeeehhhhh … It’s actually more complicated than that. For one thing, there was a lot of variation in how many likes the computer models required to beat human accuracy. While access to fewer than ten likes was enough to generate a more accurate personality prediction than someone’s work colleague, for example, the models needed around 65 likes to beat a friend or roommate, 125 to beat a family member, and about 275 to beat a spouse. Those are a lot of likes!
There’s also the fact that the researchers’ software had a huge amount of data to work with during the first phase, when it was building the model it used to make predictions: For each of more than 70,000 users, it had access both to those very detailed personality inventories (100 questions) and to all of that user’s likes. If it’s not surprising that certain likes correlate with certain personality characteristics, it’s also not surprising that if you feed enough of them into a smart computer program, it can start to use them to predict stuff pretty accurately.
To take a hypothetical example, if 90 percent of Taylor Swift fans describe themselves as extroverted, then there’s a pretty good chance that someone who likes Swift is extroverted. And if you can also see all the other stuff that person likes, and you know how those interests correlate with extroversion, your odds of determining whether they’re an extrovert are pretty darn good, especially if you’re a computer that can sort through gigabytes of these connections with ease.
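The arithmetic behind that hypothetical can be sketched in a few lines of Python. This is a naive Bayes style calculation chosen for illustration, not the method used in the study, and every number in it is invented: it assumes extroverts and introverts are equally common, that each like counts as independent evidence, and that the function name and rates are hypothetical.

```python
import math


def posterior_extrovert(prior, like_rates, user_likes):
    """Combine likes into a probability that the user is an extrovert.

    like_rates[like] = (P(like | extrovert), P(like | introvert)).
    Each like is treated as independent evidence (a simplifying assumption).
    """
    log_odds = math.log(prior / (1 - prior))
    for like in user_likes:
        if like in like_rates:
            p_ext, p_int = like_rates[like]
            log_odds += math.log(p_ext / p_int)
    return 1 / (1 + math.exp(-log_odds))


# Invented rates: with a 50/50 prior, these make 90 percent of Swift fans extroverts.
like_rates = {
    "Taylor Swift": (0.45, 0.05),
    "Karaoke nights": (0.30, 0.10),  # a second, equally made-up like
}

one_like = posterior_extrovert(0.5, like_rates, ["Taylor Swift"])
two_likes = posterior_extrovert(0.5, like_rates, ["Taylor Swift", "Karaoke nights"])
```

Under these made-up numbers, the Swift like alone gives a probability of about 0.90, and adding the second like pushes it to about 0.96, which is the "even better chance" the paragraph describes.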
What all this comes down to is that your preferences really do say a lot about you, and the more of them you share, the better a sense various online entities will have of who you are and how best to sell you stuff. Facebook doesn’t actually “know” you the way your friends and family know you, of course — it knows what you like, and it knows that because you’ve willingly clicked a button that says just that.