Facebook is one big science project at this point. Sure, the engineering and design challenges are myriad, but many of the problems (and opportunities) are really in the realm of social science. How does information pass through a network of people? What makes someone more likely to share something or to converse? Is it the information itself that’s most important, or is it a question of the relationship between the individuals?
Two of Facebook’s data scientists were in Cambridge today presenting on big data at EmTech, the conference by MIT Technology Review, and discussing the science behind the network. Eytan Bakshy and Andrew Fiore each have a PhD and have held research or lecture positions at top universities. Their job is to find value in Facebook’s massive collection of data.
And their presentation underscored, unsurprisingly, the academic roots of their work. Fiore, for instance, cited the seminal 1973 sociology paper on networks, “The Strength of Weak Ties,” to explain Facebook’s research showing that we’re more likely to share a link from any one of our close ties, but because weaker connections are far more numerous, weak ties matter more in aggregate. As Facebook attempts to extract value from its users, it’s standing on the shoulders of social science to do it. It may seem banal to point out, but its insights are dependent on a rich history of academic research.
That’s worth remembering, considering that the balance of power in the research arena is shifting rapidly. Data used to be a lot harder to come by, and social scientists often went to substantial lengths to generate it. Today, vast amounts of interesting data about human life exist within the confines of corporations. As The New York Times put it in May:
It is “big data,” the vast sets of information gathered by researchers at companies like Facebook, Google and Microsoft from patterns of cellphone calls, text messages and Internet clicks by millions of users around the world. Companies often refuse to make such information public, sometimes for competitive reasons and sometimes to protect customers’ privacy. But to many scientists, the practice is an invitation to bad science, secrecy and even potential fraud.
Select academics are frequently granted access to data at companies like Facebook to conduct and publish research (though typically not to publish the underlying datasets), and some researchers manage to glean public data by scraping the social network. But not all researchers are satisfied. After tweeting about the issue, I heard from Ben Zhao, an associate professor of computer science at UC Santa Barbara, who has done research on Facebook. “I think many of us in academia are disappointed with the lack of effort to engage from FB,” he told me over email.
The research mentioned above and presented at EmTech was published earlier this year, by Facebook, on Facebook. Which is great. But it points to the power that Facebook, Google, and others now have in the research environment. They have all the data, and they can afford to hire top-tier researchers to work in-house. And yet it’s important that the insights now being generated about how people live and communicate be shared with and verified by the academic community.
Facebook has already learned a lot from academia, and it’s in all of our interests to make sure academia can continue to learn from Facebook.
UPDATE 10/24/12 4:58pm: This post has been updated. The original version included a quote via tweet which was technically inaccurate concerning Facebook’s reaction to research on its platform.