Standard data is never going to allow us to distinguish in any satisfactory way between homophily (in which people are more likely to have social relationships because they’re like each other) and influence (in which people are more likely to become like each other because they have relationships).
The big fad right now in academic concepts watered down for public consumption appears to be "Big Data," which, in quite a few of the popular accounts of its implications, will end the scientific method as we know it. Very crudely (but let's be honest, the books speak crudely), the claim is that vast quantities of data will obviate the need to formulate hypotheses before examining the evidence.
People said the same thing when statistics was first coming into its own in the early 20th century. It's wrong. And the reason it's wrong is that before-the-fact hypothesizing is the only way to make a causal claim that can then be compared to the evidence and defended or discarded.
Big Data allows for lots of new questions to be examined, but it doesn't change the way questions must be asked to advance understanding. That's what the quote above shows us. No matter how many regressions are thrown at a pool of data, the act of carrying out statistical tests will never answer the question of why things coincide, and thus cannot tell us the direction of some causal chain, the nature of that causality, or even whether that causality exists.
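To make that concrete, here is a toy simulation, a minimal sketch and not anything drawn from the books. It builds two made-up worlds: one where X drives Y, and one where Y drives X. The function names, the 0.6 and 0.8 coefficients, and the Gaussian noise are all assumptions chosen for illustration, so that both worlds produce the same joint distribution.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, computed by hand."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(x_causes_y, n=5000, seed=0):
    """Return (xs, ys) from a hypothetical world with a known causal direction.

    The 0.6 / 0.8 weights are illustrative: they give the effect unit
    variance, so cause and effect look alike when examined one at a time.
    """
    rng = random.Random(seed)
    cause = [rng.gauss(0, 1) for _ in range(n)]
    effect = [0.6 * c + 0.8 * rng.gauss(0, 1) for c in cause]
    return (cause, effect) if x_causes_y else (effect, cause)


# World A: X causes Y. World B: Y causes X.
xs_a, ys_a = simulate(x_causes_y=True, seed=1)
xs_b, ys_b = simulate(x_causes_y=False, seed=2)

r_forward = pearson(xs_a, ys_a)
r_reverse = pearson(xs_b, ys_b)

print(f"X causes Y: r = {r_forward:.3f}")
print(f"Y causes X: r = {r_reverse:.3f}")
```

Both worlds print a correlation close to 0.6. Handed only the two columns of numbers, an analyst has no way to say which world they came from. Telling them apart takes a hypothesis and something beyond the observational data, such as an experiment that intervenes on X and watches whether Y moves.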
The best book I know on how large quantities of data are changing how things are done is Sasha Issenberg's The Victory Lab, which traces how data-driven experimental research became the driving force behind innovation in political campaigns. The moments of discovery in that book are always the result of careful human intervention: a designed experiment to test a specific hypothesis, or a perceptive analyst who notices a previously unseen coincidence of two metrics. That book's account shows that sometimes, for practical purposes, it is enough to recognize that two things occur together. At the same time, it shows the sometimes serious dangers that result from not figuring out causal relationships, as when it relates stories of voters who were misidentified and sent targeted mail that only made them angry. Over and over in Issenberg's book, the subjects push for more and better data, but they are able to exploit the data effectively only when they carefully formulate hypotheses and then test them.
Big Data is a big deal, but it isn't going to change how science works.
Speaking of which, http://xkcd.com/54/












