“Dataclysm” by Christian Rudder

  1. You’re a professor or postdoc who wants to push forward, so you take what’s called a “convenience sample”—and that means the students at your university. But it’s a big problem, especially when you’re researching belief and behavior. It even has a name. It’s called WEIRD research: white, educated, industrialized, rich, and democratic. And most published social research papers are WEIRD.
  2. With data, history can become deeper. It can become more. Unlike clay tablets, unlike papyrus, unlike paper, newsprint, celluloid, or photo stock, disk space is cheap and nearly inexhaustible. On a hard drive, there’s room for more than just the heroes.
  3. The promise of Facebook’s timeline: for the passage of time, data creates a new kind of fullness, if not exactly a new science.
  4. Beauty is looks you can never forget. A face should jolt, not soothe. For as with music, as with movies, and as with a wide variety of human phenomena: a flaw is a powerful thing. Even at a person-to-person level, to be universally liked is to be relatively ignored. To be disliked by some is to be loved all the more by others.
  5. The “Pratfall Effect”: as long as you’re generally competent, making a small, occasional mistake makes people think you’re more competent. Flaws call out the good stuff all the more. This need for imperfection might just be how our brains are put together.
  6. The pleasant scent given off by many flowers, like orange blossoms and jasmine, contains a significant fraction (about 3 percent) of a protein called indole. It’s common in the large intestine, and on its own, it smells accordingly. But the flowers don’t smell as good without it. A little bit of shit brings the bees. Indole is also an ingredient in synthetic human perfumes.
  7. When you want to learn about how people write, their unpolished, unguarded words are the best place to start, and we have reams of them. There will be more words written on Twitter in the next two years than contained in all books ever printed.
  8. Everything points to the same conclusion: that Twitter hasn’t so much altered our writing as just gotten it to fit into a smaller place.
  9. Like knowing a man’s last words to his wife, knowing how people talk among friends gives you a much deeper sense of who they are.
  10. We are living through writing’s Cambrian explosion, not its mass extinction.
  11. Variety is the preservation of an art, not a threat to it.
  12. “The Strength of Weak Ties”: A concept postulated in the 1970s with samples in the dozens, but since amplified on new, robust network data: it tells us that it’s the people you don’t know very well in your life who help ideas, especially new ones, spread.
  13. Try out this test: dataclysm.org/relationshiptest
  14. As Steve Jobs said, “People don’t know what they want until you show it to them.” What he didn’t say is that showing them, especially in tech, means playing a game of Pin the Tail on the Donkey with several million people shouting advice.
  15. It’s a social scientist’s curse—what you most want to get at is exactly what your subjects are most eager to hide. This tendency is called social desirability bias.
  16. Mathematical models already exist to predict the outcome of armed conflict—how long it will last, who will win, and how many people will die—and the models of late have learned to accommodate guerilla warfare, since that’s the shape of today’s war. But armed insurgency is often preceded by unarmed unrest—which itself is often propagated, even coordinated, through social media.
  17. The counterintuitive relationship between the popularity of a word (its rank in a given vocabulary) and the number of times it appears is described by something called Zipf’s Law, an observed statistical property of language that, like so much of the best math, lies somewhere between miracle and coincidence. It states that in any large body of text, a word’s popularity (its place in the lexicon, with 1 being the highest ranking) multiplied by the number of times it shows up, is the same for every word in the text. [rank x number = constant]
  18. Gay people are a somewhat unusual minority, in that they can seem straight, at least superficially, if they decide they must. This surely involves a painful choice between self-preservation and self-expression that few other people ever have to weigh. But aside from the clear cost to the individual, “the closet” costs our society too, as secrecy allows old attitudes to go unchallenged—and prejudice unchallenged is prejudice perpetuated.
  19. I won’t rehash the ways sites like Facebook, Twitter, and Instagram give you the power to project yourself to the world. But I will point out that not long ago, only big companies, with big budgets, could get their message heard and beloved by strangers halfway around the globe. Now I can, and so can you, and so can everyone. The hardest part is getting anyone to listen.
  20. #ff and #teamfollowback. The first stands for “Follow Fridays,” which was an old-school tradition on Twitter—on Fridays you would tweet out people you like for your followers to follow. It’s now just general (any time) shorthand for “hey follow these accounts,” and is commonly blasted out by users just trying to drive numbers. #teamfollowback is the hashtag/handle for a Twitter account that basically does for free what politicians can afford to pay for. The idea is that you follow TeamFollowBack, and the account’s other followers will follow you. You then, in turn, follow them back, and everybody’s numbers have risen.
  21. Jenna Wortham from the Times describes this mentality well: “We, the users, the producers, the consumers—all our manic energy, yearning to be noticed, recognized for an important contribution to the conversation—are the problem. It is fueled by our own increasing need for attention, validation, through likes, favorites, responses, interactions. It is a feedback loop that can’t be closed, at least not for now.”
  22. In my mind—and this takes nothing away from Malcolm Gladwell (see “Blink”, “David and Goliath”, and “The Tipping Point”)—I see this book as the opposite of “Outliers”. Instead of strays from the far reaches of the data—the one-offs, the exceptions, the singletons, the Einsteins for whom you need the whole story to get it right, I’m pulling from the undifferentiated whole. We focus on the dense clusters, the centers of mass, the data duplicated over and over by the repetition and commonality of our human experience. It’s science as pointillism. Those dots may be one fractional part of you, but the whole is us.
  23. Exif is attached to all images taken with a digital camera, from high-end SLRs to your iPhone. The file encodes not only when the picture was taken but miscellany like the f-stop and shutter speed for the photo and, often, the latitude and longitude of where it was taken. Exif is how programs like iPhoto can effortlessly sort your pictures into “moments” and place little pins all over the map to show you where you’ve been.
  24. The era of data is here; we are now recorded. That, like all change, is frightening, but between the gunmetal gray of the government and the hot pink of product offers we just can’t refuse, there is an open and ungarish way. To use data to know yet not manipulate, to explore but not to pry, to protect but not to smother, to see yet never expose, and, above all, to repay that priceless gift we bequeath to the world when we share our lives so that other lives might be better—and to fulfill for everyone that oldest of human hopes, from Gilgamesh to Ramses to today: that our names be remembered, not only in stone but as part of memory itself.
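
A side note on item 17: the relation it quotes for Zipf’s Law (rank x number = constant) is easy to try out. Below is a minimal sketch, not anything taken from the book. The function name `rank_frequency_table` and the toy corpus are my own assumptions, and the corpus is deliberately built so the counts come out exactly Zipfian. Real text only follows the law approximately, so expect the last column to wobble if you feed it an actual book.

```python
from collections import Counter
import re


def rank_frequency_table(text):
    """Return (rank, word, count, rank * count) rows, most frequent word first."""
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically so ranks are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, word, count, rank * count)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


# Hypothetical toy corpus: the word at rank r is repeated 60 // r times,
# giving counts of 60, 30, 20, 15, 12. This is made-up data for illustration.
toy_words = ["the", "of", "and", "to", "in"]
toy_text = " ".join(
    " ".join([word] * (60 // rank))
    for rank, word in enumerate(toy_words, start=1)
)

for row in rank_frequency_table(toy_text):
    print(row)
```

On the toy corpus every row ends in the same product, which is the “constant” in the bracketed formula. To try it on real writing, replace `toy_text` with the contents of any long plain-text file and see how close the products stay to one another.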
