I’ve already written about Christian Rudder’s Dataclysm: Who We Are When We Think No One’s Looking, which looks at samples of social networking data to study human perception and behavior. Rudder knows that mere representation of data may not be enough to teach about data use. After all, as data producers, we have a right to use our own data (and to see how our data is used). So Rudder and his team offer a neat activity. At dataclysm.org/relationshiptest, anyone can plug in their own Facebook data and have the site analyze their network structure. This is the relationship test. It measures just how central your partner is to your life using tie strength, an informal measure of the “closeness” of a friendship.
This test uses the recursive dispersion method for determining tie strength. Previously, many analyses of social network data used embeddedness as a factor for determining centrality. People structure their networks around certain foci. One focus can be a group of co-workers; another can be the people you went to college with. My Facebook friend structure looks like this (with me in the center):
Which of these blue dots (nodes) represents my partner of 10 years? Which my best friend of 8 years? Many of these friends share mutual friends with me who are in the same focus. We call them highly embedded in this focus. However, they may not have ties to other foci.
As you can see, I have two areas of focus that contain deeply embedded mutual friends. Everyone on the left of my network is pretty evenly spaced. Now, the embeddedness analysis is not very reliable for characterizing romantic relationships because it gives equal weight to all linked members of a focus. The reality of social networking is that highly embedded links surrounding foci are not a predictor of a strong tie.
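To make embeddedness concrete, here is a minimal Python sketch that treats it as a plain count of mutual friends. The toy network, the names in it, and the helper functions `build_graph` and `embeddedness` are all made up for illustration; this is not the code the relationship test runs.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected friendship graph from (person, person) pairs."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def embeddedness(graph: Graph, u: str, v: str) -> int:
    """Count the mutual friends that u and v share."""
    return len((graph[u] & graph[v]) - {u, v})


# Hypothetical toy network: three co-workers who all know each other,
# plus one college friend who knows nobody else I know.
graph = build_graph([
    ("me", "coworker_a"), ("me", "coworker_b"), ("me", "coworker_c"),
    ("coworker_a", "coworker_b"), ("coworker_a", "coworker_c"),
    ("coworker_b", "coworker_c"),
    ("me", "college_friend"),
])

print(embeddedness(graph, "me", "coworker_a"))      # 2
print(embeddedness(graph, "me", "college_friend"))  # 0
```

Every co-worker in the tight work cluster gets the same score of 2, which is the "equal weight" problem: the count alone cannot tell a close friend from someone who simply sits in the same focus.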
As you can see, my strongest ties are in that sparsely spread out field on the left of my node cloud. My partner and my best friend share a few links to focus neighbors, but they have a very low embeddedness factor, indicating that their social orbits are not bound within any one focus. Recursive dispersion looks not just at the number of mutual friends two people share (embeddedness), but also at the network structure of these mutual friends. A link between two people has a high dispersion when their mutual friends are not well-connected with each other. Let’s look at some numbers.
As you can see from the network node cloud, my partner is not highly embedded within any one focus. However, according to the numbers, we share the most mutual friends at 43. That means we have a high dispersion rate, because our mutual friends belong to more than one social focus. He has a high number of links to nodes on the left of the node cloud, with a few links to nodes in my graduate school focus, and no links to nodes in my work focus. Why is that? Well, he just hasn’t met many of them in a meaningful context yet. I have only just started interacting with them in a digital environment myself. This method gives more weight to the person who has more dispersed mutual friends, because it correctly assumes that romantic relationships follow a different social structure than other relationships.
Another interesting tidbit came from the numbers for my best friend. Her node also falls on the evenly spaced left side of the node cloud. According to the numbers, we share 16 mutual friends, but only have an assimilation score of 116701. Why is that? Well, what the numbers don’t tell you is that my best friend lives in Japan, so while we have 16 mutual friends (out of my 146 total friends), her total friends (at 212) include a lot of people that I will likely never meet and will likely never be able to communicate with due to the language barrier.
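The idea behind dispersion can be sketched in a few lines of Python. This is a deliberately simplified version that only counts pairs of mutual friends who are not directly linked to each other; it is not the recursive variant the test uses, and the toy network and the function names (`build_graph`, `mutual_friends`, `simple_dispersion`) are assumptions made up for illustration.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected friendship graph from (person, person) pairs."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def mutual_friends(graph: Graph, u: str, v: str) -> Set[str]:
    """Friends that u and v have in common."""
    return (graph[u] & graph[v]) - {u, v}


def simple_dispersion(graph: Graph, u: str, v: str) -> int:
    """Count pairs of mutual friends who are NOT linked to each other.

    Simplified stand-in for dispersion, not the recursive method.
    """
    shared = sorted(mutual_friends(graph, u, v))
    return sum(1 for s, t in combinations(shared, 2) if t not in graph[s])


# Hypothetical network: one tight work cluster, plus a partner who knows
# one person each from three otherwise unconnected parts of my life.
coworkers = ["cw_a", "cw_b", "cw_c", "cw_d"]
others = ["college_pal", "cousin", "hometown_pal"]
edges = [("me", p) for p in coworkers + others + ["partner"]]
edges += list(combinations(coworkers, 2))       # co-workers all know each other
edges += [("partner", p) for p in others]       # partner spans separate foci
graph = build_graph(edges)

for friend in ("cw_a", "partner"):
    print(friend,
          len(mutual_friends(graph, "me", friend)),
          simple_dispersion(graph, "me", friend))
# cw_a 3 0
# partner 3 3
```

In this toy network the co-worker and the partner share the same number of mutual friends with me, but only the partner's mutual friends are strangers to one another, so only the partner scores above zero on dispersion.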
Strangely enough, for me, the nodes on those clusters of highly embedded social foci on the right of my node cloud are people I engage with less frequently than most of the dispersed nodes on the left of the cloud. I don’t currently hold a professional position using my graduate degree, so my engagement with that focus is low. I also just moved to a new city relatively recently, so my work cluster consists of people I am still getting to know. My strongest ties are to less embedded people: my partner, my best friends, my family members, life-long friends.
This is all fascinating and revelatory. Data science can show us that assumptions we make on seeing data representations may not be accurate, and that findings can actually prove to be the opposite of what we think we see. The first instinct is to see highly embedded clusters of information and think of them as consisting of strong ties, but life doesn’t work that way. The strongest ties are to people we let into every part of our lives, not just the clusters around one aspect. All of those dots are people I let into my life, but those evenly spaced ones on the left are the ones I keep.
We don’t need to be afraid of data. We don’t need to live paranoid lives about what people are reading in our data, especially when data science can lead to greater understanding of how we use seemingly ubiquitous tools like Facebook to structure our relationships. That’s me in those numbers, a me I never imagined. Technology is meaningless without the people who give it its value. We can use it to connect with the world and to measure and quantify that connectedness. To me, that’s true progress.
You can take this test for yourself! It’s at dataclysm.org/relationshiptest. And you can read more about it by picking up a copy of Dataclysm: Who We Are When We Think No One’s Looking.