The goal of this Data Expedition was to introduce students to exploring social network data using R. Students learned to load and plot a social network in R and then performed basic analyses on two networks: hockey fights in the National Hockey League during the 2018-2019 season and characters in Game of Thrones Season 3. Students used social network analysis to better understand who is connected to whom, how frequently they interact, and how they interact.
Graduate Students: Claire Le Barbenchon and Liann Tucker
Faculty: Craig Rawlings
Course: Soc 110D: Sociological Inquiry
First, we introduced students to RStudio and the basics of R programming. Next, we covered loading data into RStudio and transforming the Hockey Fights network from a “hairball” into a legible, labelled network of fights between players. Students then reorganized the plot to show fights between teams and to identify teams more prone to fighting than others.
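The step from a player-level network to a team-level one can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's actual script: the toy data, the column names (`name`, `team`, `from`, `to`) and the object names are all hypothetical, and the real files would be read with `read.csv()` instead.

```r
library(igraph)

# Hypothetical toy data: player and team names are made up for illustration.
players <- data.frame(
  name = c("P1", "P2", "P3", "P4", "P5"),
  team = c("A", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)
fights <- data.frame(
  from = c("P1", "P2", "P3", "P1"),
  to   = c("P3", "P4", "P5", "P4"),
  stringsAsFactors = FALSE
)

# Player-level network: one node per player, one edge per fight.
player_net <- graph_from_data_frame(fights, directed = FALSE, vertices = players)

# Look up each fighter's team to turn player-level edges into team-level edges.
team_of <- setNames(players$team, players$name)
team_edges <- data.frame(
  from   = unname(team_of[fights$from]),
  to     = unname(team_of[fights$to]),
  weight = 1,
  stringsAsFactors = FALSE
)

# Collapse repeated team pairs into one edge whose weight counts the fights.
# Note: simplify() also drops loops (fights between teammates) by default.
team_net <- simplify(
  graph_from_data_frame(team_edges, directed = FALSE),
  edge.attr.comb = list(weight = "sum")
)

# Fights per team is the weighted degree (strength) of each team node.
fights_per_team <- sort(strength(team_net), decreasing = TRUE)
print(fights_per_team)
```

Calling `plot(team_net, edge.width = E(team_net)$weight)` would then draw thicker lines between teams that fought more often, which is one way to make fight-prone teams stand out.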
Next, students dug into the Game of Thrones Season 3 network and created plots that show the links between houses in the show. They explored some of the descriptive and analytical tools commonly used in social network analysis, in particular different forms of centrality for understanding popularity and influence.
Social network analysis can help explain social dynamics at both the group and individual level. For example, certain network structures are better for spreading information, others for maintaining power hierarchies.
This project guides students through plotting a network in an elegant, intuitive way, which helps to answer descriptive questions like:
- What does this network look like?
- How is it organized?
- Who are the members and how do you think they are connected?
Next, we explored structural features of the network, guided by the following questions:
- Who are the most “popular” characters?
- Which characters and houses have the most influence?
To answer these questions, students were introduced to two centrality measures and how to calculate them using igraph: degree centrality and betweenness centrality.
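As a rough illustration of the two measures, the sketch below computes them on a small made-up network. Degree centrality counts a node's ties, while betweenness centrality counts how often a node lies on the shortest paths between other nodes. The edge list and node names here are hypothetical placeholders, not rows from the course data.

```r
library(igraph)

# Hypothetical toy edge list: two triangles (A-B-C and E-F-G) joined through D.
edges <- data.frame(
  from = c("A", "A", "B", "C", "D", "E", "E", "F"),
  to   = c("B", "C", "C", "D", "E", "F", "G", "G"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Degree centrality: number of ties each node has ("popularity").
deg <- degree(g)

# Betweenness centrality: number of shortest paths passing through each node
# (one common way to think about brokerage or "influence").
btw <- betweenness(g, directed = FALSE)

# Put both measures side by side, sorted by betweenness.
centrality <- data.frame(node = V(g)$name, degree = deg, betweenness = btw)
print(centrality[order(-centrality$betweenness), ])
```

In this toy network, C and E have the most ties, but D has the highest betweenness because every path between the two triangles runs through it. The two measures can therefore point to different "important" nodes. Node sizes can be scaled to either measure when plotting, for example with `plot(g, vertex.size = 10 + 5 * deg)`.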
Ultimately, students learned how to compare and contrast two different network structures in order to learn about social phenomena, and were given the basic tools to do so.
The plots below show the hockey fights network organized by players and teams. The Game of Thrones network is shown both mapped by character, and with nodes scaled to character degree centrality.
Dataset #1: Game of Thrones Season 3
Our first dataset was collected by Andrew Beveridge of Macalester College for an exploratory network project. It is based on Season 3 of the HBO series “Game of Thrones.” The data consist of an edge list (a file of interactions between characters) and a node list (a second file that lists each character mentioned), with a “House” variable added to each observation. This gives us two data files that students can learn to turn into a network. The file of characters contains 123 observations of 3 variables, and the file of ties between characters contains 500 observations of 3 variables. A tie is created when two characters are mentioned together, one speaks about the other, they appear in a scene together, they appear in the same stage direction, or they speak one after the other.
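A minimal sketch of how an edge list and a node list combine into one network in igraph is shown below. The small data frames stand in for the two files; their column names (`id`, `label`, `house`, `source`, `target`, `weight`) and values are assumptions made for illustration and may not match the actual files.

```r
library(igraph)

# Hypothetical stand-ins for the node and edge files (values are illustrative).
nodes <- data.frame(
  id    = c("n1", "n2", "n3", "n4"),
  label = c("Character 1", "Character 2", "Character 3", "Character 4"),
  house = c("House X", "House X", "House Y", "House Y"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  source = c("n1", "n1", "n3"),
  target = c("n2", "n3", "n4"),
  weight = c(5, 2, 7),
  stringsAsFactors = FALSE
)

# The first two edge columns name the endpoints; the first node column is the ID.
# Remaining columns become edge attributes (weight) and node attributes (label, house).
g <- graph_from_data_frame(d = edges, directed = FALSE, vertices = nodes)

# Colour nodes by house so that house groupings are visible in a plot.
house_levels <- unique(V(g)$house)
V(g)$color <- rainbow(length(house_levels))[match(V(g)$house, house_levels)]

# Flag which ties connect characters from the same house.
house_of <- setNames(V(g)$house, V(g)$name)
el <- ends(g, E(g))  # two-column matrix of endpoint names
same_house <- unname(house_of[el[, 1]] == house_of[el[, 2]])

cat("Ties within a house:", sum(same_house), "\n")
cat("Ties between houses:", sum(!same_house), "\n")
```

With the real files, `nodes` and `edges` would instead come from `read.csv()` calls on the node and edge CSVs, and `plot(g, vertex.label = V(g)$label)` would draw the labelled, house-coloured network.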
Dataset #2: Hockey Fights (2018-2019)
Our second dataset includes all NHL players for the 2018-2019 season. The data were collected and compiled by Liann Tucker (co-writer of this proposal) for her own network analysis interests: aggression and roles in networks. They were collected from the official National Hockey League website and hockeyfights.com. The hockey fights website lists all the fights in each season, but it was not set up as a usable dataset, so one was created. The data are separated into two files: players (nodes) and their attributes in the first file, and fights (edges) in the second. The planned exercises include teaching students how to put these two files together to make a network. Since the players are grouped into teams, we can also create a bipartite (two-level) network from this dataset. There are 786 observations in the nodes file and 226 observations in the edges file. The nodes file contains information on each player, such as team, position, and when the player was acquired by the team. The edges file contains each observed fight between two players and who won the fight (according to voting on the hockey fights website). The data are open access, since they were collected from two public websites and compiled by a co-writer of this proposal.
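One possible reading of the bipartite (two-level) idea is a network whose nodes are both players and teams, with a tie from each player to their team. The sketch below builds such a network from a made-up node file; the player names, team names, and column names are hypothetical, and the course exercises may have constructed the two levels differently.

```r
library(igraph)

# Hypothetical node file: each player and the team they belong to.
players <- data.frame(
  name = c("P1", "P2", "P3", "P4"),
  team = c("Team A", "Team A", "Team B", "Team C"),
  stringsAsFactors = FALSE
)

# Membership edges: one tie from each player to their team.
membership <- data.frame(
  from = players$name,
  to   = players$team,
  stringsAsFactors = FALSE
)
bip <- graph_from_data_frame(membership, directed = FALSE)

# igraph marks the two levels with a logical 'type' vertex attribute:
# FALSE for players, TRUE for teams.
V(bip)$type <- V(bip)$name %in% players$team

# Roster size per team is the degree of each team node.
roster_size <- degree(bip)[V(bip)$type]
print(roster_size)

# Project onto the player level: two players are tied if they share a team.
proj <- bipartite_projection(bip)
teammates <- proj$proj1
print(ecount(teammates))  # number of teammate pairs in the toy data
```

Here `proj$proj1` holds the nodes whose `type` is `FALSE` (players) and `proj$proj2` the nodes whose `type` is `TRUE` (teams). A layout such as `layout_as_bipartite(bip)` can be passed to `plot()` to draw the two levels in separate rows.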
Beveridge, A. & Shan, J. (2016). Network of Thrones: A Song of Math and Westeros (Season 3) [got-s3-edges.csv, got-s3-nodes.csv]. Retrieved from https://github.com/mathbeveridge/gameofthrones/tree/master/data
Tucker, L. (2019). Hockey Fights Network 2018-2019 [nhl_nodes.csv, nhl_edges.csv]. Retrieved from https://www.hockeyfights.com/fightlog/1/reg2019/8