Graph Algorithms: Minimum Spanning Trees for Social Network Analysis

Social networks are software platforms where users share their ideas and engage with each other. These platforms are a great place for people to gain exposure for their work and build relationships with like-minded people. In addition to massive social networks such as Facebook, Twitter, Instagram, Snapchat, Youtube. There are many other sites with emerging social networks. Medium and Quora for writers, Github for programmers, Codepen.io for front-end programmers, ProductHunt for entrepreneurs, and many more in many different domains. These sites rely on user-uploaded content to develop a compelling site.

The focus of this paper is to use graph algorithms to strengthen these social networks, help user-created content reach the people who would be interested, and encourage users to interact and post more content on these sites.

First, let’s illustrate a graph:

Now, let’s redraw this graph replacing the Vertices with the names of users in our Social Network. Each user in the graph is selected due to similar interests that they have demonstrated in their bios or articles. For example, all the users in this graph: Connor, Raye, Dakota, Daniel, Erika, Dermot, Camille, and Henry, are on Medium writing about Deep Learning. Each Connecting Edge represents users in the social network who are interacting with each other, such as liking or commenting on the articles written by the other user.

We can see that Erika is very connected in the network, interacting with Raye, Daniel, Dermot, and Camille. Connor is the least connected on this graph, only interacting with Dakota and Dermot.

Now, we let’s add the weights on edges back:

These edges quantify how much Users are interacting with one another. Dermot and Henry exchange 5 likes or comments every 24 hours, Daniel and Henry exchange 1 like or comment every 24 hours, Connor and Dakota exchange 11 likes every 24 hours, ….

The edges in this graph represent the number of interactions between users every 24 hours. If there is not an edge between nodes, this means that these users are not interacting with each other.

This is one strategy to quantize edges. However, we can explore many different representations in future studies. In a social network, there is usually a measure of reputation, most commonly this is quantified as ‘followers’. Typically, if a user has 1,000 followers and another user has 5 followers, it is unlikely that these two users are going to be collaborating on these social network sites. So in the graph representation above, the weights on the edges represent the differential in follower count. Instead of the sparsely connected graph illustrated above, we would use a Complete Graph with every vertex connected to one another.

Complete Graphs. In these graphs, every vertex is connected to every other vertex. We can use the edges in these graphs to represent the follower count differential, (or any other distance function) between users. We can find Minimum Spanning Trees in the algorithm to figure out which relationships to strengthen first to increase connectivity in the network.

In the graph discussed in this article, we are trying to create a complete graph from a sparsely connected graph. We will connect edges by recommending users to one another. However, we will need to do some analysis on the graphs in order to find the optimal strategy for recommending users in order to connect every user, thus forming a complete graph.

Now that we have seen how social networks are represented as a graphs. We will look at how we can use graph algorithms to increase the connectivity of this graph and improve social networks.

Minimum/Maximum Spanning Trees

A Minimum Spanning Tree is a subset of edges in the graph such that every Vertex is connected and the sum of the edges is minimized. On the other hand, a Maximum Spanning Tree is where the sum of edges is maximized. The idea of this algorithm will be to find the Maximum and Minimum Spanning Trees and then optimize our user-recommendation system to get these two trees as close to each other as possible. This will create an overall more robust social network, with more users interacting with one another. It is important to think about what we are trying to optimize for with these algorithms. In this example, we are trying to create a complete graph and increase the average weight of each edge. However, in your applications you may want to focus on optimizing for the overall total interaction count.

The Double Edges represent the edges connecting a maximum spanning tree. Min/Max spanning trees can be computed using Prim’s or Kruskal’s algorithm.

Now, let’s show the Minimum Spanning Tree

Comparing these two trees will show us which edges we should begin to connect in order to reduce the difference between the Min and Max trees. This will help users who are not as connected in the network find other users. The effect of connecting more users with each other will ultimately benefit all users who are a part of the network.

Conclusion

Hopefully, this article provided some value to you. These are my thoughts on graph algorithms for social network analysis, and I am still working on further development. If you read this and have any ideas on how it can be improved or have seen similar work that would be relevant to this study, please share it with me! Thank you for reading, please follow for more articles on graph algorithms and social networks.