In this notebook we begin our exploration of the network analytics landscape, using Python and the networkx package.
NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of networks. It is used by scientists and practitioners alike, across a range of applications. You can find more information about it at its official website (https://networkx.org).
We assume that you already have a working Python environment with Jupyter set up. Before we start, we need to make sure that the networkx package is installed locally.
From the terminal, you can do this with the pip install networkx command. However, we can also do it from within the Jupyter environment, using the command below.
!pip install networkx
Requirement already satisfied: networkx in /Users/pa01/opt/anaconda3/lib/python3.9/site-packages (2.7.1)
In the above cell, note the use of !pip instead of pip. The exclamation mark (!) tells Jupyter to pass the command that follows to the system shell, and it can be used with any shell command. For instance, if we want to list the files in the current directory, we can run !ls (on Mac/Linux) or !dir (on Windows).
!ls
data-london-underground data-sioux-falls n01_introduction_to_networkx.ipynb n02_working_with_datasets.ipynb n03_network_analytics.ipynb n04_weighted_and_directed_graphs.ipynb n05_studying_the_london_underground.ipynb n06_kmeans_clustering.ipynb n07_hungarian_algorithm.ipynb tsl-logo.png
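To confirm from within Python that the package is available, one option is to look up its installed version. The following is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this illustration and is not part of networkx.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if networkx is installed, or None if it is missing.
print(installed_version("networkx"))
```

If this prints None, the installation step above has not taken effect in the environment that the notebook is running in.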
We begin by importing our packages. It is usually good practice to import all the packages that we use in a notebook at the very beginning.
There are quite a few packages that we will be using, but to keep things simple, on this occasion we will only import networkx.
import networkx as nx
The above command imported networkx and created an alias (shortcut) for the package, named nx. This allows us to call everything that the package provides using nx instead of networkx as a qualifier.
We did not select nx arbitrarily: if you look online, you will notice that it is a common convention in Python code to use nx when importing networkx.
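As a quick check that the alias is only a second name for the same package, the short sketch below imports networkx both ways and compares the two names.

```python
import networkx
import networkx as nx

# Both names are bound to the very same module object.
print(nx is networkx)              # True
print(nx.Graph is networkx.Graph)  # True
```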
G = nx.Graph()
With the above command we create a new, empty undirected graph.
Let's explore how it works by adding a few nodes and edges.
G.add_node(1)
G.add_node(2)
G.add_node(3)
G.add_node(4)
G.add_edge(1, 2)
G.add_edge(2, 3)
G.add_edge(2, 4)
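Adding nodes and edges one at a time quickly becomes repetitive. As a sketch of an alternative, the same nodes and edges can be added in bulk with add_nodes_from() and add_edges_from(). The variable name H is used here only to avoid overwriting G, and the extra node 5 is added purely for illustration.

```python
import networkx as nx

H = nx.Graph()
H.add_nodes_from([1, 2, 3, 4])
H.add_edges_from([(1, 2), (2, 3), (2, 4)])

# add_edge() also creates any endpoint that does not exist yet,
# so node 5 appears without an explicit add_node() call.
H.add_edge(4, 5)

print(list(H.nodes()))  # [1, 2, 3, 4, 5]
print(list(H.edges()))  # [(1, 2), (2, 3), (2, 4), (4, 5)]
```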
Now that we have populated our graph, we can visualise it using the nx.draw() command.
Remember, since we used the as keyword when we imported networkx, this is exactly the same as calling networkx.draw().
nx.draw(G)
Which is exactly what we were expecting! No surprises here.
At any time, we can view the nodes and edges of a graph by calling the .nodes() and .edges() methods that are provided by the Graph object.
G.nodes()
NodeView((1, 2, 3, 4))
G.edges()
EdgeView([(1, 2), (2, 3), (2, 4)])
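Beyond listing nodes and edges, the Graph object can also report simple summaries. The sketch below rebuilds the same four-node graph under the illustrative name small, so that it can be run on its own, and queries its size, node degrees, and the neighbours of node 2.

```python
import networkx as nx

# Passing a list of edges to nx.Graph() creates the nodes automatically.
small = nx.Graph([(1, 2), (2, 3), (2, 4)])

print(small.number_of_nodes())    # 4
print(small.number_of_edges())    # 3
print(dict(small.degree()))       # {1: 1, 2: 3, 3: 1, 4: 1}
print(list(small.neighbors(2)))   # [1, 3, 4]
```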
It is always helpful to visualise the graph. To make it a bit easier to understand, we can also add labels by passing the with_labels=True parameter.
nx.draw(G, with_labels=True)
The nx.draw() function is itself a thin wrapper around another function provided by networkx, called networkx.drawing.nx_pylab.draw_networkx().
I guess we can all agree that nx.draw() is simpler to write!
The nx.draw() function has many more useful parameters that you can use to customise the appearance of your drawings. You can find a full list and more guidance on this page.
For instance, the default font color of the labels is black, which makes them rather difficult to read against the default node color. Let's try something different:
nx.draw(G, with_labels=True, font_color='white')
What if we wanted to use square nodes?
nx.draw(G, with_labels=True, font_color='white', node_shape='s')
I would encourage you to experiment a bit more with the parameters in your own time. nx.draw() builds on the matplotlib.pyplot library, which we will be using quite a lot in this course, so many of the features that you use here are more widely applicable.
It would be nice now to experiment with some larger graphs; however, it would be tedious to define them manually. Thankfully, networkx provides built-in graph generators that you can use to quickly create graphs to experiment with.
One of these generates the Petersen graph, a graph that is used quite widely in graph theory. You can find more about it here.
G = nx.petersen_graph()
In theory, our graph should look like this:
Let's see what networkx draws...
nx.draw(G, with_labels=True, font_color='white')
At first sight, the two do not appear to be quite the same!
Look closer: they have the same number of nodes and edges, and the same connectivity pattern. It is in fact the same graph as the one above; the drawing simply lacks information about the positions of the nodes.
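The claim about nodes, edges, and connectivity can also be checked in code rather than by eye. The sketch below, which uses the illustrative name P for a fresh copy of the graph, counts the nodes and edges and confirms that every node has exactly three neighbours, as expected for the Petersen graph.

```python
import networkx as nx

P = nx.petersen_graph()

print(P.number_of_nodes())  # 10
print(P.number_of_edges())  # 15

# Every node of the Petersen graph has exactly three neighbours.
print(all(degree == 3 for _, degree in P.degree()))  # True
```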
Whenever a graph lacks information regarding node positions, networkx will pick some on its own. The default layout starts from random positions, so these are calculated afresh every time that the nx.draw() command is called:
nx.draw(G, with_labels=True, font_color='white')
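If a stable picture is preferred, node positions can be supplied explicitly through the pos parameter of nx.draw(), as a dictionary that maps each node to its coordinates. The sketch below builds such a dictionary by hand, placing nodes 0 to 4 on an outer ring and nodes 5 to 9 on an inner ring. It assumes the node numbering produced by nx.petersen_graph(), and the helper ring_positions is a name invented for this example.

```python
import math


def ring_positions(nodes, radius):
    """Place the given nodes evenly on a circle, starting at the top."""
    nodes = list(nodes)
    step = 2 * math.pi / len(nodes)
    return {
        node: (radius * math.cos(math.pi / 2 + i * step),
               radius * math.sin(math.pi / 2 + i * step))
        for i, node in enumerate(nodes)
    }


# Outer pentagon (nodes 0-4) and inner star (nodes 5-9).
pos = {**ring_positions(range(5), 2.0), **ring_positions(range(5, 10), 1.0)}
```

Passing this dictionary as nx.draw(G, pos=pos, with_labels=True, font_color='white') should then give the same picture on every call. For the automatic layout, nx.spring_layout() also accepts a seed argument that fixes its random starting point.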
Now let us experiment with the built-in shortest path algorithms that are provided by networkx.
In the first instance we will try the nx.shortest_path() command, which, for an unweighted graph like this one, returns a path with the fewest edges:
path = nx.shortest_path(G, source=0, target=9)
path
[0, 4, 9]
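Two related functions are worth knowing at this point. As a small sketch on the same graph, nx.shortest_path_length() returns only the number of edges on the path, and nx.all_shortest_paths() lists every path of that minimum length.

```python
import networkx as nx

G = nx.petersen_graph()

# Number of edges on a shortest path between nodes 0 and 9.
print(nx.shortest_path_length(G, source=0, target=9))  # 2

# All shortest paths between the two nodes; here there is only one.
print(list(nx.all_shortest_paths(G, source=0, target=9)))  # [[0, 4, 9]]
```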
It would be nice if we could draw the path on the graph. To do that, we first have to convert the sequence of nodes into a sequence of edges.
edges_path = list(zip(path, path[1:]))
edges_path
[(0, 4), (4, 9)]
There are quite a few things happening on the first line of the above block, but we will not be going into detail right now. You will get the hang of the list() and zip() functions quite soon!
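For the curious, here is a brief sketch of what that line does, written out step by step in plain Python. The intermediate name shifted is introduced only for this illustration.

```python
path = [0, 4, 9]

# The same list without its first element.
shifted = path[1:]
print(shifted)  # [4, 9]

# zip() pairs items position by position and stops at the shorter list,
# so each node is paired with the node that follows it on the path.
pairs = list(zip(path, shifted))
print(pairs)  # [(0, 4), (4, 9)]
```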
Now, we will create a list that assigns a color to each edge, depending on whether it is included in the shortest path that we determined.
We are going to do this by going through the list of edges and checking whether each edge belongs to our path. If it does, it is colored red; otherwise it is colored black. Because the graph is undirected, an edge may be reported as (u, v) or as (v, u), so we check both orientations.
edge_colors = ['red' if edge in edges_path or edge[::-1] in edges_path else 'black' for edge in G.edges()]
edge_colors
['black', 'red', 'black', 'black', 'black', 'black', 'black', 'black', 'black', 'red', 'black', 'black', 'black', 'black', 'black']
The sequence of colors follows the sequence of the edges in the graph, so we can pass it directly to nx.draw(), which will highlight the correct edges.
nx.draw(G, with_labels=True, edge_color=edge_colors, font_color='white')
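The same idea extends to the nodes: nx.draw() also accepts a node_color list, which follows the order of G.nodes(). The following is a minimal sketch, with the two color names chosen arbitrarily for this example.

```python
import networkx as nx

G = nx.petersen_graph()
path = nx.shortest_path(G, source=0, target=9)

# A set makes the membership test independent of the path length.
on_path = set(path)
node_colors = ['red' if node in on_path else 'steelblue' for node in G.nodes()]

print(node_colors.count('red'))  # 3
```

Adding node_color=node_colors to the nx.draw() call would then highlight the nodes on the path as well as the edges.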