We prepared the following Jupyter Notebooks to demonstrate the concepts covered this week.
The links below point to static (HTML-based) versions of the notebooks - they look just like the real thing, but you cannot modify any code or interact with them in any way.
Notebook 2.1 - Introduction to NetworkX
In this notebook we introduce NetworkX - a Python package that includes a wide range of graph theory algorithms. We will learn how to create a graph, visualize it, and obtain shortest paths. We will also experiment with a few centrality measures.
Along the way, we will be learning how to load external files and store them in Pandas objects. You will need to download the datasets.
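As a rough preview of the kind of code Notebook 2.1 works with, the sketch below builds a small graph, finds a shortest path, and computes one centrality measure. The node names and edges are made up for illustration and are not taken from the course datasets.

```python
import networkx as nx

# Build a small undirected graph (hypothetical nodes, for illustration only).
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("B", "C"), ("C", "D"),
    ("A", "E"), ("E", "D"), ("D", "F"),
])

# Shortest path between two nodes, counted in number of edges.
path = nx.shortest_path(G, source="A", target="F")
print("Shortest path from A to F:", path)

# Degree centrality: each node's degree divided by (number of nodes - 1).
centrality = nx.degree_centrality(G)
most_central = max(centrality, key=centrality.get)
print("Most central node by degree:", most_central)

# To visualize the graph (requires matplotlib), one option is:
#   import matplotlib.pyplot as plt
#   nx.draw(G, with_labels=True)
#   plt.show()
```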
Notebook 2.2 - Weighted and Directed Graphs
In this notebook we will use NetworkX to study weighted and directed graphs. We will build on the concepts introduced in Notebook 2.1, and will also define and use custom functions, which allow us to reuse parts of our code without copying and pasting them every time.
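To give a flavour of these ideas, here is a minimal sketch of a weighted, directed graph together with a small custom function. The function name cheapest_route and the edge weights are hypothetical and chosen only for this example.

```python
import networkx as nx


def cheapest_route(graph, source, target):
    """Return (path, total_weight) for the lowest-weight route.

    Hypothetical helper: wraps two NetworkX calls so they can be reused.
    """
    path = nx.shortest_path(graph, source, target, weight="weight")
    cost = nx.shortest_path_length(graph, source, target, weight="weight")
    return path, cost


# A directed graph: each edge has a direction and a weight (made-up values).
D = nx.DiGraph()
D.add_weighted_edges_from([
    ("A", "B", 4), ("A", "C", 1), ("C", "B", 2),
    ("B", "D", 5), ("C", "D", 8),
])

route, total = cheapest_route(D, "A", "D")
print("Cheapest route:", route, "with total weight", total)

# Direction matters: there is no route back from D to A in this graph.
print("Route from D to A exists:", nx.has_path(D, "D", "A"))
```

Wrapping the two calls in a function means the same logic can be applied to any pair of nodes without repeating the code.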
Notebook 2.3 - Studying the London Underground
Here we apply the concepts that we covered in the previous notebook to the study of a much larger network - the London Night Tube.
We will be using some more advanced Pandas features to filter and clean up our dataset, as we seek to highlight certain outputs of our analysis. If you have not already done so for Notebook 2.1, you will need to download the datasets.
Notebook 2.4 - K-means Clustering
This notebook demonstrates the use of the K-means algorithms provided by the sklearn module. Sklearn, also known as scikit-learn, is a Python module for Machine Learning. If you want a taste of Machine Learning, this is a great notebook to try.
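As a small taste of what this looks like in practice, the sketch below clusters six made-up points into two groups with scikit-learn's KMeans class. The data and parameter values are assumptions for illustration and do not come from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six 2-D points forming two well-separated groups (made-up data).
points = np.array([
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [8.5, 9.0], [9.0, 8.5],
])

# n_clusters is the "K" in K-means; random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

print("Cluster label of each point:", labels)
print("Cluster centres:")
print(model.cluster_centers_)
```

The numeric label given to each cluster (0 or 1) is arbitrary; what matters is which points end up sharing a label.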
Notebook 2.5 - The Hungarian Algorithm for Assignment
In this final notebook we present a few different implementations of the Hungarian algorithm that you can use to solve assignment problems.
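For a sense of what an assignment problem looks like in code, here is a minimal sketch using SciPy's linear_sum_assignment function. This is an assumption on our part: SciPy is not in the package list for this week, it may not be one of the implementations shown in the notebook, and its solver is a modified Jonker-Volgenant algorithm rather than the classical Hungarian method, although it solves the same problem. The cost matrix is made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j] = cost of assigning worker i to task j (made-up values).
cost = np.array([
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
])

# Find the one-to-one assignment of workers to tasks with minimum total cost.
workers, tasks = linear_sum_assignment(cost)

for w, t in zip(workers, tasks):
    print(f"Worker {w} -> task {t} (cost {cost[w, t]})")

total_cost = cost[workers, tasks].sum()
print("Total cost:", total_cost)
```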
Running the notebooks
You can download a zip file that contains all notebooks (including the datasets used in the examples) from the link below.
Before you run the notebooks, you will need to ensure that you have a properly set up Anaconda environment (as discussed in last week’s seminar).
A range of Python packages are used by the notebooks. Most of them are installed by default with Anaconda, but you can also make sure that they are present by running the pip install command, which installs a package only if it is missing.
The packages that you need to make sure that you have installed are:
numpy
matplotlib
pandas
scikit-learn (imported as sklearn)
networkx
yellowbrick
seaborn
For a reminder on how to set up your Anaconda environment and install additional Python packages, you can refer to the documents provided in Seminar 1.
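If you would like to check from Python which of the packages listed above are available, a sketch along the following lines may help. It uses only the standard library; the helper name find_missing is hypothetical. Note that it checks import names, which can differ from install names (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names of the packages used by the notebooks.
REQUIRED = [
    "numpy", "matplotlib", "pandas", "sklearn",
    "networkx", "yellowbrick", "seaborn",
]


def find_missing(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Run this inside the same Anaconda environment that you use for the notebooks, since each environment has its own set of installed packages.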