An analysis of the network of characters in The Simpsons
The original dataset is from https://github.com/sghall/simpsons-episode-data with thanks to Delimited Technologies.
The dataset can be found in /support. It consists of three files:
- simpsonsNodes.csv
  - the name of each character in The Simpsons, with a description and an Id
- simpsonsEdges.csv
  - a source and target Id for each pair of characters who appear together, along with a weight equal to the number of times they appear together
- simpsons_ep-char.csv
  - the original dataset, which lists every episode and the characters who appear in it
Also in /support are two Python files:
- data.py
  - converts the simpsons_ep-char.csv file into simpsonsEdges.csv without the weights
- data2.py
  - adds the weights to rows in simpsonsEdges.csv
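
The idea behind the two scripts (pair up the characters in each episode, then count how often each pair occurs) can be sketched as follows. This is only an illustration and not the code in data.py or data2.py: it combines both steps into one function, and the function name `weighted_edges`, the dictionary input format, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def weighted_edges(episodes):
    """Count how often each pair of characters shares an episode.

    `episodes` is a hypothetical mapping from an episode id to the
    character Ids appearing in that episode.
    Returns a Counter keyed by (source, target) with source < target.
    """
    counts = Counter()
    for characters in episodes.values():
        # sorted(set(...)) removes duplicates and fixes the pair order,
        # so (1, 2) and (2, 1) count as the same undirected edge.
        for pair in combinations(sorted(set(characters)), 2):
            counts[pair] += 1
    return counts


# Toy input for illustration only, not taken from the real dataset.
example = {"ep1": [1, 2, 3], "ep2": [2, 1], "ep3": [3]}
edges = weighted_edges(example)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

Each printed row has the same shape as a row of simpsonsEdges.csv: a source Id, a target Id, and a weight.
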
The analysis of the network was done using Gephi, an open-source graph visualisation tool.
The report was produced for coursework and contains the full analysis, along with many screenshots of the network from Gephi.