NOTE: This is very much a work in progress, so any advice, feedback or tips would be much appreciated! Also, some of the data used comes from news reports and blogs and is therefore speculative; it has been included for demonstration purposes.
As part of my MA Online Journalism I have been playing around with some data from the Glastonbury festival archives.
I wanted to show the statistical history of the festival through a visual medium.
I started a spreadsheet in Google Docs and used the ManyEyes site to create my charts.
Michael Eavis officially took over the regular running of the festival in 1981 and this is where I began my research. Using official data from the Glastonbury website, I built a spreadsheet of the standard weekend camping ticket prices and official capacity (later finding this all laid out in table form in a licence application PDF!).
I started by comparing ticket prices, over the years, with capacity.
Interestingly this shows a DROP in capacity between 2005 and 2007 (there was no Glastonbury in 2006).
However, this only shows the official capacity. Glastonbury festival has had a long-running battle with gatecrashers (or fence-hoppers) and I felt it would be interesting to compare the actual capacity with the official one.
Unfortunately, figures for actual capacity are hard to come by. I gathered some information from news reports and blogs, although I accept these figures are largely speculative and may be inaccurate.
(On a personal note, I was also concerned that, despite recent successful measures to prevent gatecrashers, according to some reports thousands of people are still getting into the site without paying. I am aware that there is constant scrutiny of the management of the festival and I did feel uncomfortable publishing speculative figures that could be taken out of context by critics.)
I put the data into a scatter diagram, as above, but this did not clearly show the distinction between the two sets of data. I converted it into a simple bar chart which, in this case, is a lot more effective.
Although I still have some data to gather, it is interesting to see the sizeable spike in 1995, 1999 and 2000, which led to the festival being called off the following year for a “rethink”.
Next, I decided to compare the three sets of data (price, official capacity and actual capacity) to see if there was a link between the number of people “fencehopping” and the price of the ticket. Instead of placing all three data sets on one chart, I decided to create a fourth column showing the difference between official capacity and actual capacity.
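For anyone who would rather build that fourth column outside a spreadsheet, here is a rough Python sketch of the idea. The function name, the field names and the numbers are all placeholders made up for illustration; they are not real Glastonbury figures, and this is not how my actual spreadsheet is set up.

```python
def add_difference_column(rows):
    """Return copies of the rows with a 'difference' field added.

    'difference' is actual capacity minus official capacity, or None
    when there is no estimate of actual capacity for that year.
    """
    result = []
    for row in rows:
        actual = row.get("actual")
        official = row["official"]
        difference = None if actual is None else actual - official
        result.append({**row, "difference": difference})
    return result


# Placeholder rows only (hypothetical years and values, NOT festival data).
example_rows = [
    {"year": "year A", "price": 10, "official": 1000, "actual": 1500},
    {"year": "year B", "price": 12, "official": 1200, "actual": None},
]

with_difference = add_difference_column(example_rows)
for row in with_difference:
    print(row["year"], row["price"], row["difference"])
```

Keeping the missing years as None, rather than zero, means they cannot be mistaken for years with no gatecrashers at all.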
The problem with this chart is currently the lack of data. I have plotted the years for which I have no estimated capacity, which makes the years for which I do have estimates look dramatically out of sync. I will retry this chart once I have more data.
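One possible stopgap, until more data turns up, would be to leave out the years without an estimate before charting. A minimal Python sketch of that filter is below; the function name, field names and values are hypothetical placeholders, not real festival figures, and whether ManyEyes handles gaps this way is something I have not checked.

```python
def rows_with_estimates(rows, key="actual"):
    """Keep only the rows that have a value for the given field."""
    return [row for row in rows if row.get(key) is not None]


# Placeholder rows only (hypothetical years and values, NOT festival data).
sample_rows = [
    {"year": "year A", "official": 1000, "actual": 1500},
    {"year": "year B", "official": 1200, "actual": None},
    {"year": "year C", "official": 1300},
]

plottable = rows_with_estimates(sample_rows)
print([row["year"] for row in plottable])
```

The chart would then show fewer years, but the ones it does show would no longer be distorted by blanks.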
This is a work in progress, so any feedback or advice is much appreciated!
- Try Tableau
- Create a Glastonbury chart with “events boxes” that explain the data, e.g. NEW FENCE, bad weather, Jay-Z headline controversy etc.
- Create a word tree
- Experiment with live data