5 ways to gather data


 Caroline Beavon is a freelance information and infographics designer – get in touch for more details



Everyone is talking about data journalism nowadays: creating maps, visualizations and infographics. However, before you can do any of that you need some DATA.

Here is how I sourced the data for my Datamud project, a look at the statistics behind the big UK music festivals.

1. SEARCH

Official Site

The last thing you want to do is call up a press officer asking for some stats when they are there, for all to see, on the website. Dig around in any areas labelled Information, Statistics, FOI and Press Area. Companies will often post statistics that are frequently requested, but they won’t necessarily make them easy to find.

The Glastonbury Festival Educational Resources area is rich with information. A series of PDFs contains details about every element of the event, from crowd management and security to stalls and sanitation. As the UK’s largest festival, it is often the subject of assignments and reports. This was useful as I looked for recycling information to back up the organisers’ claims that they are a green event.

Google

Google is a wonderful tool – it not only searches websites, but also blogs, news postings, pictures and videos. It’s well worth checking the NEWS section as someone else may have already done similar research and posted the stats online.

Unfortunately a search can return thousands of pages, so you need to be smart when submitting your search. Putting inverted commas around a phrase searches for those exact words as written, and combining a quoted phrase with simple keywords can narrow the results considerably.

e.g. “were arrested” 2010

Don’t forget to check the later pages of the search results too. Sometimes you will find some juicy stuff buried on sites that rank lower on Google.

Governing Bodies

Often Google won’t be able to pick up deep-linked pages, or documents embedded or linked in pages, so it’s always worth looking at the websites of official agencies and governing bodies too.
Councils and the Government are now much better at archiving their agendas and minutes, and whilst the search facilities are still pretty archaic and frustrating, it’s a start.

None of the various police forces’ websites had the crime stats that I needed, although they do often have documents that may be of use, e.g. Leicestershire Police.

Search / Scraping Sites

Although I did not use this during this assignment, in retrospect using a site like Scraperwiki to access data from an official site would have saved me a lot of time. I could have used it to draw together all the line-ups, for example, instead of going through a long-winded cut-and-paste process followed by plenty of cleaning up.
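To give a feel for what scraping a line-up involves, here is a minimal sketch in Python using only the standard library. It is not how Scraperwiki works internally, and the HTML snippet, the `act` class name and the `extract_lineup` function are all hypothetical: a real festival page will be structured differently, so the rules for picking out band names would need adjusting.

```python
from html.parser import HTMLParser

# Hypothetical markup: a real festival page will be structured differently.
SAMPLE_HTML = """
<ul id="lineup">
  <li class="act">Band One</li>
  <li class="act">Band Two</li>
  <li class="sponsor">Not a band</li>
  <li class="act">  Band Three </li>
</ul>
"""


class LineupParser(HTMLParser):
    """Collects the text of every <li class="act"> element."""

    def __init__(self):
        super().__init__()
        self.acts = []
        self._in_act = False

    def handle_starttag(self, tag, attrs):
        # Start collecting text only for list items marked as acts.
        if tag == "li" and dict(attrs).get("class") == "act":
            self._in_act = True
            self.acts.append("")

    def handle_data(self, data):
        if self._in_act:
            self.acts[-1] += data

    def handle_endtag(self, tag):
        # Finish the current act and tidy up stray whitespace.
        if tag == "li" and self._in_act:
            self._in_act = False
            self.acts[-1] = self.acts[-1].strip()


def extract_lineup(html):
    """Return the list of act names found in the given HTML text."""
    parser = LineupParser()
    parser.feed(html)
    return parser.acts


lineup = extract_lineup(SAMPLE_HTML)
print(lineup)
```

The cleaning up (here just stripping whitespace) happens in one place, which is the main saving over cutting and pasting by hand.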

Nowadays there are also sites that have done a lot of the work for you, by monitoring official sites and databases and turning the data into an easy to handle format.

First stop should be What Do They Know, a site geared up around FOI requests (more on this in a moment). You should also definitely visit TheyWorkForYou. I set up an alert there for the Glastonbury festival, which would tell me whenever it was mentioned. My hope was that crime levels or crowd management would be raised at some point, with a reference to the information given.

Interest Sites

I mentioned Google News search above, but it’s also worth looking for sites that deal with the specific subject area. They may have useful resources but may not appear on page 1 of a Google Search.

When I was compiling lists of the bands playing the various festivals, the official sites were often clunky, or the names were only shown on a JPG of the official event poster. However, festival news/interest sites, such as EFestivals, present the information in a more useful way.

2. ASK PRESS OFFICE

For archive or very up to date statistics, often a call to the press office is necessary.

I wanted to find out more about historical weather forecasts, and a visit to the Met Office website told me that they had a library of data that could be accessed. Within one quick email conversation I was furnished with a link to a host of archive weather data, with records often going back to the 1700s. In CSV format, these were simple to manipulate and visualise.
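As an illustration of why CSV is so easy to work with, here is a short Python sketch that averages the rainfall for one month across several years. The column names (`year`, `month`, `rainfall_mm`) and the figures are invented for the example; they are not real Met Office data, and the actual files may be laid out quite differently.

```python
import csv
import io
from statistics import mean

# Invented sample rows for illustration only; not real Met Office figures.
SAMPLE_CSV = """year,month,rainfall_mm
2008,6,60.0
2009,6,40.0
2010,6,20.0
2010,7,80.0
"""


def average_rainfall(csv_text, month):
    """Average rainfall (mm) for the given month number across all years."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [
        float(row["rainfall_mm"])
        for row in reader
        if int(row["month"]) == month
    ]
    # Return None rather than failing when there is no data for that month.
    return mean(values) if values else None


print(average_rainfall(SAMPLE_CSV, 6))
```

With a real file you would pass an opened file to `csv.DictReader` instead of wrapping a string in `io.StringIO`.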

Press offices are used to dealing with requests for information (it’s their job), and they are usually happy to help you meet deadlines.

3. FOI

FOI requests are for those tricky bits of data that an organisation is more reluctant to send out (for reasons of time, size, sensitivity and so on). I sent ONE FOI request, for crime stats to a police force, foolishly thinking this would be quicker than contacting the press office directly. It was not.

Use these if you do not need the information urgently (it can take up to a month from start to finish).

Interesting article on FOI Requests from Channel 4

4. CROWDSOURCE

Of course, carrying out your own research is one way of gathering data, but this project relied on the theory that “many hands make light work”.

I wanted to find out how much it would cost to see the various mainstage bands if you were to see them on their own headline tours. I could have spent DAYS trawling the internet ticketing sites (both UK and international) collecting the data. Instead I started a public Google Docs spreadsheet. Through the social networks I encouraged people to enter the prices of tickets they had recently bought. The database was soon a third full, and a chance message from an old friend (the man behind Ents24) filled in the rest by giving me access to their database.
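Once a shared spreadsheet like this fills up, the arithmetic is straightforward. The Python sketch below shows one possible way to do it, assuming each row is a band name and a ticket price: average the prices people entered for each band, then add up the cost of seeing a given line-up one headline show at a time. The band names, prices and function names are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Invented entries, standing in for rows typed into the shared spreadsheet.
entries = [
    ("Band One", 25.00),
    ("Band One ", 27.00),  # stray space, as often happens with hand-typed data
    ("Band Two", 40.00),
    ("Band Three", 18.50),
]


def average_price_per_band(rows):
    """Average the submitted ticket prices for each band."""
    prices = defaultdict(list)
    for band, price in rows:
        prices[band.strip()].append(price)
    return {band: mean(values) for band, values in prices.items()}


def cost_to_see_separately(lineup, averages):
    """Total cost of seeing each band on its own, plus any bands with no data."""
    known = [averages[band] for band in lineup if band in averages]
    missing = [band for band in lineup if band not in averages]
    return sum(known), missing


averages = average_price_per_band(entries)
total, missing = cost_to_see_separately(
    ["Band One", "Band Two", "Band Four"], averages
)
print(total, missing)
```

Returning the list of bands with no price makes the gaps visible, which is handy when the spreadsheet is only partly filled in.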

Google Docs is a fantastic way of collaborating and getting large jobs completed.

5. I GOT MY CALCULATOR OUT

This can be hard work if you are dealing with a lot of data, but for me it was feasible.

I wanted to assess the nationalities of the various bands, and compare the overall nationalities of the different line-ups. This involved a lot of searches on Myspace and Wikipedia (both still very useful resources for facts about bands) and using the visualisation software Tableau.
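I did the counting by hand and in Tableau, but the same tally can be sketched in a few lines of Python. The festivals and nationalities below are invented for illustration, and `nationality_shares` is just a hypothetical helper name.

```python
from collections import Counter

# Invented line-ups for illustration; the real nationalities came from manual research.
lineups = {
    "Festival A": ["UK", "UK", "USA", "Canada"],
    "Festival B": ["USA", "USA", "UK", "USA"],
}


def nationality_shares(nationalities):
    """Percentage of acts from each country, rounded to one decimal place."""
    counts = Counter(nationalities)
    total = sum(counts.values())
    return {country: round(100 * n / total, 1) for country, n in counts.items()}


for festival, nationalities in lineups.items():
    print(festival, nationality_shares(nationalities))
```

Working in percentages rather than raw counts makes line-ups of different sizes comparable.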

In retrospect I should have doubled this database up with the ticket prices one, and asked people to fill in the nationalities of the bands as well. Hindsight is a wonderful thing.

 

Want more? – DATA JOURNALISM: MORE THAN NUMBERS AND CHARTS

 

 





About Author

Caroline Beavon

A communication professional with 12 years’ journalism experience and a genuine passion for new technologies. An experienced blogger and social media user.

1 Comment

Robert

June 18, 2012 at 6:52 pm

THANK YOU! 🙂
