Sometimes I’ll start a data project, and during the process one of several things may happen:
- I lose interest
- something more important comes along
- I realize it’s just not “working”
The last of these happened as I was working on a piece for the latest Information is Beautiful challenge, which involved chomping through a large and very interesting data set regarding Hollywood movies.
These challenges encourage you to use a data set provided by the website to create either a visualization, a napkin drawing (sketch) or an interactive piece.
After several hours of looking through the data, looking for interesting angles, and hunting for more data I could add to the set (via Google Refine), I settled on looking at the connections between the actors involved in the top films.
Ask anyone who’s watched a film with me and they’ll tell you that I have a VERY annoying habit of opening Wikipedia to find out where I’ve seen a particular actor before. It’s annoying in everyday life, but for this – it was a dream.
(see below for more details and why I eventually shelved it)
What it Means
Yes, it’s a bit of a headache, isn’t it? The original dataset featured the films coming out of the major studios over a certain time period.
I added actor information to the dataset from Freebase (within Google Refine) and worked out which actors had appeared in the most films over the time period. I then cross-referenced the films and created the above chart.
I had originally intended to give each film a different colour but this became unworkable – so I limited the colours to the films featuring 3 or more actors on the chart. The rest I coloured in grey.
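For anyone who wants to try something similar, the steps above could be sketched roughly like this in Python. This is not the original workflow (the actor data was added in Google Refine), and everything here is an assumption made for illustration: the film titles, the actor names, the function names and the tiny sample size are all made up.

```python
from collections import Counter

# Hypothetical sample data: film title -> credited actors.
# The real data came from the challenge dataset plus Freebase.
films = {
    "Film A": ["Actor 1", "Actor 2", "Actor 3"],
    "Film B": ["Actor 1", "Actor 2"],
    "Film C": ["Actor 1", "Actor 4"],
    "Film D": ["Actor 2", "Actor 3", "Actor 5"],
}


def top_actors(films, n):
    """Return the n actors who appear in the most films."""
    # set(cast) guards against an actor being listed twice for one film.
    counts = Counter(actor for cast in films.values() for actor in set(cast))
    return [actor for actor, _ in counts.most_common(n)]


def film_links(films, chosen):
    """Cross-reference: for each film, list the chosen actors who appear in it."""
    chosen = set(chosen)
    links = {}
    for title, cast in films.items():
        on_chart = sorted(chosen.intersection(cast))
        if on_chart:  # skip films with nobody from the chart
            links[title] = on_chart
    return links


def film_colours(links, min_actors=3):
    """Mark a film 'coloured' if it features min_actors or more chart actors, else 'grey'."""
    return {
        title: "coloured" if len(actors) >= min_actors else "grey"
        for title, actors in links.items()
    }


chosen = top_actors(films, 3)      # the post used the top 26
links = film_links(films, chosen)
colours = film_colours(links)
```

With this sample, only "Film A" features three of the chosen actors, so it is the only film that would get its own colour. Everything else falls back to grey.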
Why It Didn’t Work
For one, it was too damn complicated – no chart should take 3 paragraphs to explain. Kinda defeats the point, right?
Secondly, the choice of resulting entries was nonsense. Here is how the list was whittled down:
- Original Data (limited to major studios)
- Actors added to each film (according to Wikipedia, via Freebase)
- Top 26 hardest-working actors selected (based on original list, so ignoring independent or smaller-budget films)
So why am I publishing it here?
Because I spent all day on it, I like the IDEA and the design, and I wanted to share my experience of knowing when to walk away.
I’d love to hear your experiences of when you’ve had to walk away – and why.