Why the Public Data Explorer?
Following on the heels of efforts in the US and UK to get public data online and mined in endless ways, Google Labs has now launched the Public Data Explorer. According to Google, this project "makes large datasets easy to explore, visualize and communicate."
As far as brand new ideas and technologies go, this one's a bit of an also-ran (and is built on tech that Google acquired in 2007). Everyone and their brother is going after the issue of making sense out of massive public datasets right now. Why now? Because we finally have two critical pieces of the puzzle:
- The raw computing power to handle extremely large public datasets
- The in-development protocols for the semantic web, which make it possible to break these datasets down into context that computers can make sense of
So No Big Deal?
Well, don't be too quick to judge. Visualizing and exploring large public datasets is a young field with plenty of room for advancement. A lot of the charts today's tools generate are complex and incomprehensible to the average person, even if they're exciting as bleeding-edge technology goes.
There's plenty of room to innovate in this space, and in ways no one has thought of yet. So an organization with the size and collective mind-power of Google has a shot at making leaps that could amaze us all. Or not. It's all in the implementation.
Exploring the Explorer
Right now, you can go to the Google Public Data Explorer page and choose one of about a dozen datasets that Google's currently using for this project (they're looking for more, and you can suggest additional public datasets). Click on the data that you want to explore and you'll go to the default visualization for that data:
The Google Public Data Explorer's view of the World Bank's data on World Development Indicators.
In the case of the World Bank's data on World Development Indicators, I'll narrow the countries down to compare information from Australia, Belgium, Canada and Hungary by clicking the appropriate country boxes on the left. Then I might change the Y axis to show the percentage of the total number of births in those countries attended by skilled health staff on a linear scale, and leave the X axis to show the life expectancy at birth.
The Google Public Data Explorer's view focusing on births attended by skilled health staff.
When I hit the play button, I can watch the life expectancies of my selected countries change from 1987 through 2006, slowly rising (pushing right along the X axis) over the years. Clicking the graph type indicators lets me choose what type of graph (scatter, line, etc.) I'm dealing with. If I hover over a data point in the scatter graph, it tells me what country it's associated with.
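For the curious, here's a rough sketch of what that play button is doing conceptually: keep only the countries I ticked, group the rows by year, and hand back one "frame" of points per year in order. This is not Google's code or any real API. The `Record` type, the `frames` function, and the sample rows are all hypothetical, and the numbers are placeholders, not real World Bank figures.

```python
from collections import defaultdict
from typing import Iterable, Iterator, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """One hypothetical row: a country's two indicators for one year."""
    country: str
    year: int
    life_expectancy: float      # plotted on the X axis
    attended_births_pct: float  # plotted on the Y axis


Point = Tuple[str, float, float]


def frames(records: Iterable[Record],
           countries: Set[str]) -> Iterator[Tuple[int, List[Point]]]:
    """Yield (year, points) in year order for the selected countries only."""
    by_year = defaultdict(list)
    for r in records:
        if r.country in countries:  # the checkboxes on the left
            by_year[r.year].append(
                (r.country, r.life_expectancy, r.attended_births_pct))
    for year in sorted(by_year):    # the play button stepping forward
        yield year, sorted(by_year[year])


# Placeholder rows for illustration only; NOT real World Bank data.
SAMPLE = [
    Record("Canada", 1988, 2.0, 20.0),
    Record("Hungary", 1987, 1.0, 10.0),
    Record("Canada", 1987, 1.5, 15.0),
    Record("Brazil", 1987, 9.0, 90.0),
]

if __name__ == "__main__":
    for year, points in frames(SAMPLE, {"Canada", "Hungary"}):
        print(year, points)
```

Each yielded frame is what a scatter graph would draw for that year. Playing the animation just means drawing the frames one after another.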
The really exciting moment will be when one of these projects produces graphs that are easily read at a glance without needing heavy interpretation. But given how young the field is at this point, people with the knowledge to interpret the results themselves can certainly get immediate benefit. It just doesn't really matter yet which of the tools they choose. No one's broken ahead of the pack so far.
What do you think? Am I getting too jaded? Or is the Google Public Data Explorer the best thing since sliced bread?