Saturday, July 28, 2012

Google Refine

Humans are really good at interpreting data, but they aren't fast. They can see the context in disjointed information and find the patterns that turn it into a structured, usable whole, but making the same correction over and over is very labor intensive.

With the recent attention to Big Data, incredibly large datasets produced as a byproduct of worldwide online activity, there is a concern that messy data will be impossible to resolve at that volume.

I recently came across Google Refine, which brings the power of human meaning-making to large datasets. It lets the user find patterns and apply changes to every matching value at once, making it easy to produce cleaner, more consistent data.
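To give a feel for what "find a pattern, then fix it everywhere" can look like, here is a rough Python sketch of one common approach: group values that differ only in case, punctuation, or word order, then rewrite each group to its most frequent spelling. This is my own illustration of the general idea, not Google Refine's actual code, and the function names (`fingerprint`, `find_clusters`, `normalize`) are made up for this example.

```python
import re
from collections import Counter, defaultdict


def fingerprint(value):
    """Reduce a value to a key that ignores case, punctuation, and word order."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    return " ".join(sorted(set(cleaned.split())))


def group_by_fingerprint(values):
    """Map each fingerprint key to the list of raw values that produce it."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return groups


def find_clusters(values):
    """Return the groups that contain more than one distinct spelling."""
    return [g for g in group_by_fingerprint(values).values() if len(set(g)) > 1]


def normalize(values):
    """Rewrite every value to the most frequent spelling in its group."""
    groups = group_by_fingerprint(values)
    canonical = {key: Counter(g).most_common(1)[0][0] for key, g in groups.items()}
    return [canonical[fingerprint(value)] for value in values]


if __name__ == "__main__":
    names = ["Acme Inc.", "acme inc", "Acme Inc.", "Inc. Acme", "Globex"]
    print(find_clusters(names))
    print(normalize(names))
```

The human still makes the judgment call: the code only proposes which values look like the same thing, and a person decides whether to merge them. That division of labor is the part that appeals to me.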

When I was a computer consultant, I was often asked to help clean up dirty data. I knew all sorts of tricks in Microsoft Office, but they pale in comparison to what I see in Google Refine. I'm going to experiment with it and see where it can be an advantage to Heretics who are looking to Big Data as a powerful new tool.

NOTE: Google Refine has since become an open source project named OpenRefine. You can view the project and its documentation here: https://github.com/OpenRefine/OpenRefine
