One of the things I talk to organizations about regularly when they’re trying to get their heads around big data and analytics is the importance of data quality before they start with analytics.  Here, I am reminded of the old database acronym of GIGO (garbage in, garbage out), and yet I am astounded by how many companies still skip this most important step.


For example, the other day I was talking to a business professional who could not understand why response rates to campaigns and activities were so low, nor why the company couldn’t really use analytics to gain competitive advantage.  A quick investigation of their data and systems soon showed that a large share of the data they were using was out of date, badly formatted or simply erroneous.

Today we are blessed, and perhaps at the same time cursed, by the sheer number of solutions that promise to make it easy for marketers to get their message out to an audience, and yet many of them do nothing to address the fundamental issue of ensuring good data quality.  And while there are also umpteen services that will append records to your database or automatically populate its fields for you, professionals then depend on those services to supply good-quality data, which is not always guaranteed.  As I have stated before, data is such a commodity that it is bought, sold and peddled around the world like mad.  It can soon become stale or end up in the wrong fields as it is copied and pasted from Excel sheet to Excel sheet.  And yet, to turn it into an asset, you need to be able to get good insights from it and take the right action as a result.

I would also argue that very few businesses run manual data quality checks on their data.  And yet they will spend thousands of dollars on maintaining servers and systems that are still full of poor data.  It’s like owning a fast sports car and washing and polishing it every morning to make it look its best, yet never servicing it, leaving old oil in it and filling it with the cheapest, most underperforming fuel.  What’s the point of that?

The reasons why pristine data counts for a lot are numerous.  First, you’re not falling at the first hurdle.  There’s no sense in crafting the best advertising campaign if the audience you’re sending it to just doesn’t exist or doesn’t see it.  Moreover, for those in marketing, good data means potentially great leads for the sales team to follow up on.  Nothing annoys a good salesperson more than non-existent or poor-quality leads.  With good leads, conversion rates will go up and you won’t need to keep pumping your database with replacement data.

Second, good data means less time spent hunting around for the right phone number, email or mailing address.  How many times have you seen customer records with badly formatted phone numbers or erroneous email addresses?  It is also important to recognize that systems set up to accept US mailing addresses or phone numbers must work beyond the borders of North America too.  Postal codes outside the US are not always numeric, and telephone area codes are not always three digits long.  Having a system that does not truncate field values to fit a US model is therefore imperative.
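To make that concrete, here is a minimal Python sketch of validation that does not assume a US shape.  The three postal patterns are simplified illustrations rather than authoritative formats, the function names are my own, and the eight-digit lower bound on phone numbers is an assumption (the fifteen-digit upper bound follows the E.164 numbering plan).

```python
import re
from typing import Optional

# Simplified, illustrative patterns only; real postal formats have more rules.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),
}


def is_valid_postal_code(code: str, country: str) -> bool:
    """Check a postal code against its own country's pattern, if we have one."""
    cleaned = code.strip().upper()
    pattern = POSTAL_PATTERNS.get(country.upper())
    if pattern is None:
        # Unknown country: accept any non-empty value instead of forcing a US shape.
        return bool(cleaned)
    return pattern.fullmatch(cleaned) is not None


def normalize_phone(raw: str) -> Optional[str]:
    """Keep a leading '+' and all digits; never truncate to ten digits."""
    stripped = raw.strip()
    digits = re.sub(r"\D", "", stripped)
    # Lower bound of 8 is an assumption for this sketch; 15 is the E.164 maximum.
    if not 8 <= len(digits) <= 15:
        return None
    prefix = "+" if stripped.startswith("+") else ""
    return prefix + digits
```

The point of the design is that a value which does not match the US pattern is checked against its own country’s rules, or simply kept as entered, rather than being cut down to fit.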

But the biggest reason why all companies should insist on good data quality is, in the eyes of many business leaders, the most important one: money.  Good data means less money wasted on poor campaigns and less money spent on trying to fix the issue later on.  Plus, you’ll spend less money on your staff hunting for the right data.  And what’s more, good data will lead to better conversion rates more quickly, so you can soon find out what is working and what isn’t, and then choose to spend your marketing and operational dollars, pounds, euros and yen more wisely on what counts and works first time.

Sure, you might argue that all this is good common sense, but I tell you that large chunks of poor data still exist in many systems out there in enterprises large and small.  It’s time to stop watching bad data rates just go up and start actively flushing bad data out of your systems so you can get the results you need to act on.

There are a few tricks here.  First, don’t be afraid to delete bad data.  I know many companies don’t like deleting data at all, but what is the point of having your systems full of incorrect information?  Second, design systems that capture and hold the most pertinent information, nothing more.  It’s far better to have 15 fields that are all filled out correctly than 150 fields that no one has the time or the will to complete and that therefore remain empty.  And third, learn quickly from your mistakes and adapt behaviours accordingly.  There is no point doing the same thing over and over again expecting different results.  That is the definition of insanity, according to a saying often attributed to Einstein.  And businesses cannot afford to be labelled mad.
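One simple way to find out which fields nobody fills in is to measure the fill rate of each one.  The sketch below is only an illustration: it assumes records are held as plain Python dictionaries, and the function names and the 50% threshold are hypothetical choices of mine, not a recommendation.

```python
def is_blank(value) -> bool:
    """Treat None, empty strings and whitespace-only strings as not filled in."""
    return value is None or str(value).strip() == ""


def fill_rates(records, fields):
    """Return, for each field, the share of records holding a non-blank value."""
    total = len(records)
    rates = {}
    for field in fields:
        filled = sum(1 for record in records if not is_blank(record.get(field)))
        rates[field] = filled / total if total else 0.0
    return rates


def sparse_fields(rates, threshold=0.5):
    """List fields whose fill rate is below the (hypothetical) threshold."""
    return sorted(field for field, rate in rates.items() if rate < threshold)


# Made-up sample records, for illustration only.
records = [
    {"name": "Ada", "email": "ada@example.com", "fax": ""},
    {"name": "Grace", "email": " ", "fax": None},
    {"name": "Linus", "email": "linus@example.com"},
    {"name": "", "email": "x@example.com", "fax": None},
]
rates = fill_rates(records, ["name", "email", "fax"])
print(rates)
print(sparse_fields(rates))
```

Fields that show up as sparse are candidates for removal from the form or the schema, which leaves a shorter set of fields that people will complete.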

So, before you start on your big data journey, before you start joining data together and before you start wanting to analyze data to act on it, it’s time to make sure your data is healthy and kept in good shape.  If you want to run your business like a fast sports car, make sure you tune it, service it and give it the best care possible.  And that starts with good data quality.

Why not start today and talk to us at Actian – we’ll help you understand your systems and data and how you can get the most value from it in the shortest of timescales.