Over the past two decades, data warehouse solutions have evolved and diverged to address a myriad of use cases. Meanwhile, the pace of business continues to accelerate, making it harder to remain competitive. These new demands can stretch the abilities of traditional data warehouses. Below are five common pitfalls that can trip them up.
As the demand for organizations to operate in real time, or in the moment, increases, data warehouses need to deliver ever more current data. SQL-on-Hadoop databases commonly fail to handle continuous streams of updates because the underlying file system is optimized for infrequent batch updates over a moving window of historical data. A lack of current data can mean businesses fail to respond to threats and opportunities fast enough to stay competitive.
Ever-tightening privacy regulations such as GDPR and the increasing frequency of data breaches have made security a front-page reputational issue. Low-end databases can lack advanced encryption features for data in flight and at rest. Column-level data masking, which hides sensitive fields from users who are not authorized to see them, is an advanced capability that many databases lack, making this a serious pitfall.
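To make column-level masking concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not how any particular database implements the feature: the policy names (`MASKED_COLUMNS`, `PRIVILEGED_ROLES`), the role names, and the "keep the last four characters" rule are all assumptions made for this example.

```python
from typing import Any, Dict

# Hypothetical policy for illustration: which columns are sensitive,
# and which roles are allowed to see them unmasked.
MASKED_COLUMNS = {"email", "ssn"}
PRIVILEGED_ROLES = {"compliance_officer"}


def mask_value(value: Any) -> str:
    """Replace all but the last four characters with asterisks."""
    text = str(value)
    if len(text) <= 4:
        return "*" * len(text)
    return "*" * (len(text) - 4) + text[-4:]


def apply_column_masking(row: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return a copy of the row with sensitive columns masked for this role."""
    if role in PRIVILEGED_ROLES:
        return dict(row)
    return {
        column: mask_value(value) if column in MASKED_COLUMNS else value
        for column, value in row.items()
    }


row = {"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}
print(apply_column_masking(row, "analyst"))             # sensitive columns masked
print(apply_column_masking(row, "compliance_officer"))  # full values visible
```

In a database that supports this capability natively, the masking rule is attached to the column itself, so every query path enforces it without relying on application code like the above.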
There are many reasons why an analytics query can be slow. One is that the DBA did not anticipate the query and never defined a suitable index, which makes that database a poor fit for ad-hoc queries. The problem is compounded by the current trend towards citizen data analysts, where users with a limited understanding of the underlying data structures can bring a database to its knees.
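The effect of a missing index is easy to see on a small scale. The sketch below uses Python's built-in `sqlite3` module rather than a data warehouse, and the `orders` table, its columns, and the index name are made up for the example. It asks SQLite for its query plan before and after an index is created on the filtered column.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# An ad-hoc query that filters on a column the schema designer did not index.
QUERY = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"


def query_plan(connection: sqlite3.Connection, sql: str) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


plan_before = query_plan(conn, QUERY)  # no usable index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the whole table and the second reports a search using `idx_orders_customer`. On a thousand rows the difference is invisible; on billions of rows it is the difference between seconds and hours.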
As data volumes and data types grow, adding capacity can become expensive. This is especially true of appliance-based solutions such as IBM Netezza, where adding capacity can mean buying a bigger appliance. More open Hadoop- and cloud-based solutions that use commodity servers and operating systems have become popular as a way to reduce infrastructure costs, but they bring hidden costs of their own, such as expensive specialist skills and lock-in.
Some databases simply demand a great deal of database development and administration skill. Oracle and Teradata fall into this camp. Cloud-based database services address this complexity to some extent, so there is hope.
Every organization has different priorities, so each may order these five pitfalls differently. Stay tuned for my next blog in the series, “What is an Operational Data Warehouse and why is it the next big thing?”, where I describe the next big thing in data analytics, its benefits and much more.