The previous blogs in this series discussed the top 5 pitfalls of traditional data warehouses and defined the Operational Data Warehouse (ODW) as a potential solution. Below is a list of my Top 10 desirable benefits of an effective ODW:
- Current: Continuously updating the data throughout the day, whether by micro-batches or streamed singleton updates, provides the most current information for analytics-based decision making.
- Fast: Changes to ODW data need to be made with the lowest possible performance penalty. Columnar data blocks that maintain their own min-max value metadata eliminate the overhead of creating indexes that must be updated with every change, as is required in traditional row-based databases.
- Scalable: An ODW needs to provide scalability in two dimensions. Vertical scalability enables workloads to take advantage of more CPU and storage capacity on a single system. Once you have saturated the hardware capacity of a single system, the ability to scale out to a cluster of systems lets the ODW grow to handle larger databases and more users.
- Secure: The increasing level of cybercrime and regulation of data privacy means that even “internal” systems must be secured. A good ODW needs to offer built-in support for advanced encryption, auditing, role-based security and data masking.
- Flexible: The days when an organization could standardize on a single platform are over. The ODW needs to offer the flexibility to be deployed on-premises (on Linux, Windows, Hadoop Clusters) or in the cloud (on AWS, Microsoft Azure and beyond).
- Consistent: Some databases sacrifice query integrity for speed. A good ODW needs to provide row-level locking and full read consistency for running queries even as the underlying data changes.
- Robust: An ODW needs to deliver enterprise-level resiliency and manageability. This translates to having solid backup, recovery, failover and replication capabilities.
- Economical: The total cost of ownership of a database technology used to support a particular business case can be affected by several factors. One is the ability to run on standard servers and avoid esoteric appliances. Others include flexible deployment models to match different business needs, the flexibility to scale up and down according to performance requirements, and the option to use differently sized components (compute, storage) to optimize operating efficiency.
- Interoperable: A good ODW needs to provide open APIs and standards such as ODBC and ANSI SQL so that it can work with the multitude of query tools an organization might use. Many organizations use more than 20 different visualization and query tools.
- Connected: The ability to consume or ingest data at high speed is a critical ODW requirement. If you cannot load your data in a reasonable time, you end up working with summary data or, worse, stale data.
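
To make the "Fast" point above more concrete, here is a minimal Python sketch of how per-block min-max metadata can stand in for an index. It is an illustration only, not the implementation of any particular product: the `ColumnBlock` and `Column` classes, the block capacity of 4 and the integer-only values are all assumptions made for the example. An insert only touches the min and max of the block it lands in, and a range query skips any block whose min-max range cannot contain a match.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ColumnBlock:
    """Hypothetical fixed-capacity block of one column's values plus min-max metadata."""
    capacity: int = 4
    values: List[int] = field(default_factory=list)
    min_value: Optional[int] = None
    max_value: Optional[int] = None

    def is_full(self) -> bool:
        return len(self.values) >= self.capacity

    def append(self, value: int) -> None:
        # Maintaining the metadata is two comparisons per insert;
        # there is no separate index structure to update.
        self.values.append(value)
        self.min_value = value if self.min_value is None else min(self.min_value, value)
        self.max_value = value if self.max_value is None else max(self.max_value, value)


class Column:
    """Hypothetical column stored as a list of blocks (an assumption for this sketch)."""

    def __init__(self, block_capacity: int = 4) -> None:
        self.block_capacity = block_capacity
        self.blocks: List[ColumnBlock] = []

    def insert(self, value: int) -> None:
        # Start a new block when there is none yet or the last one is full.
        if not self.blocks or self.blocks[-1].is_full():
            self.blocks.append(ColumnBlock(capacity=self.block_capacity))
        self.blocks[-1].append(value)

    def range_query(self, low: int, high: int) -> Tuple[List[int], int]:
        """Return the values in [low, high] and the number of blocks actually scanned."""
        matches: List[int] = []
        scanned = 0
        for block in self.blocks:
            # Skip blocks whose min-max range cannot overlap [low, high].
            if block.max_value < low or block.min_value > high:
                continue
            scanned += 1
            matches.extend(v for v in block.values if low <= v <= high)
        return matches, scanned
```

For example, after inserting the values 1 through 12 into a `Column` with a block capacity of 4, `range_query(6, 7)` only has to scan the one block holding 5 through 8. Block skipping works best when incoming data is roughly ordered on the queried column, as is often the case with timestamps in a continuously loaded warehouse.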
I would be very interested to hear which benefits you value the most, or which others I could have included. Email me at Pradeep.firstname.lastname@example.org if you would like to share your views.