Data Intelligence

Why Data Privacy is Essential for Successful Data Governance

Actian Corporation

April 8, 2022

Data Privacy essential for Data Governance

Data Privacy is a priority for organizations that wish to fully exploit their data. Considered the foundation of trust between a company and its customers, Data Privacy is the pillar of successful data governance. Understand why in this article.

Whatever the sector of activity or the size of a company, data now plays a key role in the ability of organizations to adapt to their customers, ecosystem, and even competitors. The numbers speak for themselves: according to a study by Stock Apps, the global Big Data market was worth $215.7 billion in 2021 and is expected to grow 27% in 2022 to exceed $274 billion.

Companies are generating such large volumes of data that data governance has become a priority. Indeed, a company's data is vital for identifying its target audiences, creating buyer personas, providing personalized responses to its customers, and optimizing the performance of its marketing campaigns. However, this is not the only issue: while data governance makes it possible to create value from enterprise data assets, it also ensures the proper administration of data confidentiality, also known as Data Privacy.

Data Privacy vs. Data Security: Two Not-So-Different Notions

Data Privacy is one of the key aspects of Data Security. Although different, they take part in the same mission: building trust between a company and the customers who entrust it with their personal data.

On the one hand, Data Security is the set of means implemented to protect data from internal or external threats, whether malicious or accidental (strong authentication, information system security, etc.).

Data Privacy, on the other hand, is a discipline that concerns the treatment of sensitive data, not only personal data (also called PII, for Personally Identifiable Information) but also other confidential data (certain financial data, intellectual property, etc.). Data Privacy is furthermore clearly defined in the General Data Protection Regulation (GDPR), which came into force in Europe in 2018 and has since helped companies redefine responsible and efficient data governance.

Data confidentiality has two main aspects. The first is controlling access to the data – who is allowed to access it and under what conditions. The second aspect of data confidentiality is to put in place mechanisms that will prevent unauthorized access to data.

Why is Data Privacy so Important?

While data protection is essential to preserve this valuable asset and to create the conditions for rapid data recovery in the event of a technical problem or malicious attack, data privacy addresses another equally important issue.

Consumers are suspicious of how companies collect and use their personal information. In a world full of options, customers who lose trust in one company can easily buy elsewhere. To cultivate trust and loyalty, organizations must make data privacy a priority. Indeed, consumers are becoming increasingly aware of data privacy. The GDPR has played a key role in the development of this sensitivity: customers are now very vigilant about the way their personal data is collected and used.

Because digital services are constantly developing, companies operate in a world of hyper-competition where customers will not hesitate to switch to a competitor if a company has not done everything possible to preserve the confidentiality of their data. This is the main reason why Data Privacy is so crucial.

Why is Data Privacy a Pillar of Data Governance?

Data governance is about ensuring that data is of sufficient quality and that access to it is managed appropriately. The objective is to reduce the risk of misuse, theft, or loss. As such, data privacy should be understood as one of the foundations of sound and effective data governance.

Even though data governance addresses data in a much broader way, it cannot succeed without a clear understanding of the levers available to ensure data confidentiality.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

Guide to Data Quality Management #4 – Data Catalog Contribution to DQM

Actian Corporation

April 4, 2022

Contribution Data Quality Management

Data Quality refers to an organization's ability to maintain the quality of its data over time. If we were to take some data professionals at their word, improving Data Quality is the panacea to all our business woes and should therefore be the top priority.

We believe this should be nuanced: Data quality is a means amongst others to limit the uncertainties of meeting corporate objectives.

In this series of articles, we will go over everything data professionals need to know about Data Quality Management (DQM):

  1. The nine dimensions of data quality
  2. The challenges and risks associated with data quality
  3. The main features of Data Quality Management tools
  4. The data catalog contribution to DQM

A Data Catalog is not a DQM Tool

An essential point is that a data catalog should not be considered a Data Quality Management tool per se.

First of all, one of the core principles at the heart of Data Quality is that controls should ideally take place in the source system. Running these controls solely in the data catalog – rather than at the source and in the data transformation flow – increases the overall cost of the undertaking.

Furthermore, a data catalog must be both comprehensive and minimally intrusive to facilitate its rapid deployment within the company. This is simply incompatible with the complex nature of data transformations and the multitude of tools used to carry them out.

Lastly, a data catalog must remain a simple tool to understand and use.

How Does a Data Catalog Contribute to DQM?

While the data catalog isn’t a Data Quality tool, its contribution to the upkeep of Data Quality is nonetheless substantial. Here is how:

  • A data catalog enables data consumers to easily understand metadata and avoid hazardous interpretations of the data, echoing the clarity dimension of quality.
  • A data catalog gives a centralized view of all the available enterprise data. Data Quality information is therefore metadata like any other that carries value and should be made available to all; it is easy to interpret and extract, echoing the dimensions of accuracy, validity, consistency, uniqueness, completeness, and timeliness.
  • A data catalog has data traceability capabilities (Data Lineage), echoing the traceability dimension of quality.
  • A data catalog usually allows direct access to the data sources, echoing the availability dimension of quality.

The DQM Implementation Strategy

The following table details how Data Quality is taken into account depending on the different solutions on the market:

As stated above, quality testing should by default take place directly in the source system. Quality test integration in a data catalog can improve user experience, but it isn’t a must in light of its limitations – as Data Quality isn’t integrated into the transformation flow.

That said, when the system stack becomes too complex and we need, for example, to consolidate data from different systems with different functional rules, a Data Quality tool becomes unavoidable.

The implementation strategy will depend on use cases and company objectives. It is nonetheless advisable to implement Data Quality incrementally in order to:

  1. Ensure the source systems have put in place the relevant quality rules;
  2. Implement a data catalog to improve quality on the dimensions of clarity, traceability, and/or availability;
  3. Integrate Data Quality in the transformation flows with a specialized tool while importing this information automatically into the data catalog via APIs (see the sketch below).
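To illustrate that last step, here is a purely hypothetical C# sketch of how a DQM tool could push a quality score into a data catalog through a REST API. The base URL, endpoint, and payload shape are invented for illustration; a real integration would use the catalog vendor's documented API.

using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Hypothetical publisher: attaches a quality metric to a catalog dataset.
class QualityMetricPublisher
{
    private readonly HttpClient _http = new HttpClient
    {
        // Invented base address, for illustration only.
        BaseAddress = new Uri("https://catalog.example.com/api/")
    };

    public async Task PublishAsync(string datasetId, string dimension, double score)
    {
        // Hypothetical endpoint and payload shape.
        var response = await _http.PostAsJsonAsync(
            $"datasets/{datasetId}/quality-metrics",
            new { dimension, score, measuredAt = DateTime.UtcNow });
        response.EnsureSuccessStatusCode();
    }
}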

Conclusion

Data Quality refers to the ability of a company to maintain the quality of its data over time. We define it through the prism of nine of the sixty dimensions described by DAMA International: completeness, accuracy, validity, uniqueness, consistency, timeliness, traceability, clarity, and availability.

As a data catalog provider, we reject the idea that a data catalog is a full-fledged quality management tool. In fact, it is only one of several ways to contribute to the improvement of Data Quality, notably through the dimensions of clarity, availability, and traceability.

Get our Data Quality Management Guide for Data-Driven Organizations

For more information on Data Quality and DQM, download our free guide: “A Guide to Data Quality Management” now! Download the eBook.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

Guide to Data Quality Management #3 – The Main Features of DQM Tools

Actian Corporation

April 3, 2022

Data Quality Management Tools Features

Data Quality refers to an organization’s ability to maintain the quality of its data over time. If we were to take some data professionals at their word, improving Data Quality is the panacea to all our business woes and should therefore be the top priority.

We believe this should be nuanced: Data Quality is a means amongst others to limit the uncertainties of meeting corporate objectives.

In this series of articles, we will go over everything data professionals need to know about Data Quality Management (DQM):

  1. The nine dimensions of Data Quality
  2. The challenges and risks associated with Data Quality
  3. The main features of Data Quality Management tools
  4. The Data Catalog contribution to DQM

One way to better understand the challenges of Data Quality is to look at the existing Data Quality solutions on the market.

From an operational point of view, how do we identify and correct Data Quality issues? What features do Data Quality Management tools offer to improve Data Quality?

Without going into too much detail, let's illustrate the capabilities of a Data Quality Management tool through the main evaluation criteria of Gartner's Magic Quadrant for Data Quality Solutions.

Connectivity

A Data Quality Management tool has to be able to gather and apply quality rules on all enterprise data (internal, external, on-prem, cloud, relational, non-relational, etc.). The tool must be able to plug into all relevant data in order to apply quality rules.

Data Profiling, Data Measuring, and Data Visualization

You cannot correct Data Quality issues if you cannot detect them first. Data profiling enables IT and business users to assess the quality of the data in order to identify and understand the Data Quality issues.

The tool must be able to carry out what is outlined in The Nine Dimensions of Data Quality to identify quality issues throughout the key dimensions for the organization.
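As a rough illustration of what profiling covers, the following sketch computes two basic per-column metrics on an ADO.NET DataTable: a fill rate (completeness) and a distinct-value count (a first hint for uniqueness). It is a minimal example, not a vendor feature; real profiling tools add type inference, pattern analysis, outlier detection, and much more.

using System;
using System.Data;
using System.Linq;

static class Profiler
{
    public static void Profile(DataTable table)
    {
        foreach (DataColumn column in table.Columns)
        {
            var values = table.Rows.Cast<DataRow>().Select(r => r[column]);
            int total = table.Rows.Count;
            // Completeness: how many cells are neither NULL nor empty.
            int filled = values.Count(v => v != DBNull.Value && v?.ToString() != "");
            // Uniqueness hint: how many distinct non-null values the column holds.
            int distinct = values.Where(v => v != DBNull.Value)
                                 .Select(v => v.ToString())
                                 .Distinct()
                                 .Count();
            Console.WriteLine(
                $"{column.ColumnName}: fill rate {100.0 * filled / Math.Max(total, 1):F1}%, " +
                $"{distinct} distinct values");
        }
    }
}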

Monitoring

The tool must be able to monitor how the quality of the data evolves over time and alert the people in charge once a defined threshold is crossed.

Data Standardization and Data Cleaning

Then comes the data cleaning phase. The aim here is to provide data cleaning functionalities in order to apply norms or business rules that alter the data (format, values, layout).

Data Matching and Merging

The aim is to identify and delete duplicates that can be present within or between datasets.

Address Validation

The aim is to standardize addresses that could be incomplete or incorrect.

Data Curation and Enrichment

A Data Quality Management tool should enable the integration of data from external sources to improve completeness, thereby adding value to the data.

Developing and Deploying Business Rules

The tool should enable the creation, deployment, and management of business rules, which can then be used to validate the data.

Problem Resolution

The quality management tool helps both IT and business users to assign, escalate, solve, and monitor Data Quality problems.

Metadata Management

The tool should also be capable of capturing and reconciling all the metadata related to the Data Quality process.

User-Friendliness

Lastly, a solution should be able to adapt to the different roles within the company, and specifically to non-technical business users.

Get our Data Quality Management Guide for Data-Driven Organizations

For more information on Data Quality and DQM, download our free guide: “A Guide to Data Quality Management” now! Download the eBook.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

Guide to Data Quality Management #2 – The Challenges With Data Quality

Actian Corporation

April 2, 2022

The 9 Dimensions of Data Quality

Data Quality refers to an organization’s ability to maintain the quality of its data over time. If we were to take some data professionals at their word, improving Data Quality is the panacea to all our business woes and should therefore be the top priority. 

We believe this should be nuanced: Data Quality is one means, among others, to limit the uncertainties of meeting corporate objectives. 

In this series of articles, we will go over everything data professionals need to know about Data Quality Management (DQM):

  1. The nine dimensions of Data Quality
  2. The challenges and risks associated with Data Quality
  3. The main features of Data Quality Management tools
  4. The Data Catalog contribution to DQM

The Challenges of Data Quality for Organizations

Initiatives for improving the quality of data are typically implemented by organizations to meet compliance requirements and reduce risk. They are indispensable for reliable decision-making. There are, unfortunately, many stumbling blocks that can hinder Data Quality improvement initiatives. Below are some examples:

  • The exponential growth of the volume, speed, and variety of the data makes the environment more complex and uncertain.
  • Increasing pressure from compliance regulations such as GDPR, BCBS 239, or HIPAA.
  • Teams are increasingly decentralized, and each has its domain of expertise.
  • IT and data teams are snowed under and don’t have time to solve Data Quality issues.
  • The data aggregation processes are complex and time-consuming.
  • It can be difficult to standardize data between different sources.
  • Auditing changes across systems is complex.
  • Governance policies are difficult to implement.

Having said that, there are also numerous opportunities to seize. High-quality data enables organizations to drive innovation with artificial intelligence and deliver a more personalized customer experience, assuming there is enough quality data.

Gartner has actually forecast that, through 2022, 85% of AI projects will deliver erroneous outcomes as a result of bias in the data, in the algorithms, or in the teams in charge of data management.

Reducing the Level of Risk by Improving the Quality of the Data

Poor Data Quality should be seen as a risk and quality improvement software as a possible solution to reduce this level of risk.

Processing a Quality Issue

If we accept the notion above, any quality issue should be addressed in several phases:

1. Risk Identification: This phase consists of seeking out, recognizing, and describing the risks that may help or prevent the organization from reaching its objectives – here, in part, a lack of Data Quality.

2. Risk Analysis: The aim of this phase is to understand the nature of the risk and its characteristics. It considers factors such as the likelihood of events and their consequences, and the nature and importance of those consequences. Here, we should seek to identify what has caused the poor quality of the marketing data. We could cite, for example:

  • A poor user experience of the source system leading to typing errors;
  • A lack of verification of the completeness, accuracy, validity, uniqueness, consistency, or timeliness of the data;
  • A lack of simple means to ensure the traceability, clarity, and availability of the data;
  • The absence of a governance process and of the involvement of business teams.

3. Risk Evaluation: The purpose of this phase is to compare the results of the risk analysis with the established risk criteria. It helps establish whether further action is needed for the decision-making – for instance keeping the current means in place, undertaking further analysis, etc.

Let’s focus on the nine dimensions of Data Quality and evaluate the impact of poor quality on each of them:

The values for the levels of probability and severity should be defined by the main stakeholders, who know the data in question best. 
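As a simple illustration, probability and severity levels can be combined into a score per dimension (probability times severity), using scales agreed with the stakeholders, for example 1 = low to 5 = high. The values in the sketch below are invented for the example.

using System;
using System.Collections.Generic;

// Illustrative only: risk score = probability x severity for each quality dimension.
record DimensionRisk(string Dimension, int Probability, int Severity)
{
    public int Score => Probability * Severity;
}

static class RiskMatrix
{
    static void Main()
    {
        var risks = new List<DimensionRisk>
        {
            new("Completeness", Probability: 4, Severity: 3),
            new("Validity",     Probability: 3, Severity: 4),
            new("Uniqueness",   Probability: 2, Severity: 2),
        };
        foreach (var r in risks)
            Console.WriteLine($"{r.Dimension}: risk score {r.Score}");
    }
}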

4. Risk Processing: This phase aims to set out the available options for reducing risk and to roll them out. It also involves assessing the usefulness of the actions taken and determining whether the residual risk is acceptable – and, if it is not, considering further processing.

Therefore, improving the quality of the data is clearly not a goal in itself:

  • Its cost must be evaluated based on company objectives.
  • The treatments to be implemented must be evaluated through each dimension of quality.

Get our Data Quality Management Guide for Data-Driven Organizations

For more information on Data Quality and DQM, download our free guide: “A Guide to Data Quality Management” now! Download the eBook


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

Guide to Data Quality Management #1 – The 9 Dimensions of Data Quality

Actian Corporation

April 1, 2022

The 9 Dimensions of Data Quality

Data Quality refers to an organization’s ability to maintain the quality of its data over time. If we were to take some data professionals at their word, improving Data Quality is the panacea to all our business woes and should therefore be the top priority. 

We believe this should be nuanced: Data Quality is one means, among others, to limit the uncertainties of meeting corporate objectives. 

In this series of articles, we will go over everything data professionals need to know about Data Quality Management (DQM):

    1. The nine dimensions of data quality
    2. The challenges and risks associated with data quality
    3. The main features of Data Quality Management tools
    4. The data catalog contribution to DQM

Some Definitions of Data Quality

Asking Data Analysts or Data Engineers for a definition of Data Quality will give you very different answers, even within the same company and amongst similar profiles. Some, for example, will focus on the uniqueness of data, while others will prefer to reference standardization. You may well have your own interpretation.

The ISO 9000:2015 standard defines quality as the "degree to which a set of inherent characteristics fulfils requirements".

DAMA International (The Global Data Management Community) – a leading international association involving both business and technical data management professionals – adapts this definition to a data context: “Data Quality is the degree to which the data dimensions meet requirements.”

The Dimensional Approach to Data Quality

From an operational perspective, Data Quality translates into what we call Data Quality dimensions, in which each dimension relates to a specific aspect of quality.

The four dimensions most often used are completeness, accuracy, validity, and availability. In the literature, there are many dimensions and different criteria to describe Data Quality. There is, however, no consensus on what these dimensions actually are.

For example, DAMA enumerates sixty dimensions, while most Data Quality Management (DQM) software vendors usually offer five or six.

The Nine Dimensions of Data Quality

At Zeenea, we believe that the ideal compromise is to take into account nine Data Quality dimensions: completeness, accuracy, validity, uniqueness, consistency, timeliness, traceability, clarity, and availability.

We will illustrate these nine dimensions and the different concepts we refer to in this publication with a straightforward example:

Arthur is in charge of sending marketing campaigns to clients and prospects to present his company’s latest offers. He encounters, however, certain difficulties:

  • Arthur sometimes sends communications to the same people several times.
  • The emails provided in his CRM are often invalid.
  • Prospects and clients do not always receive the right content.
  • Some information pertaining to the prospects is obsolete.
  • Some clients receive emails with erroneous gender qualifications.
  • There are two addresses for clients/prospects, but it is difficult to understand what each one refers to.
  • He doesn’t know the origin of some of the data he is using or how he can access their source.

Below is the data Arthur has at hand for his sales efforts. We shall use them to illustrate each of the nine dimensions of Data Quality:

1. Completeness

Is the data complete? Is there information missing? The objective of this dimension is to identify the empty, null, or missing data. In this example, Arthur notices that there are missing email addresses:

To remedy this, he could try and identify whether other systems have the information needed. Arthur could also ask data specialists to manually insert the missing email addresses.

2. Accuracy

Are the existing values coherent with the actual data, i.e., the data we find in the real world?

Arthur noticed that some letters sent to important clients are returned because of incorrect postal addresses. Below, we can see that one of the addresses doesn’t match the standard address formats in the real world:

It could be helpful here for Arthur to use postal address verification services.

3. Validity

Does the data conform with the syntax of its definition? The purpose of this dimension is to ensure that the data conforms to a defined pattern or rule.

Arthur noticed that he regularly gets bounced emails. Another problem is that certain prospects/clients do not receive the right content because they haven’t been accurately qualified. For example, the email address annalincoln@apple isn’t in the correct format and the Client Type Customer isn’t correct.

To solve this issue, he could for example make sure that the Client Type values are part of a list of reference values (Customer or Prospect) and that email addresses conform to a specific format.
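A minimal sketch of these two validity rules, assuming the field names from Arthur's example (Client Type and email), could look like this:

using System;
using System.Text.RegularExpressions;

static class ValidityRules
{
    // Reference list of allowed Client Type values.
    static readonly string[] AllowedClientTypes = { "Customer", "Prospect" };

    // Deliberately simple email pattern: something@something.tld
    static readonly Regex EmailPattern =
        new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$", RegexOptions.Compiled);

    public static bool IsValidClientType(string clientType) =>
        Array.Exists(AllowedClientTypes,
            t => t.Equals(clientType, StringComparison.OrdinalIgnoreCase));

    public static bool IsValidEmail(string email) =>
        !string.IsNullOrWhiteSpace(email) && EmailPattern.IsMatch(email);
}

// Example: "annalincoln@apple" fails IsValidEmail because it has no top-level domain.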

4. Consistency

Are the different values of the same record in conformity with a given rule? The aim is to ensure the consistency of the data across several columns.

Arthur noticed that some of his male clients complain about receiving emails in which they are referred to as Miss. There does appear to be an inconsistency between the Gender and Title columns for Lino Rodrigez.

To solve these types of problems, it is possible to create a logical rule that ensures that when the Gender field is Male, the Title is Mr.

5. Timeliness

Is the time lapse between the creation of the data and its availability appropriate? The aim is to ensure the data is accessible in as short a time as possible.

Arthur noticed that certain information on prospects is not always up to date because the data is too old. As a company rule, data on a prospect that is older than 6 months cannot be used.

He could solve this problem by creating a rule that identifies and excludes data that is too old. An alternative would be to source this same information from another system that contains fresher data.
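A minimal sketch of such a rule, assuming a Prospect record with a LastUpdated timestamp (these names are illustrative), simply filters out anything older than six months:

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative prospect record; the property names are assumptions for the example.
record Prospect(string Email, DateTime LastUpdated);

static class TimelinessRule
{
    // Keep only prospects updated within the last 6 months (the company rule above).
    public static IEnumerable<Prospect> KeepFresh(IEnumerable<Prospect> prospects) =>
        prospects.Where(p => p.LastUpdated >= DateTime.UtcNow.AddMonths(-6));
}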

6. Uniqueness

Are there duplicate records? The aim is to ensure the data is not duplicated.

Arthur noticed he was sending the same communications several times to the same people. Lisa Smith, for instance, is duplicated in the folder:

In this simplified example, the duplicated data is identical. More advanced algorithms such as Jaro, Jaro-Winkler, or Levenshtein, for example, can regroup duplicated data more accurately.
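As a rough sketch of how edit distance can help, the following implementation of the Levenshtein distance flags two names as likely duplicates when they differ by at most two edits; production DQM tools combine several such metrics and normalize the data first.

using System;

static class Dedup
{
    // Classic dynamic-programming Levenshtein edit distance.
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                                   d[i - 1, j - 1] + cost);
            }
        return d[a.Length, b.Length];
    }

    public static bool LikelyDuplicate(string a, string b, int threshold = 2) =>
        Levenshtein(a.ToLowerInvariant(), b.ToLowerInvariant()) <= threshold;
}

// Dedup.LikelyDuplicate("Lisa Smith", "Lisa Smyth") returns true (distance 1).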

7. Clarity

Is understanding the metadata easy for the data consumer? The aim here is to understand the significance of the data and avoid misinterpretations.

Arthur has doubts about the two addresses given as it is not easy to understand what they represent. The names Street Address 1 and Street Address 2 are subject to interpretation and should be modified, if possible.

Renaming fields within a database is often a complicated operation, so they should at least be correctly documented with a description.

8. Traceability

Is it possible to obtain traceability from data? The aim is to get to the origin of the data, along with any transformations it may have gone through.

Arthur doesn’t really know where the data comes from or where he can access the data sources. It would have been quite useful for him to know this as it would have ensured the problem was fixed at the source. He would have needed to know that the data he is using with his marketing tool originates from the data of the company data warehouse, itself sourced from the CRM tool.

9. Availability

How can the data be consulted or retrieved by the user? The aim is to facilitate access to the data.

Arthur doesn’t know how to easily access the source data. Staying with the previous schema, he wants to effortlessly access data from the data warehouse or the CRM tool.

In some cases, Arthur will need to make a formal request to access this information directly.

Get our Data Quality Management Guide for Data-Driven Organizations

For more information on Data Quality and DQM, download our free guide: “A Guide to Data Quality Management”.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Management

Zen Edge Database and Ado.net on Raspberry Pi

Actian Corporation

March 31, 2022

data management words on a laptop screen

Do you have a data-centric Windows application you want to run at the Edge? If so, this article demonstrates an easy and affordable way to accomplish this by using the Zen Enterprise Database through ADO.NET on a Raspberry Pi. The Raspberry Pi features a 64-bit ARM processor, can accommodate several operating systems, and costs around $50 (USD).

These instructions use Windows 11 for ARM64 installed on a Raspberry Pi 4 with 8 GB of RAM. (You could consider Windows 10 or another ARM64-based board, but you would first need to confirm that Microsoft supports your configuration.)

Here are the steps.

  • Use the Windows 11 for ARM64 installer; Windows 11 includes Microsoft's built-in x86/x64 emulation.
  • After the installer finishes, the Windows 11 directory structure should look like the figure below:

  • The installer creates ARM, x86, and x64 directories for Windows emulation.
  • Next, run a .NET Framework application using the Zen ADO.NET provider on Windows 11 for ARM64 on the Raspberry Pi.

Once the framework has been established, create an ADO.NET application using Visual Studio 2019 on a Windows platform where Zen v14 is installed and running.

To build the simple application, use a C# Windows Forms application, as seen in the following diagram.

Name and configure the project and point it to a location on the local drive (next diagram).

Create a form and add two command buttons and two text boxes. Name the buttons "Execute" and "Clear," and add a DataGridView as follows.

Add Pervasive.Data.SqlClient.dll under the project solution references by selecting the provider from the C:\Program Files (x86)\Actian\Zen\bin\ADONET4.4 folder. Add a "using" directive in the program code:

using Pervasive.Data.SqlClient;

Add the following code under the “Execute” button.
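The exact code isn't reproduced in the article text; the following is a minimal sketch of what the Execute handler might look like, assuming the form's controls are named txtConnectionString, txtSql, and dataGridView1, and that the provider exposes ADO.NET-style PsqlConnection, PsqlDataAdapter, and PsqlException classes.

using System;
using System.Data;
using System.Windows.Forms;
using Pervasive.Data.SqlClient;

public partial class SelectDataForm : Form
{
    // Sketch of the "Execute" button click handler.
    private void btnExecute_Click(object sender, EventArgs e)
    {
        try
        {
            using (var connection = new PsqlConnection(txtConnectionString.Text))
            using (var adapter = new PsqlDataAdapter(txtSql.Text, connection))
            {
                var results = new DataTable();
                adapter.Fill(results);              // opens the connection and runs the SELECT
                dataGridView1.DataSource = results; // display the rows in the DataGridView
            }
        }
        catch (PsqlException ex)
        {
            MessageBox.Show(ex.Message, "Query failed");
        }
    }
}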

Add the following code under the “Clear” button.
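And a companion sketch for the Clear handler, with the same assumed control names:

// Resets the grid and the input boxes.
private void btnClear_Click(object sender, EventArgs e)
{
    dataGridView1.DataSource = null;  // drop the previous result set
    txtSql.Clear();                   // clear the SQL statement box
    txtConnectionString.Clear();      // clear the connection string box
}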

Then, add the connection information and SQL statement to the text boxes added in the previous steps as follows.


Now the project is ready to compile, as seen below.

Use "localhost" in the connection string to connect to the local system where the Zen engine is running. This example uses the "class" table of the Demodata database to select data.

Selecting "Execute" will then return the data in the grid as follows.

Now the application is ready to be deployed on the Raspberry Pi. To do so, copy "SelectData.exe" from the C:\test\SelectData\SelectData\bin\Debug folder, along with the Zen ADO.NET provider "Pervasive.Data.SqlClient.dll", to a folder on Windows 11 for ARM64 on the Raspberry Pi.

Next, register the Zen ADO.NET provider in the GAC using gacutil as follows.

gacutil /f /i <dir>\Pervasive.Data.SqlClient.dll


Run the SelectData app and connect to a remote server where the Zen engine is running, as a client-server application.

Change the server name or IP address in the connection string to your server where the Zen V14 or V15 engine is running.

The Windows application is now running client-server, using the Zen ADO.NET provider, on a Raspberry Pi with Windows 11 for ARM64 installed.

And that's it! Following these instructions, you can build and deploy a data-centric Windows 11 application on a Raspberry Pi ARM64. This or a similar application can run on a client or server to serve upstream or downstream data clients, such as sensors or other devices that generate or require data from an edge database. Zen Enterprise uses standard SQL queries to create and manage data tables, and the same application and database will run on your Microsoft Windows-based (or Linux) laptops, desktops, or in the cloud. For a quick tutorial on the broad applicability of Zen, watch this video.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

What is the Difference Between Data Governance and Data Management?

Actian Corporation

March 31, 2022

Difference between Data Governance and Data Management

In a world where companies aspire to become data-driven, data management and data governance are concepts that must be mastered. Although they are too often perceived as related or even interchangeable disciplines, the differences between them are important.

A company wanting to become data-driven must master the disciplines, concepts, and methodologies that govern the collection and use of data. Among those that are most often misunderstood are data governance and data management. 

Data governance consists, on the one hand, of defining the organizational structures around data – who owns it, who manages it, who exploits it, etc. – and, on the other hand, of the policies, rules, processes, and monitoring indicators that allow for a sound administration of data throughout its life cycle (from collection to deletion).

Data management can therefore be defined as the technical application of the recommendations and measures defined by data governance.

Data Governance vs. Data Management: Their Different Missions

The main difference between data governance and data management is that the former has a strategic dimension, while the latter is rather operational.

Without data governance, data management cannot be efficient, rational, or sustainable. Conversely, data governance that is not translated into appropriate data management will remain a theoretical document or a letter of intent that will not allow you to actively and effectively engage in data-driven decision-making.

To understand what is at stake, it is important to understand that all the disciplines related to data are permanently overlapping and interdependent. Data governance is a conductor that orchestrates the entire system. It is based on a certain number of questions such as:

  • What can we do with our data?
  • How do we ensure data quality?
  • Who is responsible for the processes, standards, and policies defined to exploit the data?

Data management is the pragmatic way to answer these questions and make the data strategy a reality. Data management and data governance can and should work in tandem. However, data governance is mainly concerned with the monitoring and processing of all the company’s data, while data management is mainly concerned with the storage and retrieval of certain types of information.

Who are the Actors of Data Governance and Management?

At the top management level, the CEO is naturally the main actor in data governance, as they are its legal guarantor. But they are not the only one who must get involved.

The CIO (Chief Information Officer) plays a key role in securing the infrastructure and ensuring its availability. At the same time, constant access to data is crucial for the business (marketing teams, field salespeople) as well as for all the data teams who are in charge of the daily reality of data management.

It is then up to the Chief Data Officer (CDO) to create the bridge between these two entities and break down the data silos in order to build agile data governance. He or she facilitates access to data and ensures its quality in order to add value to it.

And while the Data Architect will be more involved in data governance, the Data Engineer will be more involved in data management. As for the Data Steward, he or she is at the confluence of the two disciplines.

How Combining the Two Roles Helps Companies Become Data-Driven

Despite their differences in scope and means, the concepts of data governance and data management should not be opposed. In order for a company to adopt a data-driven strategy, it is imperative to reconcile these two axes within a common action. To achieve this, an organization’s director/CEO must be the first sponsor of data governance and the first actor in data management.

It is by communicating internally with all the teams and by continuously developing the data culture among all employees that data governance serves the business challenges while preserving a relationship of trust that unites the company with its customers.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

5 Product Values That Strengthen Team Cohesion & Experience

Actian Corporation

March 14, 2022

5 Product Values That Strengthen Zeenea's Team Cohesion & Customer Experience

To remain competitive, organizations must make decisions quickly, as the slightest mistake can lead to a waste of precious time in the race for success. Defining the company's reason for being, its direction, and its strategy makes it possible to build a solid foundation for creating alignment – subsequently facilitating decisions that impact product development. Aligning all stakeholders in product development is a real challenge for Product Managers. Yet it is an essential mission to build a successful product, and an obvious prerequisite to motivate teams who need to know why they get up each morning to go to work.

The Foundations of a Shared Product Vision Within the Company

Various frameworks (NorthStar, OKR, etc.) have been developed over the last few years to enable companies and their product teams to lay these foundations, disseminate them within the organization, and build a roadmap that creates cohesion. These frameworks generally define a few key artifacts and have already given rise to a large body of literature. Although versions may differ from one framework to another, the following concepts are generally found:

  • Vision: The dream, the true North of a team. The vision must be inspiring and create a common sense of purpose throughout the organization.
  • The Mission: It represents an organization’s primary objective and must be measurable and achievable.
  • The Objectives: These define measurable short and medium-term milestones to accomplish the mission.
  • The Roadmap: A source of shared truth – it describes the vision, direction, priorities, and progress of a product over time.

With a clear and shared definition of these concepts across the company, product teams have a solid foundation for identifying priority issues and effectively ordering product backlogs.

Product Values: The Key to Team Buy-in and Alignment Over Time

Although well defined at the beginning, the concepts described above can nevertheless be forgotten after a while or become obsolete. Indeed, the company and the product evolve, teams change, and consequently the product can lose its direction. Product teams must therefore continuously revisit and re-instill these concepts in order for the alignment to last.

Indeed, product development is both a sprint and a marathon. One of the main difficulties for product teams is to maintain this alignment over time. In this respect, another concept in these frameworks is often under-exploited when it is not completely forgotten by organizations: product values.

Jeff Weiner, Executive Chairman at LinkedIn, particularly emphasized the importance of defining company values through the Vision to Values framework. LinkedIn defines values as "The principles that guide the organization's day-to-day decisions; a defining element of your culture". For example, "be honest and constructive", "demand excellence", etc.

Defining product values in addition to corporate values can be a great way for product teams to create this alignment over time, and this is exactly what we do with the Actian Data Intelligence Platform.

From Corporate Vision to Product Values: A Focus on a Data Catalog

Organization & Product Consistency

We have a shared vision – “Be the first step of any data journey” – and a clear mission – “To help data teams accelerate their initiatives by creating a smart & reliable data asset landscape at the enterprise level”.

We position ourselves as a data catalog pure-player and we share the responsibility of a single product between several Product Managers. This is why we have organized ourselves into feature teams. This way, each development team can take charge of any new feature or evolution according to the company’s priorities, and carry it out from start to finish.

Even though we prioritize the backlog and delivery by defining and adapting our strategy and organization according to these objectives, three problems remain:

  • How do we ensure that the product remains consistent over time when there are multiple pilots onboard the plane?
  • How do we favor one approach over another?
  • How do we ensure that a new feature is consistent with the rest of the application?

Indeed, each Product Manager has his or her own sensitivity and background. And even when the problems are clearly identified, there are usually several ways to solve them. This is where product values come into play…

Actian Data Intelligence Platform’s Product Values

If the vision and the mission help us to answer the “why?”, the product values allow us to remain aligned with the “how?”. It is a precious tool that challenges the different possible approaches to meet customer needs. And each Product Manager can refer to these common values to make decisions, prioritize a feature or reject it, and ensure a unified & unique user experience across the product.

Thus, each new feature is built with the following 5 product values as guides:

Simplicity

This value is at the heart of our convictions. The objective of a Data Catalog is to democratize data access. To achieve this, facilitating catalog adoption for end users is key. Simplicity is clearly reflected in the way each functionality is proposed. Many applications end up looking like Christmas trees with colored buttons all over the place that no one knows how to use; others require weeks of training before the first button is clicked. The use of the Data Catalog should not be reserved to experts and should therefore be obvious and fluid regardless of the user’s objective. This value was reflected in our decision to create two interfaces for our Data Catalog: one dedicated to search and exploration, and the other for the management and monitoring of the catalog’s documentation.

Empowering

Documentation tasks are often time-consuming and it can be difficult to motivate knowledgeable people to share and formalize their knowledge. In the same way, the product must encourage data consumers to be autonomous in their use of data. This is why we have chosen not to offer rigid validation workflows, but rather a system of accountability. This allows Data Stewards to be aware of the impacts of their modifications. Coupled with an alerting and auditing system after the fact, it ensures better autonomy while maintaining traceability in the event of a problem.

Reassuring

It is essential to allow end-users to trust the data they consume. The product must therefore reassure the user by the way it presents its information. Similarly, Data Stewards who maintain a large amount of data need to be reassured about the operations for which they are responsible: have I processed everything correctly? How can I be sure that there are no inconsistencies in the documentation? What will really happen if I click this button? What if it crashes? The product must create an environment where the user feels confident using the tool and its content. This value translates into preventive messages rather than error reports, a reassuring tone of voice, idempotent import operations, etc.

Flexibility

Each client has their own business context, history, governance rules, needs, etc. The data catalog must be able to adapt to any context to facilitate its adoption. Flexibility is an essential value to enable the catalog to adapt to all current technological contexts and to be a true repository of data at enterprise level. The product must therefore adapt to the user’s context and be as close as possible to their uses. Our flat and incremental modeling is based on this value, as opposed to the more rigid hierarchical models offered on the market.

Deep Tech

This value is also very important in our development decisions. Technology is at the heart of our product and must serve the other values (notably simplicity and flexibility). Documenting, maintaining, and exploiting the value of enterprise-wide data assets cannot be done without the help of intelligent technology (automation, AI, etc.). The choice to base our search engine on a knowledge graph or our positioning in terms of connectivity are illustrations of this “deep tech” value.

The Take Away

Creating alignment around a product is a long-term task. It requires Product Managers – in synergy with all stakeholders – to define from the very beginning the vision, the mission, and the objectives of the company. This enables product management teams to effectively prioritize the work of their teams. However, to ensure the coherence of a product over time, the definition and use of product values are essential. With the Actian Data Intelligence Platform, our product values are simplicity, empowerment, reassurance, flexibility, and deep tech. They are reflected in the way we design and enhance our Data Catalog and allow us to ensure a better customer experience over time.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Security

Hybrid Cloud Security

Actian Corporation

March 7, 2022

Hybrid Cloud Security padlock

One of the biggest fears of cloud adoption is the security of organizational data and information. IT security has always been an issue for all organizations, but the thought of not having total control over corporate data is frightening. One of the factors for organizations not moving everything to the cloud and adopting a hybrid cloud approach is security concerns. Hybrid cloud security architectures still have security risks related to a public cloud; however, hybrid cloud risks are higher simply because there are more clouds to protect. The trust boundary is extended beyond the organization for access to its essential critical data with hybrid cloud architectures.

Sensitive data can be kept off the public cloud to help manage risk. This may be helpful, but hybrid cloud solutions are integrations between public and private clouds, and without appropriate security this integration could still leave your private cloud vulnerable to attacks originating from the public cloud. Secure hybrid clouds bring significant benefits to organizations, but alongside those benefits come the challenges of securing the organization's data, and these challenges are continually being addressed so that the benefits of hybrid cloud architectures can be fully realized.

What is Hybrid Cloud Security?

Organizational IT infrastructures have increased in complexity, especially with hybrid cloud implementations. This complexity, combined with cloud characteristics such as broad network access and on-demand access from anywhere, complicates how a hybrid cloud can be secured. The difficulty of securing data, applications, and infrastructure, both internally and externally, against hackers' malicious tactics as well as inadvertent, unintentional activities is compounded.

Many cloud vendors have adopted industry compliance and governance security standards, especially those created by the US government, to ease the security threats and risks that an organization may experience in the cloud. The Federal Risk and Authorization Management Program (FedRAMP) provides standards and accreditations for cloud services. The Security Requirements Guide (SRG) provides security controls and requirements for cloud services in the Department of Defense (DoD). These standards and others help cloud vendors and organizations improve their hybrid cloud security.

When securing the cloud, an organization should consider the cloud architecture components: applications, data, middleware, operating systems, virtualization, servers, storage, and networking. Security concerns are specific to the service type, and with hybrid cloud security, organizations share responsibility for security with the cloud service provider.

The responsibility for hybrid cloud security should include specific disciplines. Some essential discipline areas for managing risk and securing hybrid cloud are:

  • Physical controls to deter intruders and create protective barriers to IT assets are just as important as cybersecurity for protecting assets.
    • Security perimeters, cameras, locks, alarms.
    • Physical controls can be seen as the first line of defense for protecting organizational IT assets. Not only from security threats but from overall harm from environmental challenges.
    • Biometrics (one or more fingerprints, possibly retina-scans) where system access ties to extremely sensitive data.
  • Technical controls.
    • Cloud patching fixes vulnerabilities in software and applications that are targets of cyber-attacks. Besides overall keeping systems up to date, this helps reduce security risk for hybrid cloud environments.
    • Multi-tenancy security – each tenant or customer is logically separated in a cloud environment. Each tenant has access to the cloud environment, but the boundaries are purely virtual; hackers can find ways to access data across these virtual boundaries if resources are improperly assigned, and data overflowing from one tenant can impinge on another. Data must be properly configured and isolated to avoid interference between tenants.
    • Encryption is needed for data at rest and data in transit. Data at rest is sitting in storage, and data in transit, going across the network and the cloud layers (SaaS, PaaS, IaaS). Both have to be protected. More often than not, data at rest isn’t encrypted because it’s an option that is not turned on by default.
    • Automation and orchestration are needed to replace slow manual responses in hybrid cloud environments. Monitoring, compliance checks, appropriate responses, and implementations should be automated to eliminate human error. These responses should also be reviewed and continuously improved.
    • Access controls – People and technology accesses should always be evaluated and monitored on a contextual basis including date, time, location, network access points, and so forth. Define normal access patterns and monitor for abnormal patterns and behavior, which could be an alert to a possible security issue.
    • Endpoint security for remote access has to be managed and controlled. Devices can be lost, stolen, or hacked, providing an access point into a hybrid cloud and all of its data and resources. Local ports on devices that allow printing or USB drives would need to be locked for remote workers or monitored and logged when used.
  • Administrative controls to account for human factors in cloud security.
    • Zero trust architecture (ZTA) principles and policies continually evaluate trusted access to cloud environments to restrict access to the minimum privileges required. Allowing too much access to a person or technology solution can cause security issues. Adjustments to entitlements can be made in real time: for example, is a user suddenly downloading far more documents? Are those documents outside his or her normal scope of work or access? Of course, this requires data governance that includes tagging and role-based access that maps entitlements to tagging.
    • Disaster recovery – Performing business impact analysis (BIA) and risk assessments is crucial for disaster recovery and for deciding how hybrid cloud architectures should be implemented, including concerns related to data redundancy and placement within a cloud architecture for service availability and rapid remediation after an attack.
    • Social engineering education and technical controls for phishing, baiting, etc. Social engineering is an organizational issue and a personal issue for everyone. Hackers can steal corporate data and personal data to access anything for malicious purposes.
    • A culture of security is critical for organizations. The activities of individuals are considered one of the most significant risks to the organization. Hackers target their access to any organization through the organization's employees as well as partners and even third-party software vendors and services contractors. Employees, contractors, and partners need to be educated continuously to help avoid security issues that can be prevented with training and knowledge.
  • Supply chain controls.
    • Software, infrastructure, and platforms from third parties have to be evaluated for security vulnerabilities. Software from a third-party supplier, when installed, could contain security vulnerabilities or have been compromised in a way that gives criminals complete access to an organization's hybrid cloud environment. Be sure to check how all third-party software vendors approach and practice sound security controls over their products.

Security in the cloud is a shared responsibility that becomes more complex as deployments are added. Shared Services are a way to deliver functions such as security, monitoring, authorization, backups, patching, upgrades, and more in a cost-effective, reliable way to all clouds. Shared services reduce management complexity and are essential to achieve a consistent security posture across your hybrid cloud security architecture.

Configuration Management and Hybrid Cloud Security

Hybrid cloud security architecture risks are higher simply because there are more clouds to protect. For this reason, here are a few extra items that you should put on your hybrid cloud security best practices list, including visibility, shared services, and configuration management. First, you can’t secure what you can’t see. Hybrid cloud security requires visibility across the data center and private and public cloud borders to reduce hybrid cloud risks resulting from blind spots.

Another area to focus on is configuration management since misconfigurations are one of the most common ways for digital criminals to land and expand in your hybrid cloud environments. Encryption isn’t turned on, and access hasn’t been restricted; security groups aren’t set up correctly, ports aren’t locked down. The list goes on and on. Increasingly, hybrid cloud security teams need to understand cloud infrastructure better to secure it better and will need to include cloud configuration auditing as part of their delivery processes.

One of the hybrid cloud security tools that can be utilized is a Configuration Management System (CMS), built on configuration management database (CMDB) technology, that can help organizations gain visibility into hybrid cloud configurations and the relationships between all cloud components. The first activity with a CMS involves discovering all cloud assets or configuration items that make up the services being offered. At this time, a snapshot of the environment is made with essential details of the cloud architecture. Once they have discovered their hybrid cloud architecture, many organizations immediately look for security concerns that violate security governance.

Once the CMS is in place, other hybrid cloud security tools such as drift management and monitoring of changes in the cloud architecture can alert to cloud attacks. Once unauthorized drift is detected, other automation tools can correct the configuration and raise alerts to counter the attack. The CMS and the CMDB support cloud security operations and other service management areas, such as incident, event, and problem management, to help provide a holistic solution for the organization's service delivery and service support.

Conclusion

Security issues in hybrid cloud computing aren’t that different from security issues in cloud computing. You can review the articles on Security, Governance, and Privacy for the Modern Data Warehouse, Part 1 and Part 2, that provide a lot of pointers on how to protect your data and cloud services.

Hybrid cloud security risks and issues will be one of those IT organizational business challenges that will be around for a long time. Organizations need to stay informed and have the latest technologies and guidance for combating the hybrid cloud security issues and threats. This includes partnering with hybrid cloud solution providers such as Actian. It is essential for the organization’s ability to function with consistently changing cloud security needs.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

Interview With Ruben Marco Ganzaroli – CDO at Autostrade per l’Italia

Actian Corporation

March 3, 2022

We are pleased to have been selected by Autostrade per l’Italia – a European leader among concessionaires for the construction and management of toll highways – to deploy the Actian Data Intelligence Platform’s data catalog at the group level. We took this opportunity to ask a few questions of Ruben Marco Ganzaroli, who joined the company in 2021 as Chief Data Officer to support the extensive Next to Digital program to digitally transform the company. A program with a data catalog as its starting point.

Q: CDOs are becoming critical to a C-level team. How important is data to the strategic direction of Autostrade per l’Italia?

Data is at the center of the huge Digital Transformation program started in 2021, called “Next to Digital,” which aims to transform Autostrade per l’Italia into a Sustainable Mobility Leader. We wanted to protect whoever is traveling on our highways, execute decisions faster, and be agile and fluid. We not only want to react immediately to what is happening around us, but also to be able to anticipate events and take action before they occur. The company was founded in the early 1950s, and we realized that all the data we have collected throughout the years could be a unique advantage and a strong lever to transform the company.

Q: What are the main challenges you want to address by implementing a data catalog in your organization?

We think that only the business functions of the Autostrade group can truly transform the company into a data-driven one. To do this, business functions need to be supported by the right tools – efficient and usable – and they must be fully aware of the data they have available. Ideas, and therefore value, are generated only if you have a clear idea of the environment in which you are moving and the objective you are aiming for. If, without knowing it, you have a gold bar under your mattress, you will sleep uncomfortably and realize only that you could do something to improve your situation – probably by changing mattresses, for example. However, if you are aware that you have that gold bar, you will lift the mattress, take the bar, and turn it into a jewel – maximizing its value.

The data catalog builds the bridge between business and data at Autostrade. It is the tool that lets business users know that there are many gold bars available and where they can be found.

Q: What features were you looking for in a data catalog, and what did you find in the platform?

From a business perspective, a data catalog is the access point to all data. It must be fast, complete, easy to understand, and user-friendly, and it must act as a lever (not an obstacle). Business users must not be forced to spend the majority of their time in it. From an IT perspective, a data catalog must be agile, scalable, and quickly and continuously upgradeable, as data is continuously being ingested or created.

Q: What is your vision of a data catalog in the data management solutions’ ecosystem?

We don’t think of the catalog as a tool, but as part of the environment that we, as IT, need to make available to the business functions. This ecosystem naturally includes tools, but what’s also important is the mindset of its users. To lead this mindset change, business functions must be able to work with data, and that’s the reason Self-BI is our main goal for 2022 as the CDO Office. As mentioned previously, the catalog is the starting point for all of that. It is the door that lets the business into the data room.

Q: How will you drive catalog adoption among your data teams?

All the leaders from our team – Leonardo B. for the Data Product, Fulvio C. for Data Science, Marco A. and Andrea Q. for Data Engineering, and Cristina M. as Scrum (super)Master – are focused on managing the program. The program foresees an initial training phase for business users, dedicated on-the-job support, and in-room support. Business users will participate in the delivery of their own analyses. We will onboard business functions incrementally, to focus the effort and maximize the effectiveness for each business function. The goal is to onboard all business functions within 2022: it represents a lot of work, but it is made easier by knowing that there is a whole company behind us that supports us and strongly believes we are going in the right direction.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Architecture

Enterprise Data Warehouse

Teresa Wingfield

March 2, 2022

Enterprise Data Warehouse

Do you need a data warehouse or an Enterprise Data Warehouse (EDW) in your organization today? An organization’s functional units, people, services, and other assets support its vision and mission. An organization carries out strategies, tactics, and operational activities to meet that vision and mission in service to its customers. Being high-performing today requires harnessing the power of data, supported by innovations in information technology. People and advanced technologies make decisions every day that support the organization’s success. Data drives decisions.

Today, one of the essential tools is an Enterprise Data Warehouse, which supports effective decisions using a single source of truth across the organization. The organization has to work as a high-performing team, exchanging data, information, and knowledge for decisions, and the EDW plays a central role. As organizations move their IT to the cloud, the EDW is transforming and moving there as well, further improving the organization’s business decision-making.

What is an Enterprise Data Warehouse?

An Enterprise Data Warehouse is a central repository of data that supports a value chain of business interactions between all functional units within the organization. In an EDW, data is collected from multiple sources and normalized so the organization can make data-driven, insightful analytical decisions about the services and products it delivers and supports for its customers. Data is transformed into information, information into knowledge, and knowledge into decisions for analytics and overall Business Intelligence (BI). Enterprise data warehouse reporting capabilities take advantage of the EDW to provide the organization with needed business and customer insights.

Enterprise Data Warehouse architecture makes use of an Extract, Transform, and Load (ETL) process to ingest, consolidate, and normalize data for organizational use. The data is modeled around the decisions that stakeholders need to make and put in a consistent format for usage and consumption by integrated technologies and applications. The organization’s sales, marketing, and other teams can use the EDW for an end-to-end perspective of the organization and the customers being serviced. Enabling the EDW in the cloud is a plus because cloud technologies make data accessible anywhere, anytime.
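As a rough illustration of the ETL pattern described above, the following self-contained Python sketch uses SQLite in place of both the transactional source and the warehouse. The table names, columns, and transformation rules are invented for the example and are not a specific Actian schema.

import sqlite3

source = sqlite3.connect(":memory:")     # stand-in for a transactional source system
warehouse = sqlite3.connect(":memory:")  # stand-in for the EDW

# Extract: pull raw orders from the source system.
source.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, region TEXT)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, 1250, "emea"), (2, 990, "AMER"), (3, 4400, None)])
rows = source.execute("SELECT id, amount_cents, region FROM orders").fetchall()

# Transform: normalize currency units and standardize region codes.
clean = [(oid, cents / 100.0, (region or "UNKNOWN").upper()) for oid, cents, region in rows]

# Load: write the conformed rows into the warehouse fact table.
warehouse.execute("CREATE TABLE fact_orders (order_id INTEGER, amount_usd REAL, region TEXT)")
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", clean)
warehouse.commit()

print(warehouse.execute(
    "SELECT region, SUM(amount_usd) FROM fact_orders GROUP BY region").fetchall())

In a production EDW the same three steps would run through dedicated ETL tooling against many sources, but the flow of extract, conform, and load into a consistent model is the same.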

The basic enterprise data warehouse requirements that make up an EDW system include, but are not limited to, the following components:

  • Data sources – databases, including a transactional database, and other files with various formats
  • Data transformation engine – ETL tools (typically an external third-party tool)
  • The EDW database repository itself
  • The database administration for creation, management, and deletion of data tables, views, and procedures
  • End-user tools to access data or perform analytics and business intelligence

Leveraging data from multiple data silos into one unified repository that contains all business data is powerful. An EDW is a database platform of multidimensional business data that different parts of the organization can use. The EDW holds current and historical information, and its model can be easily modified to support changes in business needs. EDWs can support additional data sources quickly without redesigning the system. As the organization learns how to use the data and gives feedback, the solution can transform rapidly to support the organization and the data stakeholders. As the organization matures, the data in the EDW can mature rapidly along with it.

Enterprise Data Warehouse vs. Data Mart

An Enterprise Data Warehouse becomes a single source of truth for organizational decisions that require collaboration between multiple functional areas in the organization. The EDW can be implemented as a one-tier architecture, with all functional units accessing the data in the warehouse. It can also be implemented with the addition of Data Marts (DM). The difference is that a data mart is much smaller and more focused than an enterprise data warehouse. Enterprise Data Warehouse services are for the entire organization, whereas a Data Mart is usually for a single line of business within the organization.

Data Marts contain domain-specific or unique functional data, such as only sales data or only marketing data. A data mart can extend the usage of the EDW in a two-tier architecture, leveraging on-premise and/or cloud capabilities, that uses the EDW as a source of data for specific use cases. Data marts typically involve integration from a limited number of data sources and focus on a single line of business or functional unit. The size of a data mart is typically measured in gigabytes, versus terabytes for an EDW. Data Marts do not have to use an EDW as a data source and can instead use other sources specific to their needs.
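To make the two-tier idea concrete, here is a small, illustrative Python/SQLite sketch in which a line-of-business data mart is simply a filtered view over an enterprise-wide fact table. The table, view, and column names are assumptions for the example, not a particular product’s syntax.

import sqlite3

edw = sqlite3.connect(":memory:")
edw.execute("CREATE TABLE fact_sales (order_id INTEGER, business_unit TEXT, amount_usd REAL)")
edw.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, "marketing", 120.0), (2, "sales", 980.0), (3, "sales", 310.0)])

# The "sales data mart" is simply the sales slice of the enterprise-wide fact table,
# exposed as a view so the line of business sees only its own domain.
edw.execute("""CREATE VIEW sales_mart AS
               SELECT order_id, amount_usd
               FROM fact_sales
               WHERE business_unit = 'sales'""")

print(edw.execute("SELECT COUNT(*), SUM(amount_usd) FROM sales_mart").fetchone())

In practice the mart may be a separate physical database fed from the EDW, but the principle is the same: a narrower, domain-specific slice of the enterprise-wide data.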

Organizations may want to use a data mart or multiple data marts to help increase the security of the EDW by limiting access to only domain-specific data through the data mart if using a two-tier architecture. An organization may also use the data mart to reduce the complexity of managing access to EDW data for a single line of business.

Choosing between EDW and a data mart does not have to be one or the other. Both are valuable. Remember, the outcome is to provide data for high performing decision support within the organization. EDW helps bring the bigger organization perspective for delivering and supporting business services. Data marts can complement the EDW to optimize performance and data delivery. Overall, enterprise-wide performance for decisions, reporting, analytics, and business intelligence is best done with a solution that spans the organization. A complete end-to-end value view of customers, products, tactics, and operations that support the organizational vision and mission will benefit everyone in the organization, including the customers.

Data Marts are easier and quicker to deploy than an EDW and cost less. A line of business can derive value quickly with a solution that can be deployed faster with a limited scope, fewer stakeholders, less modeling, and integration complexity than an EDW. The data mart will be designed specifically for that line of business to support their ability to work in a coordinated, collaborative way within their function. This can help create a competitive advantage against competitors by enabling better data analytics for decision support within a specific line of business or functional unit.

Enterprise Data Warehouse and the Cloud

Cloud Enterprise Data Warehouse (EDW) takes advantage of the value of the cloud in the same manner as many other cloud services that are becoming the norm for many organizations. The EDW itself may be better suited to reside in the cloud instead of on-premise. The cloud provides:

  • The flexibility to build out and modify services in an agile manner.
  • The potential to scale almost infinitely.
  • The assurance of enhanced business continuity.
  • The ability to avoid capital expenditures (CapEx).

Organizations can still choose to architect hybrid-cloud EDW solutions that take advantage of both on-premise organizational capabilities and vendor cloud capabilities. An EDW should be planned with expertise focused on organizational constraints and business objectives to arrive at the best long-term solution, one that is easy to use and can be continuously improved. That expertise includes deciding where Data Marts fit into the solution for maximum benefit to the organization.

Conclusion

EDW architecture can make it challenging to bring all of the organization’s data into one database, especially all at once. Organizations should design for the big picture and deploy incrementally, starting with specific business challenges or specific lines of business. This will create patterns of success for improving the next increment. It will also help deliver a solution that benefits the organization before the complete solution is finished.

In many instances, an organization can’t simply rely on silos of line-of-business data marts. It needs enterprise data warehouse reporting to get a complete view of customers, products, operations, and more to make decisions that best benefit the whole company. Yes, enterprise data warehouse architectures can be painful. In most instances, you can deploy incrementally, starting with specific domains or business challenges. This will help you deliver value faster and evolve into the holistic purpose your EDW is intended to serve.

The power of having a cross-organizational repository of meaningful data to enable better decision-making and better service delivery and support for the customer outweighs the challenges of the architecture. An organization that does this successfully will gain improved marketability, sales, and better relationships with its customers. The business data insights will also enable the organization to position its internal assets more appropriately based on improvements in data insights, analytics, and business intelligence.

Managing and utilizing data for an organization has to be done effectively, efficiently, and economically to create value. Data is the lifeblood that supports the long-term viability of the organization itself. An organization that is not informed and does not treat data as critical to business service performance and decisions may find itself optional in the marketplace. An EDW can help with the organization’s current and future business decision needs.


About Teresa Wingfield

Teresa Wingfield is Director of Product Marketing at Actian, driving awareness of the Actian Data Platform's integration, management, and analytics capabilities. She brings 20+ years in analytics, security, and cloud solutions marketing at industry leaders such as Cisco, McAfee, and VMware. Teresa focuses on helping customers achieve new levels of innovation and revenue with data. On the Actian blog, Teresa highlights the value of analytics-driven solutions in multiple verticals. Check her posts for real-world transformation stories.
Data Management

5 Use Cases for Hybrid Cloud Data Management

Actian Corporation

February 25, 2022

Hybrid Cloud Data Management

With the rise of cloud computing, many organizations are opting to use a hybrid approach to their data management. Even though many companies still rely on on-premises storage, the benefits of having cloud storage as a backup or disaster recovery plan can be significant. This post will give you five of the most popular use cases for hybrid cloud data management.

Why Hybrid Cloud Data Management?

Hybrid cloud data management isn’t a new concept, but it’s finally starting to hit its stride as a viable option for enterprise data management. It utilizes a mixture of on-premises storage, cloud storage, and cloud computing to handle all aspects of a company’s data needs. Often, it’s the merger of on-premises databases or enterprise data warehouses (EDW) with cloud storage, SaaS application data, and/or a cloud data warehouse (CDW). The benefits of this hybrid approach are twofold: it provides a backup plan for disaster recovery situations, and it gives an organization the ability to scale up as needed without purchasing additional hardware.

Backup and Disaster Recovery

One of the most obvious benefits of hybrid cloud data management is that it provides a backup for your data. If your on-premises storage system fails or you lose some important data, you can rely on your cloud storage to get it back. It will act as an additional fail-safe plan in case anything happens to your on-site server.
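As one hedged illustration of this pattern, the sketch below pushes nightly on-premises backup files to cloud object storage as an off-site copy. The bucket name, local path, and file pattern are hypothetical, and it assumes the boto3 AWS SDK with credentials already configured; any cloud provider’s object storage SDK could fill the same role.

import boto3
from pathlib import Path

def sync_backups_to_cloud(local_dir="/var/backups/nightly", bucket="example-dr-backups"):
    """Copy local backup files to cloud object storage as an off-site safeguard."""
    s3 = boto3.client("s3")
    for path in Path(local_dir).glob("*.bak"):
        # The object key mirrors the local filename so restores are easy to locate.
        s3.upload_file(str(path), bucket, f"nightly/{path.name}")
        print(f"Uploaded {path.name} to s3://{bucket}/nightly/{path.name}")

if __name__ == "__main__":
    sync_backups_to_cloud()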

Data Accessibility

Data is not just one homogeneous entity. Many companies can feel hampered by data access. They may not have the in-house expertise or budget to handle the IT demands of data storage and real-time access. Through a hybrid cloud environment, your business can access data and applications stored in both on-premises and off-site locations. Global companies can store data closer to applications or users to improve processing time and reduce latency without having to have local data centers or infrastructure.

Data Analytics

Currently, many businesses are combining internal data sources with external data sources from partners or public sources for improved data analytics. A hybrid data warehouse can allow data teams to combine this third-party data with internal data sources to gain greater insights for decision-making. Data engineers can reduce the amount of effort required to source and combine data needed for users to explore new analytical models.
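As a simple illustration of blending internal and third-party data, the sketch below joins an internal sales extract with external demographic data using pandas. The column names, join key, and figures are invented for the example.

import pandas as pd

internal_sales = pd.DataFrame({
    "zip_code": ["94016", "10001", "60601"],
    "revenue_usd": [125000, 98000, 143500],
})
external_demographics = pd.DataFrame({
    "zip_code": ["94016", "10001", "60601"],
    "median_income_usd": [112000, 89000, 76000],
})

# Join internal and third-party data on a shared key, then derive an enriched metric.
enriched = internal_sales.merge(external_demographics, on="zip_code")
enriched["revenue_per_income"] = enriched["revenue_usd"] / enriched["median_income_usd"]
print(enriched)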

Data Migration

When an organization migrates its storage to the cloud, it can take advantage of public, private, and hybrid cloud solutions. This means utilizing a host of services, including backup storage, disaster recovery solutions, analytics, and more, all while spending less on infrastructure and avoiding large capital expenses.

Data Compliance

The adoption of a hybrid data warehouse can relieve some of the compliance burdens that can often accompany stored data. For example, retired systems may leave behind orphaned databases, often with useful, historic data. This can create a data gap for analytic teams, but it can also pose a security and compliance risk for the business. Cloud service providers have teams of experts that work with governments and regulators globally to develop standards for things such as data retention times and security measures. Additionally, leveraging the cloud for data storage can also help address the challenges of data residency and data sovereignty regulations, which can become complex as data moves across geographical boundaries.

Regardless of where you are on your cloud journey, data is the most valuable asset to any organization. The cloud is an increasingly important component as businesses look for ways to leverage their data assets to maintain competitive advantage. Learn more about how the Actian Data Platform is helping organizations unlock more value from their data.

actian avatar logo

About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.