Data Intelligence

Data Lakes: The Benefits and Challenges

Actian Corporation

June 24, 2021


Data Lakes are increasingly used by companies for storing their enterprise data. However, storing large quantities of data in a variety of formats can lead to data chaos! Let’s take a look at the pros and cons of Data Lakes.

To understand what a Data Lake is, let’s imagine a reservoir or a water retention basin that runs alongside the road. Regardless of the type of data, its origin, its purpose, everything, absolutely everything, ends up in the Data Lake. Whether that data is raw or refined, cleansed or not, all of this information ends up in this single place where it isn’t modified, filtered, or deleted before being stored.

Sounds a bit messy, doesn’t it? But that’s the whole point of the Data Lake.

It’s because it frees the data from any preconceived idea that a Data Lake offers real added value. How? By allowing data teams to constantly reinvent the use and exploitation of your company’s data.

Whether it's improving the customer experience with a 360° analysis of the customer journey, detecting personas to refine marketing strategies, or rapidly integrating new data flows, from IoT in particular, the Data Lake is an agile response to very concrete business problems.

Data Lakes: The Undeniable Advantages

The first advantage of a Data Lake is that it allows you to store considerable volumes of data in all their forms. Structured or unstructured, from NoSQL databases…a Data Lake is, by nature, agnostic to the type of information it contains. It is precisely because it imposes no strict data exploitation scheme that the Data Lake is a valuable tool. And for good reason: none of the data it contains is ever altered, degraded, or distorted.

This is not the only advantage of a Data Lake. Indeed, since the data is raw, it can be analyzed on an ad-hoc basis.

The objective: to detect trends and generate reports according to business needs without it being a vast project involving another platform or another data repository. 

Thus, the data available in the Data Lake can be easily exploited, in real time, and allows you to put your company on a data-centric footing so that your decisions, your choices, and your strategies are never disconnected from the reality of your market or your activities.

Nevertheless, the raw data stored in your Data Lake can (and should!) be processed in a specific way, as part of a larger, more structured project. But your company’s data teams will know that they have, within reach of a click, an unrefined ore that can be put to use for further analysis.

The Challenges of a Data Lake

When you think of a Data Lake, poetic mental images come to mind: crystalline waters rippling in the winds of success that carry you away…but beware! A Data Lake carries the seeds of murky, muddy waters. This receptacle of data must be the object of particular attention, because without rigorous governance, the risk of sinking into data chaos is real.

In order for your Data Lake to reveal its full potential, you must have a clear and standardized vision of your data sources.

Controlling these flows is a first essential safeguard to guarantee the proper exploitation of data that is heterogeneous by nature. You must also be very vigilant about data security and the organization of your data.

The fact that the data in a Data Lake is raw does not mean that it should not have a minimum structure to allow you to at least identify and find the data you want to exploit.

Finally, a Data Lake often requires significant computing power in order to refine masses of raw data in a very short time. This power must be adapted to the volume of data that will be hosted in the Data Lake.

With method, rigor, and organization, a Data Lake is a tool that serves your strategic decisions.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

7 Lies of Data Catalogs #2: Not a Quality Solution

Actian Corporation

June 21, 2021


The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. A victim of its own success, this market has attracted several players from adjacent markets. These players have rejigged their marketing positioning in order to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

A Data Catalog is NOT a Data Quality Management (DQM) Solution

We do not underestimate the importance of data quality in successfully delivering a data project, quite the contrary. It simply seems absurd to us to put it in the hands of a solution which, by its very nature, cannot perform the controls at the right time.

Let us explain: there is a very elementary rule of quality control, a rule that applies in virtually any domain where quality is an issue, be it an industrial production chain, software development, or the kitchen of a 5-star restaurant: the sooner a problem is detected, the less it costs to correct.

To illustrate the point, a car manufacturer does not wait until a new vehicle is fully built, when all the production costs have already been incurred and fixing a defect would cost the most, to test its battery. No. Each part is closely controlled, each step of the production is tested, defective parts are removed before ever entering the production circuit, and the entire production chain can be halted if quality issues are detected at any stage. Quality issues are corrected at the earliest possible stage of the production process, where the fix is the least costly and the most durable.

“In a modern data organization, data production rests on the same principles. We are dealing with an assembly chain whose aim is to deliver high added-value uses. Quality control and correction must happen at each step. The nature and level of controls will depend on what the data is used for.”

If you are handling data, you obviously have at your disposal pipelines to feed your uses. These pipelines can involve dozens of steps – data acquisition, data cleaning, various transformations, mixing various data sources, etc.

In order to develop these pipelines, you probably have a number of technologies at play, anything from in-house scripts to costly ETLs and exotic middleware tools. It’s within those pipelines that you need to insert and pilot your quality control, as early as possible, by adapting them to what is at stake with the end product. Only measuring data quality levels at the end of the chain isn’t just absurd, it’s totally inefficient.
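The principle can be sketched in a few lines. The stage names and the completeness check below are hypothetical, but they illustrate inserting a control immediately after acquisition rather than only measuring quality at the end of the chain:

```python
# Illustrative pipeline with a quality control inserted right after
# acquisition. Stage names and the check itself are hypothetical.

def acquire():
    # Raw records as they arrive; one is incomplete.
    return [
        {"customer_id": "C1", "amount": "120.50"},
        {"customer_id": "C2", "amount": None},
    ]

def check_completeness(records):
    # Fail early: reject incomplete rows here, where the fix is
    # cheapest, instead of discovering them at the end of the chain.
    valid = [r for r in records if r["amount"] is not None]
    return valid, len(records) - len(valid)

def transform(records):
    # Downstream steps only ever see records that passed the control.
    return [{**r, "amount": float(r["amount"])} for r in records]

clean, rejected = check_completeness(acquire())
result = transform(clean)
print(rejected, result)  # the rejected record never reaches transform
```

The same pattern repeats after every stage: each step validates its output before the next step runs, whatever the underlying technology (scripts, ETL, middleware).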

It is therefore difficult to see how a Data Catalog (whose purpose is to inventory and document all potentially usable datasets in order to facilitate data discovery and usage) can be a useful tool to measure and manage quality.

A Data Catalog operates on available datasets, on any system that contains data, and should be as non-invasive as possible in order to be deployed quickly throughout the organization.

A DQM solution works on the data feeds (the pipelines), focuses on production data, and is, by design, intrusive and time-consuming to deploy. We cannot think of any software architecture that could tackle both issues without compromising the quality of either one.

Data Catalog vendors promising to solve your data quality issues are, in our opinion, in a bind and it seems unlikely they can go beyond a “salesy” demo.

As for DQM vendors (who also often sell ETLs), their solutions are often too complex and costly to deploy as credible Data Catalogs.

The good news is that the orthogonal nature of data quality and data cataloging makes it easy for specialized solutions in each domain to coexist without encroaching on each other’s lane.

Indeed, while a data catalog isn't purposed for quality control, it can exploit the information on the quality of the datasets it contains, which obviously provides many benefits.

The Data Catalog uses this metadata, for example, to share the information (and any alerts it may identify) with data consumers. The catalog can also use this information to adjust its search and recommendation engine and thus steer users towards higher-quality datasets.

And both solutions can be integrated at little cost with a couple of APIs here and there.
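As an illustration of that light coupling, with entirely hypothetical dataset names and metadata fields standing in for real vendor HTTP APIs, a DQM tool could publish its scores to the catalog, which then uses them to rank search results:

```python
# Hypothetical integration: the DQM side pushes quality scores into
# the catalog's metadata; the catalog uses them to rank results.
# Dataset names, fields, and functions are illustrative stand-ins
# for real vendor HTTP APIs.

catalog = {
    "sales.orders": {"description": "Order facts", "quality": None},
    "crm.leads": {"description": "Raw leads", "quality": None},
}

def push_quality_score(dataset, score, issues):
    # What a DQM run would POST to the catalog's API.
    catalog[dataset]["quality"] = {"score": score, "issues": issues}

def rank_search_results(names):
    # The catalog steers users towards higher-quality datasets.
    def score(name):
        quality = catalog[name]["quality"]
        return quality["score"] if quality else 0.0
    return sorted(names, key=score, reverse=True)

push_quality_score("sales.orders", 0.98, [])
push_quality_score("crm.leads", 0.55, ["duplicate emails"])
print(rank_search_results(["crm.leads", "sales.orders"]))
```

Each tool keeps its own responsibility: the DQM solution measures, the catalog exposes and ranks.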

Take Away

Data quality needs to be assessed as early as possible in the data pipelines.

The role of the Data Catalog is not to do quality control but to share the results of these controls as widely as possible. By their nature, Data Catalogs are bad DQM solutions, and DQM solutions are mediocre and overly complex Data Catalogs.

An integration between a DQM solution and a Data Catalog is very straightforward and is the most pragmatic approach.

Events

Hybrid Data Conference Recap and Highlights

Traci Curran

June 17, 2021


That’s a Wrap!

Wow! What a wonderful time we had at the 2021 Hybrid Data Conference! Over two days, we showcased amazing demos, customer stories and technology advancements across the Actian portfolio. For those in attendance, we hope you enjoyed the event and the opportunity to see a few of the ways Actian is innovating and enabling our customers to gain greater value from their data at a fraction of the time and cost of other cloud data platforms.

For those who missed the event, here’s a quick recap of some of our most popular sessions.

Some of Our Favorite Sessions from the 2021 Hybrid Data Conference

Delivering on the Vision – Actian Hybrid Data Platform, presented by Emma McGrattan, Actian VP of Engineering

Emma McGrattan, Actian's VP of Engineering, gave an in-depth overview of how Actian products are delivering on the vision of hybrid cloud. Highlighting the Actian Data Platform, Emma showcased how Actian's product portfolio is accelerating cloud adoption and changing the way customers advance along their cloud journey. Whether you're looking to make the shift to the cloud right away or to modernize and preserve investments in critical applications, this session is a great overview of the many options and use cases that support your unique path to the cloud.

Actian on Google Cloud, Presented by Lak Lakshmanan, Google’s Director of Analytics

This brief 15-minute session, presented by Lak Lakshmanan, Google's Director of Analytics and AI Solutions, is a great introduction to why Actian has chosen Google as our preferred cloud. We all love a better-together story, but Lak provides a glimpse from the cloud provider's perspective.

Of course, no conference would be complete without perspectives from our customers. Actian would like to thank all of the customers and partners that made the 2021 Hybrid Data Conference a success.

Actian Customer Panel Featuring Key Customer Speakers from Sabre, Finastra, and Goldstar Software

One Final Highlight


We were delighted to have Greg Williams, Editor-in-Chief of WIRED, deliver his thoughts on why data-driven insights are no longer optional in today's modern world. Greg summarized it best in his presentation: every company is a data company.

Please visit the on-demand conference to hear more of his outstanding commentary on the future of data and how companies are creating advantage in a global economy.

Once again, we want to thank everyone who attended this year's Hybrid Data Conference. We hope you found the networking and content valuable, and we can't wait to see you in 2022 – hopefully in person! Stay safe, and enjoy your summer!


About Traci Curran

Traci Curran is Director of Product Marketing at Actian, focusing on the Actian Data Platform. With 20+ years in tech marketing, Traci has led launches at startups and established enterprises like CloudBolt Software. She specializes in communicating how digital transformation and cloud technologies drive competitive advantage. Traci's articles on the Actian blog demonstrate how to leverage the Data Platform for agile innovation. Explore her posts to accelerate your data initiatives.
Data Intelligence

7 Lies of Data Catalog Providers #1: Not a Data Governance Solution

Actian Corporation

June 16, 2021


The Data Catalog market has developed rapidly, and it is now deemed essential when deploying a data-driven strategy. A victim of its own success, this market has attracted a number of players from adjacent markets. These players have rejigged their marketing positioning to present themselves as Data Catalog solutions.

The reality is that, while relatively weak on the data catalog functionalities themselves, these companies attempt to convince, with degrees of success proportional to their marketing budgets, that a Data Catalog is not merely a high-performance search tool for data teams, but an integrated solution likely to address a host of other topics.

The purpose of this blog series is to deconstruct the pitch of these eleventh-hour Data Catalog vendors.

A Data Catalog is NOT a Data Governance Solution

This is probably our most controversial stance on the role of a Data Catalog, and the controversy originates with the powerful marketing messages pumped out by the world leader in metadata management, whose solution is in reality a data governance platform sold as a Data Catalog.

To be clear, having sound data governance is one of the pillars of an effective data strategy. Governance, however, has little to do with tooling.

Its main purpose is the definition of roles, responsibilities, company policies, procedures, controls, and committees. In a nutshell, its function is to deploy and orchestrate, in its entirety, the internal control of data in all its dimensions.

Let's just acknowledge that data governance has many different aspects (processing and storage architecture, classification, retention, quality, risk, conformity, innovation, etc.) and that there is no universal, one-size-fits-all model adapted to all organizations. As with other governance domains, each organization must conceive and pilot its own landscape based on its capacities and ambitions, as well as a thorough risk analysis.

Putting effective data governance in place is not a project; it is a transformation program.

No commercial “solution” can replace that transformation effort.

So Where Does the Data Catalog fit into All This?

The quest for a Data Catalog is usually the result of a very operational requirement: Once the Data Lake and a number of self-service tools are set up, the next challenge quickly becomes to find out what the Data Lake actually contains (both from a technical and a semantic perspective), where the data comes from, what transformations the data may have incurred, who is in charge of the data, what internal policies apply to the data, who is currently using the data and why etc.

An inability to provide this type of information to end users can have serious consequences for an organization, and a Data Catalog is the best means to mitigate that risk. When selecting such a cross-functional solution, involving people from many different departments, the choice is often handed to those in charge of data governance, as they appear to be in the best position to coordinate the expectations of the largest number of stakeholders.

This is where the alchemy begins. The Data Catalog, whose initial purpose was to provide data teams with a quick solution to discover, explore, understand, and exploit the data, becomes a gargantuan project in which all aspects of governance have to be solved.

The project will be expected to:

  • Manage data quality.
  • Manage personal data and compliance (GDPR first and foremost).
  • Manage confidentiality, security, and data access.
  • Propose a new Master Data Management (MDM).
  • Ensure a field-by-field automated lineage for all datasets.
  • Support all the roles as defined in the system of governance and enable the relevant workflow configuration.
  • Integrate all the business models produced in the last 10 years for the urbanization program.
  • Authorize cross-source querying while complying with user entitlements on those same sources, as well as anonymizing the results.

Certain vendors manage to convince their clients that their solution can be this unique one-stop shop for data governance. If you believe this is possible, by all means call them; they will gladly oblige. But to be frank, we simply do not believe such a platform is possible, or even desirable. Too complex, too rigid, too expensive, and too bureaucratic, this kind of solution can never be adapted to a data-centric organization.

For us, the Data Catalog plays a key role in a data governance program. This role should not involve supporting all aspects of governance but should rather be utilized to facilitate communication and awareness of governance rules within the company and to help each stakeholder become an active part of this governance.

In our opinion, a Data Catalog is one of the components that delivers the biggest return on investment in data-centric organizations that rely on Data Lakes with modern data pipelines…provided it can be deployed quickly and at a reasonable price.

Take Away

A Data Catalog is not a data governance management platform.

Data governance is essentially a transformation program with multiple layers that cannot be addressed by one single solution. In a data-centric organization, the best way to start, learn, educate, and remain agile is to blend clear governance guidelines with a modern Data Catalog that can share those guidelines with the end users.

Data Intelligence

Data Governance Framework | S03-E02 – Start in Under 6 Weeks

Actian Corporation

June 9, 2021

This is the last episode of our third and final season of “The Effective Data Governance Framework”.

Divided into two episodes, this final season will focus on the implementation of metadata management with a data catalog.

In this final episode, we will help you start a 3-6 week data journey and then deliver the first iteration of your Data Catalog.

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management with a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

Metadata Governance Iterations

We are using an iterative approach based on short cycles (6 to 12 weeks at most) to progressively deploy and extend the metadata management initiative in the Data Catalog.

These short cycles make it possible to quickly obtain value. They also provide an opportunity to communicate regularly via the Data Community on each initiative and its associated benefits.

Each cycle is organized in predetermined steps, as follows:

1. Identify the Goal

A perimeter (data, people), a target.

2. Deploy / Connect

Technical configuration of scanners and ability to harvest the information.

Scanners deployed and operational.

3. Conceive and Configure

A metamodel tailored to meet expectations.

4. Import the Items

Define the core (minimum viable) information to properly serve the users.

5. Open and Test

Validate if the effort produced the expected value.

6. Measure the Gains

Fine-grained analysis of the cycle to identify what worked, what didn't, and how to improve the next cycle.
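The six steps above can be sketched as a simple checklist, with the short-cycle guideline enforced up front. This is purely illustrative, not a prescribed tool; the step names mirror the list and the duration bound reflects the 6-to-12-week guideline:

```python
# Illustrative model of one metadata governance cycle: the six steps
# above as a checklist, with the short-cycle guideline enforced.

CYCLE_STEPS = [
    "Identify the goal",
    "Deploy / connect",
    "Conceive and configure",
    "Import the items",
    "Open and test",
    "Measure the gains",
]

def plan_cycle(weeks):
    # Keep cycles short so each one delivers value quickly.
    if not 6 <= weeks <= 12:
        raise ValueError("cycles should last 6 to 12 weeks at most")
    return {"weeks": weeks, "done": {step: False for step in CYCLE_STEPS}}

cycle = plan_cycle(8)
for step in CYCLE_STEPS:
    cycle["done"][step] = True  # each step completed in order

print(all(cycle["done"].values()))  # prints True once the cycle is done
```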

Data Intelligence

Data Strategy: How to Break Down Data Silos

Actian Corporation

June 8, 2021


Whether it comes from product life cycles, marketing, or customer relations, data is omnipresent in the daily life of a company. Customers, suppliers, employees, partners…they all collect, analyze, and exploit data in their own way.

The risk: the appearance of silos. Let's discover why your data is siloed and how to put an end to it.

A company is made up of different business functions that coordinate their actions to win market share and generate profit. Each of these functions fulfills specific missions and collects its own data. Marketing, sales, customer success, communication…all of these entities act on a daily basis and base their actions on their own data.

The problem is that, over the course of their journey, a customer will generate a certain amount of information.

A simple lead becomes a prospect, who then becomes a customer. The same person may be classified differently depending on which part of the business is analyzing the data.

This reality is what we call a data silo. In other words, data is poorly shared or never shared, and therefore too often untapped.

In a study by IDC entitled “The Data-Forward Enterprise” published in December 2020, 46% of French companies forecast a 40% annual growth in the volume of data to be processed over the next two years.

Nearly 8 out of 10 companies consider data governance to be essential. However, only 11% of them believe they are getting the most out of their data. The most common reason for this is data silos.

What are the Major Consequences of Data Silos?

Among the frequent problems linked to data silos, we find first and foremost duplicated data. When each business unit uses data without visibility into the others, what could be more natural?

These duplicates have unfortunate consequences. They distort the knowledge you can have of your products or your customers. This biased, imperfect information often leads to imprecise or even erroneous decisions.

Duplicated data also takes up unnecessary space on your servers, and that storage space represents an additional cost for your company! Beyond the impact of data silos on your company's decisions, strategies, or finances, there is also the organizational deficit.

When your data is in silos, your teams can’t collaborate effectively because they don’t know if they’re mining the same soil.

At a time when collective intelligence is a cardinal value, this is undoubtedly the most harmful effect of data silos.

Does Your Company Suffer From Data Silos?

There are many causes of siloed data. Most often, they are associated with the history of your information systems. Over the years, these systems were built as a patchwork of business applications that were not always designed with interoperability in mind.

Moreover, a company is like a living organism. It welcomes new employees when others leave. In everyday life, spreading data culture throughout the workforce is a challenge! Finally, there is the place of data in the key processes of organizations.

Today, data is central. But go back 5 to 10 years and it was much less so. Now that you know you are suffering from data silos, you need to take action.

How do you get rid of Data Silos?

To get started on the road to eradicating data silos, you need to proceed methodically.

Start by recognizing that the process will inevitably take some time. The prerequisite is creating a detailed map of all your databases and information systems. These can be produced by different tools and solutions: emails, CRMs, various spreadsheets, financial documents, customer invoices, etc.

It is also necessary to identify all your data sources in order to centralize them in a single repository. To do this, you can, for example, build bridges between the silos using specific connectors, also called APIs. The second option is to implement a platform on top of your information system that will centralize all the data.

Working as a data aggregator, this platform will also consolidate data by tracking duplicates and keeping the most recent information. A Data Catalog Solution will prevent the reappearance of data silos once deployed.
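The consolidation step can be sketched as follows. Field names and records are illustrative; the rule is simply to keep, for each duplicate key, the most recently updated record:

```python
# Illustrative consolidation: for each duplicate key (here, email),
# keep only the most recently updated record. Data is made up.

from datetime import date

records = [
    {"email": "jane@example.com", "source": "crm",
     "updated": date(2021, 3, 1), "status": "prospect"},
    {"email": "jane@example.com", "source": "billing",
     "updated": date(2021, 5, 20), "status": "customer"},
    {"email": "li@example.com", "source": "crm",
     "updated": date(2021, 4, 2), "status": "lead"},
]

def consolidate(rows):
    latest = {}
    for row in rows:
        key = row["email"]  # the attribute that identifies a duplicate
        if key not in latest or row["updated"] > latest[key]["updated"]:
            latest[key] = row
    return latest

golden = consolidate(records)
print(golden["jane@example.com"]["status"])  # prints "customer"
```

In a real platform, the duplicate key would typically combine several attributes and apply fuzzy matching, but the keep-the-freshest rule is the same.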

But beware: data quality, optimized circulation between departments, and the coordinated use of data to improve performance also constitute a human project.

Sharing best practices, training, raising awareness – in a word, creating a data culture within the company – will be the key to eradicating data silos once and for all.

Data Intelligence

Essential Keys to a Successful Cloud Migration

Actian Corporation

June 8, 2021


The recent COVID-19 pandemic has brought about major changes in the work culture, and the Cloud is becoming an essential part of that culture by offering employees access to the company’s data, wherever they are. But why migrate? How do you migrate? And for what benefits? Here is an overview:

Head in the clouds and feet on the ground, that’s the promise of the Cloud, which has proven to be an essential tool for business continuity during the health crisis.

In a study conducted by Vanson Bourne at the end of 2020, more than 8 out of 10 business leaders (82%) said they had accelerated their decision to migrate their critical data and business functions to the Cloud after facing the COVID-19 crisis. And 91% of survey participants say they have become more aware of the importance of data in the decision-making process since the crisis began.

Cloud and data. A duo that is now inseparable from business performance.

A reality that is not limited to a specific market: the enthusiasm for Cloud data migration is almost worldwide. The Vanson Bourne study highlights a shared awareness on an international scale, with striking figures:

  • United States (97%)
  • Germany and Japan (93%)
  • United Kingdom (92%)

Finally, 99% of Chinese executives are accelerating their plans to complete their migration to the Cloud. In this context, the question “Why migrate to the Cloud?” has an unequivocal answer: if you don't, your competitors will, and they will beat you to it.

The Main Benefits of Cloud Migration

Ensuring a successful Cloud data migration is first and foremost a question of guaranteeing data availability in all circumstances. Once stated, this benefit leads to many others. If data is accessible everywhere and at all times, a company is able to meet the demand for mobility and flexibility expressed by its employees.

This requirement was fulfilled during the successive lockdowns and should continue as a return to normalcy finally seems possible. Employees who are fully operational at home, in the office, or in the countryside promise not only increased productivity but also a considerable improvement in the user experience. And HR benefits are not the only consequences of Cloud migration.

From a financial point of view, the Cloud opens the way to better control of IT costs. By shifting data from a CAPEX dimension to an OPEX dimension, you can improve the TCO (Total Cost of Ownership) of your information system and your data assets. Beyond a better experience and budget control, the Cloud also opens the way to optimized data availability.

Indeed, when you migrate to the Cloud, your partners make commitments in terms of maintenance and backups that guarantee maximum access to your data. You should therefore pay particular attention to these commitments, which are referred to as SLAs (Service Level Agreements).
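The "nines" in an SLA translate mechanically into permitted downtime. A quick back-of-the-envelope calculation (assuming a 30-day month; the figures are illustrative, not any specific vendor's terms):

```python
# Back-of-the-envelope view of what an availability SLA permits.
# Assumes a 30-day month; figures are illustrative, not vendor terms.

def allowed_downtime_minutes(availability, days=30):
    # Total minutes in the period, times the fraction NOT covered
    # by the availability commitment.
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability)

for sla in (0.99, 0.999, 0.9999):
    print(f"{sla:.2%} availability -> "
          f"{allowed_downtime_minutes(sla):.1f} min downtime/month")
```

So the jump from 99% to 99.99% is the difference between roughly seven hours and a few minutes of permitted downtime per month, which is why reading the SLA closely matters.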

Finally, by migrating data to the cloud, you benefit from the expertise and technical resources of specialized partners who deploy resources that are far superior to those that you could have on your own.

How to Successfully Migrate to the Cloud

Data is, After Human Resources, the Most Valuable Asset of a Company

This is one of the reasons why companies should migrate to the Cloud. But the operation must be carried out under the best conditions to limit the risk of data degradation, as well as temporary unavailability that would impact your business.

To do this, preparation is essential and relies on one prerequisite: the project does not only concern IT teams, but the entire company. 

Support, reassurance, training: the triptych essential to any change management process must be applied. Then make sure you give yourself time. Avoid a Big Bang approach, which could irritate your teams and dampen their enthusiasm. Even if the Cloud migration of your data should go smoothly, put the odds in your favor by backing up your data.

Rely on redundancy to prepare for any eventuality, including (and especially!) the most unlikely. Once the deployment to the cloud is complete, ensure the quality of the experience for your employees. Rigorous long-term project management will let you easily identify whether you need to adjust your initial choices.

The scalability of the Cloud model is a strength that you should seize upon to constantly adapt your strategy.

Data Intelligence

Data Governance Framework | S03-E01 – Importance of Metadata

Actian Corporation

June 2, 2021


This is the first episode of our third and final season of “The Effective Data Governance Framework”.

Divided into two episodes, this final season will focus on the implementation of metadata management with a data catalog.

For this first episode, we will give you the right questions to ask yourself to build a metamodel for your metadata.

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management with a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

In our previous season, we gave you our tips on how to build your Data Office, organize your Data Community, and build your Data Awareness.

In this third season, you will step into the real world of implementing a Data Catalog, building on Seasons 1 and 2, which helped you specify your Data Journey strategy.

In this episode, you will learn how to ask the right questions for designing your Metamodel.

The Importance of Metadata

Metadata management is an emerging discipline that is necessary for enterprises wishing to bolster innovation or regulatory compliance initiatives around their data assets.

Many companies are therefore trying to establish their convictions on the subject and brainstorm solutions to meet this new challenge. As a result, metadata is increasingly being managed, alongside data, in a partitioned and siloed way that does not allow the full, enterprise-wide potential of this discipline to be realized.

Before beginning your data governance implementation, you will have to cover different aspects, ask yourself the right questions and figure out how to answer them.

Our Metamodel Template identifies the main aspects of data governance by asking the right questions; in each case, you decide on their relevance.

These questions can also be used as support for your data documentation model and can provide useful elements to data leaders.

The Who

  • Who created this data?
  • Who is responsible for this data?
  • Who does this data belong to?
  • Who uses this data?
  • Who controls or audits this data?
  • Who is accountable for the quality of this data?
  • Who gives access to this data?

The What

  • What is the “business” definition for this data?
  • What are the associated business rules of this data?
  • What is the security/confidentiality level of this data?
  • What are the acronyms or aliases associated with this data?
  • What are the security/confidentiality rules associated with this data?
  • What is the reliability level (quality, velocity, etc.) of this data?
  • What are the authorized contexts of use (related to confidentiality for example)?
  • What are the (technical) contexts of use possible (or not) for this data?
  • Is this data considered a “Golden Source”?

The Where

  • Where is this data located?
  • Where does this data come from? (a partner, open data, internally, etc.)
  • Where is this data used/shared?
  • Where is this data saved?

The Why

  • Why are we storing this data (rather than simply processing it as a stream)?
  • What is this data’s current purpose/usage?
  • What are the possible usages for this data? (in the future)

The When

  • When was the data created?
  • When was this data last updated?
  • What is this data’s life cycle (update frequency)?
  • How long are we storing this data for?
  • When does this data need to be deleted?

The How

  • How is this data structured (diagram)?
  • How do your systems consume this data?
  • How do you access this data?

Start Defining Your Metamodel Template

These questions can serve as a foundation for building your data documentation model and providing data consumers with the elements that are useful to them.
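As an illustration, the six question groups above could be captured as one record per data asset. The Python sketch below is only a minimal example of such a documentation model; all field names are hypothetical and not from the template itself:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical metamodel entry: one record per data asset, grouping answers
# to the Who/What/Where/When/How questions. Field names are illustrative.
@dataclass
class MetamodelEntry:
    # Who
    owner: Optional[str] = None            # Who does this data belong to?
    steward: Optional[str] = None          # Who is responsible for this data?
    # What
    business_definition: Optional[str] = None
    confidentiality_level: Optional[str] = None
    # Where
    location: Optional[str] = None         # Where is this data located?
    origin: Optional[str] = None           # Where does this data come from?
    # When
    last_updated: Optional[str] = None
    retention: Optional[str] = None        # How long are we storing it?
    # How
    schema_ref: Optional[str] = None       # How is this data structured?

    def documentation_gaps(self):
        """Return the fields still left unanswered for this asset."""
        return [name for name, value in vars(self).items() if value is None]

entry = MetamodelEntry(owner="Sales", business_definition="Monthly revenue per store")
print(entry.documentation_gaps())
```

Running a helper like `documentation_gaps()` across a catalog would surface which assets still lack a steward, a definition, or a retention policy.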


Data Governance Framework | S02-E03 – Data Awareness

Actian Corporation

May 30, 2021


This is the final episode of the second season of “The Effective Data Governance Framework” series.

Divided into three episodes, this second season focuses on Adaptation. This consists of:

  • Organizing your Data Office.
  • Building a data community.
  • Creating Data Awareness.

For this third and final episode of the season, we will help you use awareness-support techniques that reduce the effort needed to carry out communication tasks, make everyone aware of what the Data Governance Team is doing, and get buy-in and alignment at all levels.

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management With a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

In the last episode, we explained how to organize your Data Community by building your Data Chapters and Data Guilds

In this episode, we will help you use awareness-support techniques that reduce the effort needed to carry out communication tasks and create data awareness at the enterprise level.

We advise using the SMART framework to plan and execute your Data Awareness program.

What are SMART Goals?

  • Specific: What do you want to accomplish? Why is this goal important? Who is involved? What resources are involved?
  • Measurable: Are you able to track your progress? How will you know when it’s accomplished?
  • Achievable: Is achieving this goal realistic with effort and commitment? Do you have the resources to achieve this goal? If not, how will you get them?
  • Relevant: Why is this goal important? Does it seem worthwhile? Is this the right time? Does this match efforts/needs?
  • Timely: When will you achieve this goal?

The “SMART” Method for Your Data Teams

If you think about the level of reach a team has, you can summarize it in three spheres:

  • The Control sphere is the one your Data Team can reach and interact with directly.
  • The Influence sphere is the level where you can find sponsors and get help.
  • The Concern sphere consists of the C-levels, who need to be informed of progress from a high-level perspective.

In other words, you will have to reach all the stakeholders involved, but with different means, timing, and interactions.

Spend time creating nice formats, and pay attention to the form of all your artifacts.

Examples of SMART Tasks

You will find below examples of SMART tasks:

For the Control sphere, we advise you to do the following:

  • Deliver trainings (for both Data Governance teams as well as End users).
  • Deliver presentations dedicated to teams (Strategy, OKRs, Roadmap, etc.).
  • Keep your burn-down charts and all visual management tools displayed at any time.

For the Influence sphere, we advise you to:

  • Celebrate your first milestones.
  • Organize sprint demos.
  • Display team OKRs constantly.

And for the Concern sphere, we advise you to:

  • Celebrate the end of a project.
  • Organize product demos.
  • Record videos and make them available.
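To make these tasks trackable, each one could be represented as a small record carrying the SMART attributes. This is a minimal Python sketch; the goals, metrics, and dates are illustrative examples, not from the article:

```python
from dataclasses import dataclass
from datetime import date

# Illustrative SMART task record: every awareness task targets one sphere
# and carries a measurable outcome and a deadline.
@dataclass
class SmartTask:
    sphere: str     # Specific: "control", "influence", or "concern"
    goal: str       # Specific: what do we want to accomplish?
    metric: str     # Measurable: how will we know it is accomplished?
    deadline: date  # Timely: when will we achieve it?

    def is_smart(self) -> bool:
        # Rough check: a task qualifies only if every attribute is filled in.
        return bool(self.sphere and self.goal and self.metric and self.deadline)

plan = [
    SmartTask("control", "Deliver end-user training", "80% of users trained", date(2021, 9, 30)),
    SmartTask("influence", "Organize sprint demos", "one demo per sprint", date(2021, 7, 15)),
    SmartTask("concern", "Celebrate the end of the project", "closing demo recorded", date(2021, 12, 17)),
]

# Every task in the awareness plan should pass the SMART check.
assert all(task.is_smart() for task in plan)
```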

Data Governance Framework | S02-E02 – Data Community

Actian Corporation

May 18, 2021


This is the second episode of the second season of “The Effective Data Governance Framework” series.

Divided into three episodes, this second season focuses on Adaptation. This consists of:

  • Organizing your Data Office.
  • Building a data community.
  • Creating Data Awareness.

For this second episode, we will give you the keys to organizing an efficient and effective data community in your company.

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management With a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

Spotify Feature Teams: A Good Practice or a Failure?

In the last episode, we explained how to build your Data Office with Personas and the Spotify Feature Teams paradigm.

The Spotify model has been criticized because there have been failures at companies that tried to implement it.

The three main reasons were:

  • Autonomy is nice, but it does not mean that teams can do whatever they want; alignment needs to be emphasized.
  • Key results need to be defined at the leadership level, which is why building your OKRs is the right thing to do.
  • Autonomy means accountability: teams have to be measured, and the definition of “Done” for the increments they are working on has to be specified.

In this episode, we will focus on Chapters and Guilds, and how to organize and better leverage your Data Community.

How to Organize Your Chapters and Guilds

Chapters

Collaboration in Chapters and Guilds requires specific knowledge and experience; it is wrong to assume that teams already know agile practices.

When teams grow, they need dedicated support; the Program Managers in charge of data-related topics are therefore accountable for the processes and organization of the Data Community.

At the highest level, organizing your data community means sharing knowledge at all levels: technological, functional, or even specific practices around data related topics.

The main drivers to focus on the Chapters organization are:

  • Teams lack information.
  • Teams lack knowledge.
  • Teams repeat mistakes.
  • Teams need ceremonies and commonly agreed agile practices.

Chapters meet regularly and often.

We advise meeting once a month. When too big, a Chapter can be split into smaller groups. Even if it is a position that can change over time, a Chapter needs a leader, not a manager.

The leader is in charge of facilitating the Chapter and making it efficient by:

  • Getting the right people involved.
  • Sharing outcomes with upper level management.
  • Coordinating and moderating meetings.
  • Helping to establish transparency.
  • Finding a way of sharing and keeping available all the knowledge shared.
  • Defining the Chapter: why, for whom and what it is meant for.

A tip is to define an elevator pitch for the Chapter.

The Chapter leader is also responsible for building a backlog to avoid endless discussions with no outcome.

Typically, the backlog consists of the following topics:

Data Topics

  • Chapter data people culture.
  • Chapter data related topics in continuous improvement.
  • Chapter data practices.
  • Chapter data processes.
  • Chapter data tools.

Generic Topics

  • Chapter continuous improvement.
  • Chapter feedback collection.
  • Chapter agility practices.
  • Chapter generic tools.
  • Chapter information sharing.
  • Chapter education program.

The Chapter Lead is also in charge of communicating outside of their Chapter with other Chapter Leads, and must be allocated time for this facilitation work.

How to Start a Chapter

  • Identify the community and all members.
  • Name the Chapter.
  • Organize the first chapter meeting.
  • Define the elevator statement.
  • Initialize the Chapter web page (and keep it updated for onboarding future new members).
  • Negotiate and build the first backlog.
  • Plan the meetings.

Guilds

Guilds are different: they should be self-organized.

The reason for Guilds to exist is passion, and their teams are built on a purely voluntary basis.

To avoid the syndrome of too many useless meetings, we advise allowing Guilds to meet only in certain circumstances, such as:

  • Trainings and workshops in short formats, such as BBLs (Brown Bag Lunches), on the topics the Guild was built for.
  • Q&A sessions with top executives to emphasize the Why of the data strategy.
  • Hack days to crack a topic.
  • Post mortem meetings after a major issue has occurred.

Data Governance Framework | S02-E01 – Organizing Your Data Office

Actian Corporation

May 17, 2021


This is the first episode of the second season of “The Effective Data Governance Framework” series.

Divided into three episodes, this second season focuses on Adaptation. This consists of:

  • Organizing your Data Office
  • Building a data community  
  • Creating Data Awareness

For this first episode, we will give you the keys to building your data personas and setting up a clear and well-defined Data Office. 

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management With a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

In the first season, we shared our best practices to help you align your data strategy with your company. For us, it is essential to:

  • Assess the maturity of your data.
  • Specify your Data Strategy by building OKRs.
  • Get sponsorship.
  • Build an effective SWOT analysis.

In this first episode, we will teach you how to build your Data Office.

The Evolution of Data Offices in Companies

We believe in Agile Data Governance.

Previous implementations of data governance within organizations have rarely been successful. The Data Office often focuses too much on technical management or a strict control of data.

For data users who strive to experiment and innovate around data, Data Office behavior is often synonymous with restrictions, limitations, and cumbersome bureaucracy.

Some will have gloomy visions of data locked up in dark catacombs, only accessible after months of administrative hassle. Others will recall the wasted energy at meetings, updating spreadsheets and maintaining wikis, only to find that no one was ever benefiting from the fruits of their labor.

Companies today are conditioned by regulatory compliance to guarantee data privacy and security and to ensure risk management.

That said, taking a more proactive approach towards improving the use of data in an organization, by making sure the data is useful, usable, and exploited, is a crucial undertaking.

Using modern organizational paradigms with new ways of interacting is a good way to set up an efficient, flat Data Office organization.

Below are the typical roles of a Data Office, although very often, some roles are carried out by the same person:

  • Chief Data Officer
  • Data-related Portfolio/Program/Project Managers
  • Data Engineers/Architects
  • Data Scientists
  • Data Analysts
  • Data Stewards

Creating Data Personas

An efficient way of specifying the roles of Data Office stakeholders is to work on their personas.

By conducting one-on-one interviews, you will learn a lot about them: context, goals, and expectations. The OKRs map is a good guide for building these personas, as it helps you ask accurate questions.

Here is an example of a persona template:

Some Useful Tips:

  • Personas should be displayed in the office of all Data Office team members.
  • Make it fun, choose an avatar or a photo for each team member, write a small personal and professional bio, list their intrinsic values and work on the look and feel.
  • Build one persona for each person; don’t build personas for teams.
  • Be very precise in the personas definition interviews, rephrase if necessary.
  • Treat people with respect and consider all ideas equally.
  • Print them and put them on the office walls for all team members to see.

Building Cross-Functional Teams

In order to get rid of data and organizational silos, we recommend organizing your Data Office into Feature Teams (see the literature on the Spotify feature team framework).

The idea is to build cross functional teams to address a specific feature expected by your company.

The Spotify Model Defines the Following Teams:

Squads

Squads are cross-functional, autonomous teams that focus on one feature area. Each Squad has a unique mission that guides the work they do.

In season 1, episode 2, in our OKRs example, the CEO has 3 OKRs and the first OKR (Increase online sales by 2%) has generated 2 OKRs:

  • Get the Data Lake ready for growth, handled by the CIO
  • Get the data governed for growth, handled by the CDO.

There would then be 2 squads:

  • Feature 1: get the Data Lake ready for growth
  • Feature 2: get data governed for growth.

Tribes

At the level below, multiple Squads working on the same feature area coordinate with each other. They form a Tribe. Tribes help build alignment across Squads. Each Tribe has a Tribe Leader who is responsible for helping coordinate across Squads and encouraging collaboration.

In our example, for the Squad in charge of the feature “Get Data Governed for growth”, our OKRs map tells us that there is a Tribe in charge of “Get the Data Catalog ready”.

Chapter

Even though Squads are autonomous, it’s important that specialists (Data Stewards, Analysts) align on best practices. Chapters are the family that each specialist has, helping to keep standards in place across a discipline.

Guild

Team members who are passionate about a topic can form a Guild, which essentially is a community of interest (for example: data quality). Anyone can join a Guild and they are completely voluntary. Whereas Chapters belong to a Tribe, Guilds can span different Tribes. There is no formal leader of a Guild. Rather, someone raises their hand to be the Guild Coordinator and help bring people together.

Here is an example of a Feature Team organization:
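Where the original diagram is not reproduced, a rough sketch of such a Feature Team organization can be written down as a plain structure. The Python sketch below uses the two squads from the OKR example; all member roles are hypothetical:

```python
from collections import Counter

# Rough sketch of a Spotify-style Feature Team organization.
# Squad features come from the OKR example; member roles are hypothetical.
org = {
    "tribe": "Get the Data Catalog ready",
    "squads": [
        {"feature": "Get the Data Lake ready for growth",
         "members": ["data engineer", "architect", "data analyst"]},
        {"feature": "Get data governed for growth",
         "members": ["data steward", "data analyst", "program manager"]},
    ],
    # Chapters align specialists across Squads within the Tribe.
    "chapters": ["Data Stewards", "Data Analysts"],
    # Guilds are voluntary communities of interest and can span Tribes.
    "guilds": ["Data Quality"],
}

# Example query: which specialist roles appear in more than one Squad
# (and therefore are natural candidates for a Chapter)?
role_counts = Counter(m for squad in org["squads"] for m in squad["members"])
shared_roles = [role for role, n in role_counts.items() if n > 1]
print(shared_roles)  # ['data analyst']
```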

Don’t miss next week’s S02-E02:

Building Your Data Community, where we will help you adapt your organization in order to become more data-driven.


Data Governance Framework | S01-E04 – SWOT Analysis

Actian Corporation

May 9, 2021


This is the fourth episode of our series “The Effective Data Governance Framework”. The series is split into three seasons; this first season focuses on Alignment: understanding the context, finding the right people, and preparing an action plan for your data-driven journey.

This episode will give you the keys to building a concrete and actionable SWOT analysis.

Season 1: Alignment

Evaluate your Data maturity

Specify your Data strategy

Getting sponsors

Build a SWOT analysis

Season 2: Adapting

Organize your Data Office

Organize your Data Community

Creating Data Awareness

Season 3: Implementing Metadata Management With a Data Catalog

The importance of metadata

6 weeks to start your data governance journey

In our previous episode, we discussed the different means to obtain the right level of sponsorship to ensure endorsement from decision makers.

This week, we will teach you how to build a concrete and actionable SWOT analysis to assess the company Data Governance Strategy in the best possible way.

What is a SWOT Analysis?

Before we give our tips and tricks on building the best SWOT analysis possible, let’s go back and define what a SWOT analysis is. 

A SWOT analysis is a technique used to determine and define your Strengths, Weaknesses, Opportunities, and Threats (SWOT). Here are some examples:

Strengths

This element addresses the things your company or department does especially well. This can be a competitive advantage or a particular attribute of your product or service. An example of a “strength” for a data-driven initiative would be “Great data culture” or “Data shared across the entire company”.

Weaknesses

Once your strengths are listed, it is important to list your company’s weaknesses. What is holding your business or project back? Taking our example, a weakness in your data or IT department could be “Financial limitations”, “Legacy technology”, or even “Lack of a CDO”. 

Opportunities

Opportunities refer to favorable external factors that could give an organization a competitive advantage. Few competitors in your market, emerging needs for your product: all of these are opportunities for a company. In our context, an opportunity could be “Migrating to the Cloud” or “Extra budget for data teams”.

Threats

The final element of a SWOT analysis is Threats: everything that poses a risk to your company itself or to its likelihood of success or growth. For a data team, a threat could be “Stricter regulatory environment for data”, for example.

How to Start Building a Smart SWOT Analysis

Building a good SWOT analysis means adopting a democratic approach that will ensure you don’t miss important topics.

There are 3 principles you should follow:

Gather the Right People

Invite stakeholders from different parts of your Data Governance Team, from Business to IT, including CDO and CPO representatives. You’ll find that different groups within your company will have entirely different perspectives, which will be critical to making your SWOT analysis successful.

Throw Your Ideas Against the Wall

Doing a SWOT analysis consists, in part, of brainstorming meetings. We suggest handing out sticky notes and encouraging the team to generate ideas on their own to start things off. This prevents groupthink and ensures that all voices are heard.

This first ceremony should be no more than 15 minutes of individual brainstorming. Put all the sticky notes up on the wall and group similar ideas together.

You can allot additional time to enable anyone to add notes at this point if someone else’s idea sparks a new thought.

Rank the Ideas

It is now time to rank the ideas. We suggest giving a certain number of points to each participant. Each participant will rate the ideas by assigning points to the ones they consider most relevant. You will then be able to prioritize them with accuracy.
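This point-based ranking can be illustrated with a small tally. In the Python sketch below, the participants, the ideas, and the five-point budget per person are all hypothetical:

```python
from collections import Counter

# Hypothetical dot-voting tally: each participant distributes a fixed
# budget of 5 points across the sticky-note ideas they find most relevant.
votes = {
    "alice": {"Great data culture": 3, "Legacy technology": 2},
    "bob":   {"Legacy technology": 4, "Migrating to the Cloud": 1},
    "carol": {"Great data culture": 2, "Migrating to the Cloud": 3},
}

tally = Counter()
for participant, allocation in votes.items():
    assert sum(allocation.values()) == 5, "each participant gets 5 points"
    tally.update(allocation)

# Ideas ranked by total points, highest first.
ranking = tally.most_common()
print(ranking)
# [('Legacy technology', 6), ('Great data culture', 5), ('Migrating to the Cloud', 4)]
```

The highest-scoring ideas are the ones to prioritize in your SWOT follow-up actions.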
