Data Management

5 Use Cases for Hybrid Cloud Data Management

Actian Corporation

February 25, 2022

With the rise of cloud computing, many organizations are opting to use a hybrid approach to their data management. Even though many companies still rely on on-premises storage, the benefits of having cloud storage as a backup or disaster recovery plan can be significant. This post will give you five of the most popular use cases for hybrid cloud data management.

Why Hybrid Cloud Data Management?

Hybrid cloud data management isn’t a new concept, but it’s finally starting to hit its stride as a viable option for enterprise data management. It utilizes a mixture of on-premises and cloud storage and cloud computing to handle all aspects of a company’s data needs. Often, it’s the merger of on-premises databases or enterprise data warehouses (EDW) with cloud storage, SaaS application data and/or a cloud data warehouse (CDW). The benefits of this hybrid approach are twofold: it provides a backup plan for disaster recovery situations, and it gives an organization the ability to scale up as needed without purchasing additional hardware.

Backup and Disaster Recovery

One of the most obvious benefits of hybrid cloud data management is that it provides a backup for your data. If your on-premises storage system fails or you lose some important data, you can rely on your cloud storage to get it back. It will act as an additional fail-safe plan in case anything happens to your on-site server.

Data Accessibility

Data is not just one homogeneous entity. Many companies can feel hampered by data access. They may not have the in-house expertise or budget to handle the IT demands of data storage and real-time access. Through a hybrid cloud environment, your business can access data and applications stored in both on-premises and off-site locations. Global companies can store data closer to applications or users to improve processing time and reduce latency without having to have local data centers or infrastructure.

Data Analytics

Currently, many businesses are combining internal data sources with external data sources from partners or public sources for improved data analytics. A hybrid data warehouse can allow data teams to combine this third-party data with internal data sources to gain greater insights for decision-making. Data engineers can reduce the amount of effort required to source and combine data needed for users to explore new analytical models.

Data Migration

When an organization migrates its storage to the cloud, it can take advantage of public, private, and hybrid cloud solutions. This means utilizing a host of services, including backup storage, disaster recovery solutions, analytics, and more, all while spending less on infrastructure and avoiding large capital expenses.

Data Compliance

The adoption of a hybrid data warehouse can relieve some of the compliance burdens that can often accompany stored data. For example, retired systems may leave behind orphaned databases, often with useful, historic data. This can create a data gap for analytic teams, but it can also pose a security and compliance risk for the business. Cloud service providers have teams of experts that work with governments and regulators globally to develop standards for things such as data retention times and security measures. Additionally, leveraging the cloud for data storage can also help address the challenges of data residency and data sovereignty regulations, which can become complex as data moves across geographical boundaries.

Regardless of where you are on your cloud journey, data is the most valuable asset to any organization. The cloud is an increasingly important component as businesses look for ways to leverage their data assets to maintain competitive advantage. Learn more about how the Actian Data Platform is helping organizations unlock more value from their data.

About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale, streamlining complex data environments and accelerating the delivery of AI-ready data. The Actian data intelligence approach combines data discovery, metadata management, and federated governance to enable smarter data usage and enhance compliance. With intuitive self-service capabilities, business and technical users can find, understand, and trust data assets across cloud, hybrid, and on-premises environments. Actian delivers flexible data management solutions to 42 million users at Fortune 100 companies and other enterprises worldwide, while maintaining a 95% customer satisfaction score.
Data Platform

Does Your Organization Have a Data Platform Leader? It Could Soon.

Teresa Wingfield

February 17, 2022

There’s no one-size-fits-all solution for a modern data platform, and there likely never will be. With the proliferation of multiple public and private cloud environments, entrenched on-premises data centers, and the exponential rise in edge computing, data sources are multiplying almost at the rate of data itself.

Today’s data platforms increasingly take a broad multi-platform approach that incorporates a wide range of data services (e.g., data warehouse, data lake, transactional database, IoT database, and third-party data services) and integration services that support all major clouds and on-premises platforms, along with the applications that run on and across these environments. Modern data platforms need a data fabric (technology that enables data distributed across different areas to be accessed in real time through a unifying data layer) to drive data flow orchestration, data enrichment, and automation. To meet the varied requirements of users across an organization, including data engineers, data scientists, business analysts, and business users, the platform should also incorporate shared management and security services, as well as support a wide range of application development and analytical tools.

However, these needs create a singular challenge: who’s going to manage the creation and maintenance of such a platform? That’s where the role of the platform leader comes in. Just as we’ve seen the creation of roles like Chief Data Officer and Chief Diversity Officer in response to critical needs, organizations require a highly skilled individual to manage the creation and maintenance of their platform(s). Enter the data platform leader – someone with a broad understanding of databases and streaming technologies, as well as a practical understanding of how to facilitate frictionless access to these data sources, how to formulate a new purpose, vision and mission for the platform and how to form close partnerships with analytics translators. We’ll get to those folks in a minute.

Developing a New Purpose, Vision and Mission

Why must a data platform leader develop a new purpose, vision and mission? Consider this: data warehouse users have traditionally been data engineers, data scientists and business analysts who are interested in complex analytics. These users typically represent a relatively small percentage of an organization’s employees. The power and accessibility of a data platform capable of running not just in the data center, but also in the cloud or at the edge, will invariably bring in a broader base of business users who will use the platform to run simpler queries and analytics to make operational decisions.

However, accompanying these users will be new sets of business and operational requirements. To satisfy this ever-expanding user base and their different requirements, the data platform leader will need to formulate a new purpose for the platform (why it exists), a new vision for the platform (what it hopes to deliver), and a new mission (how it will achieve the vision).

Facilitating Data Service Convergence

Knowledge of relational databases with analytics-optimized schemas and/or analytic databases has long been part of a data warehouse manager’s wheelhouse. However, the modern data platform extends access much further, enabling access to data lakes, transactional and IoT databases, and even streaming data. Increasing demand for real-time insights and for non-relational data that can enable decision intelligence is bringing these formerly distinct worlds closer together. This requires the platform leader to have a broad understanding of databases and streaming technologies as well as a practical understanding of how to facilitate frictionless access to these data sources.

Enabling Frictionless Data Access

A data warehouse typically includes a semantic layer that represents data so end users can access that data using common business terms. A modern data platform, though, demands more. While a semantic layer is valuable, data platform leaders will need to enable more dynamic data integration than is typically sufficient to support a centralized data warehouse design. Enter the data fabric to provide a service layer that enables real-time access to data sourced from the full range of the data platform’s various services. The data fabric offers frictionless access to data from any source located on-premises and in the cloud to support the wide range of analytic and operational use cases that such a platform is intended to serve.

Working With Analytics Translators

I mentioned earlier that data platform leaders would need the ability to form close partnerships with analytics translators. Let’s start with what an analytics translator does and then we’ll get to why a close relationship is important.

According to McKinsey & Company, the analytics translator serves the following purpose:

“At the outset of an analytics initiative, translators draw on their domain knowledge to help business leaders identify and prioritize their business problems, based on which will create the highest value when solved. These may be opportunities within a single line of business (e.g., improving product quality in manufacturing) or cross-organizational initiatives (e.g., reducing product delivery time).”

I expect the analytics translator and the data platform leader will become important partners. The analytics translator will be invaluable in establishing data platform priorities, and the platform leader will provide the analytics translator with key performance indicators (KPIs) on mutually-agreed-upon usage goals.

In conclusion, the data platform leader has many soft and hard skillset requirements in common with a data warehouse manager, but there are a few fundamental and significant differences. The key differences include developing a new purpose, vision and mission, having expertise in new data services and data fabrics, knowing how best to access those services, and possessing the ability to form close partnerships with analytics translators.

About Teresa Wingfield

Teresa Wingfield is Director of Product Marketing at Actian, driving awareness of the Actian Data Platform's integration, management, and analytics capabilities. She brings 20+ years in analytics, security, and cloud solutions marketing at industry leaders such as Cisco, McAfee, and VMware. Teresa focuses on helping customers achieve new levels of innovation and revenue with data. On the Actian blog, Teresa highlights the value of analytics-driven solutions in multiple verticals. Check her posts for real-world transformation stories.
Data Intelligence

What Makes a Data Catalog “Smart”? #5 – User Experience

Actian Corporation

February 16, 2022

A data catalog harnesses enormous amounts of very diverse information, and its volume will grow exponentially. This will raise 2 major challenges:

  • How to feed and maintain the volume of information without tripling (or more) the cost of metadata management?
  • How to find the most relevant datasets for any specific use case?

A data catalog should be Smart to answer these 2 questions, with smart technological and conceptual features that go wider than the sole integration of AI algorithms.

In this respect, we have identified 5 areas in which a data catalog can be “Smart” – most of which do not involve machine learning:

  1. Metamodeling
  2. The data inventory
  3. Metadata management
  4. The search engine
  5. User experience

A data catalog should also be smart in the experience it offers to its different pools of users. Indeed, one of the main challenges with the deployment of a data catalog is its level of adoption from those it is meant for: data consumers. And user experience plays a major role in this adoption.

User Experience Within the Data Catalog

The underlying purpose of user experience is the identification of personas whose behavior and objectives we are looking to model in order to provide them with a slick and efficient graphic interface. Pinning down personas in a data catalog is challenging – it is a universal tool that provides added value for any company regardless of its size, across all sectors of activity anywhere in the world.

Rather than attempting to model personas that are hard to define, it’s possible to handle the situation by focusing on the issue of data cataloging adoption. Here, there are two user populations that stand out:

  • Metadata producers who feed the catalog and monitor the quality of its content – this population is generally referred to as Data Stewards.
  • Metadata consumers who use the catalog to meet their business needs – we will call them Users.

These two groups are not totally unrelated to each other of course: some Data Stewards will also be Users.

The Challenges of Enterprise-Wide Catalog Adoption

The real value of a data catalog resides in large-scale adoption by a substantial pool of (meta) data consumers, not just the data management specialists.

The pool of data consumers is very diverse. It includes data experts (engineers, architects, data analysts, data scientists, etc.), business people (project managers, business unit managers, product managers, etc.), and compliance and risk managers. More generally, all operational managers are likely to leverage data to improve their performance.

Data Catalog adoption by Users is often slowed down for the following reasons:

  • Data catalog usage is sporadic. Users will log on from time to time to obtain very specific answers to specific queries. They rarely have the time or patience to go through a learning curve on a tool they will only use periodically – weeks can go by between catalog sessions.
  • Not everyone has the same stance on metadata. Some will focus more on technical metadata, others will focus heavily on the semantic challenges, and others might be more interested in the organizational and governance aspects.
  • Not everybody will understand the metamodel or the internal organization of the information within the catalog. They can quickly feel put off by an avalanche of concepts that feel irrelevant to their day-to-day needs.

The Smart Data Catalog attempts to jump these hurdles in order to accelerate catalog adoption. Here is how the Actian Data Intelligence Platform meets these challenges.

How the Actian Data Intelligence Platform Facilitates Catalog Adoption

The first solution is the graphic interface. The Users’ learning curve needs to be as short as possible. Indeed, the User should be up and running without the need for any training. To make this possible, we made a number of choices.

The first choice was to provide two different interfaces, one for the Data Stewards and one for the Users:

Studio: The management and monitoring tool for the catalog content – an expert tool solely for the Data Stewards.

Explorer: For the Users, it provides them with the simplest search and exploration experience possible.

Our approach is aligned with the user-friendly principles of marketplace solutions – the recognized specialists in catalog management (in the general sense). These solutions usually have two applications on offer. The first is a “back office” solution that enables the staff of the marketplace (or its partners) to feed the catalog in the most automated manner possible and control its content to ensure its quality. The second application, for the consumers, usually takes the form of an e-commerce website and enables end-users to find articles or explore the catalog. Studio and Explorer reflect these two roles.

The Information is Ranked in Accordance With the Role of the User Within the Organization

Our second choice is still at the experimental stage and consists in dynamically adapting the information hierarchy in the catalog according to User profiles.

This information hierarchy challenge is what differentiates a data catalog from a marketplace type catalog. Indeed, a data catalog’s information hierarchy depends on the operational role of the user. For some, the most relevant information in a dataset will be technical: location, security, formats, types, etc. Others will need to know the data semantics and their business lineage. Others still will want to know the processes and controls that drive data production – for compliance or operational considerations.

The Smart Data Catalog should be able to dynamically adjust the structure of the information to adapt to its different prisms. 
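As a rough illustration of role-dependent information hierarchy, the sketch below reorders the sections of an asset page according to the viewer's role. The section names, role names, and priorities are all hypothetical, chosen only for the example; they are not taken from the platform.

```python
# Hypothetical section names and role priorities, for illustration only.
DEFAULT_ORDER = ["description", "technical", "semantics", "governance"]

ROLE_PRIORITIES = {
    "data_engineer": ["technical"],
    "business_analyst": ["semantics"],
    "compliance_officer": ["governance"],
}


def order_sections(asset_sections, role):
    """Return (section, content) pairs with the role's priority sections first."""
    preferred = ROLE_PRIORITIES.get(role, [])
    ordered = preferred + [s for s in DEFAULT_ORDER if s not in preferred]
    # Skip sections the asset does not have, so sparse assets still render.
    return [(name, asset_sections[name]) for name in ordered if name in asset_sections]
```

An unknown role simply falls back to the default order, so nobody is shown an empty page because their profile has not been classified yet.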

The last remaining challenge is the manner in which the information is organized in the catalog in the form of exploration paths by theme (something similar to shelving in a marketplace). It is difficult to find a structure that agrees with everybody. Some will explore the catalog along technical lines (systems, applications, technologies, etc.). Others will explore the catalog from a more functional perspective (business domains), others still from a semantic angle (through business glossaries, etc.).

The challenge of having everyone agree on a sole universal classification seems (to us) insurmountable. The Smart Data Catalog should be adaptable and should not ask Users to understand a classification that makes no sense to them. Ultimately, user experience is one of the most important success factors for a data catalog.

For more information on how a Smart search engine enhances a Data Catalog, download our eBook: “What is a Smart Data Catalog?”

Data Intelligence

What Makes a Data Catalog “Smart”? #4 – The Search Engine

Actian Corporation

February 16, 2022

A data catalog harnesses enormous amounts of very diverse information, and its volume will grow exponentially. This will raise 2 major challenges:

  • How to feed and maintain the volume of information without tripling (or more) the cost of metadata management?
  • How to find the most relevant datasets for any specific use case?

We think that a data catalog should be Smart to answer these 2 questions, with smart technological and conceptual features that go wider than the sole integration of AI algorithms.

In this respect, we have identified 5 areas in which a data catalog can be “Smart” – most of which do not involve machine learning:

  1. Metamodeling
  2. The data inventory
  3. Metadata management
  4. The search engine
  5. User experience

A Powerful Search Engine for an Efficient Exploration

Given the enormous volumes of data involved in an enterprise catalog, we consider the search engine the principal mechanism through which users can explore the catalog. The search engine needs to be easy to use, powerful, and, most importantly, efficient – the results must meet user expectations. Google and Amazon have raised the bar very high in this respect, and the search experience they offer has become a reference in the field.

This second-to-none search experience can be summed up thus:

  • I write a few words in the search bar, often with the help of a suggestion system that offers frequent associations of terms to help me narrow down my search.
  • The near-instantaneous response provides results in a specific order and I fully expect to find the most relevant one on page one.
  • Should this not be the case, I can simply add terms to narrow the search down even further or use the available filters to cancel out the non-relevant results.

Alas, the best currently on offer in the data cataloging market in terms of search capabilities seems to be limited to capable indexing, scoring, and filtering systems. This approach is satisfactory when the user has a specific idea of what they are looking for (high intent search) but can prove disappointing when the search is more exploratory (low intent search) or when the idea is simply to spontaneously suggest relevant results to a user (no intent).

In short, simple indexation is great for finding information whose characteristics are well known but falls short when the search is more exploratory. The results often include false positives, and the ranking over-represents exact matches.

A Multidimensional Search Approach

We decided from the get-go that a simple indexation system would prove limited and would fall short of providing the most relevant results for the users. We, therefore, chose to isolate the search engine in a dedicated module on the platform and to turn it into a powerful innovation (and investment) zone.

We naturally took an interest in the work of the founders of Google on PageRank, their algorithm. Google's ranking takes into account several dozen aspects (called features), among which are the density of the relations between different graph objects (hypertext links in the case of internet pages), the linguistic treatment of search terms, and the semantic analysis of the knowledge graph.

Of course, we do not have the means Google has, nor its expertise in terms of search result optimization. But we have integrated into our search engine several features that provide a high level of relevant results, and those features are permanently evolving.

We have integrated the following core features:

  • Standard, flat indexation of all the attributes of an object (name, description, and properties), weighted according to the type of property.
  • An NLP layer (Natural Language Processing) that takes into account the near misses (typing or spelling errors).
  • A semantic analysis layer that relies on the processing of the knowledge graph.
  • A personalization layer that currently relies on a simple user classification according to their uses, and will in the future be enriched by individual profiling.
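To make the layering concrete, here is a minimal sketch that combines three of these ideas: per-field weights, tolerance for typing errors, and a simple profile-based boost. It is not the platform's implementation. The weights, the threshold, the `domain` field, and the use of Python's `difflib` as a stand-in for an NLP layer are all assumptions, and the semantic (knowledge graph) layer is left out entirely.

```python
from difflib import SequenceMatcher

# Hypothetical per-field weights: a match in the name counts more than one in the description.
FIELD_WEIGHTS = {"name": 3.0, "description": 1.0, "properties": 0.5}

# Hypothetical boost for assets that belong to a domain the user works in.
PROFILE_BOOST = 1.5


def term_similarity(query_term, word):
    """Return 1.0 for an exact match, or a lower ratio in [0, 1] for near misses."""
    return SequenceMatcher(None, query_term.lower(), word.lower()).ratio()


def score_asset(query, asset, user_domains=frozenset(), fuzzy_threshold=0.8):
    """Combine weighted field matching, typo tolerance, and a simple profile boost."""
    score = 0.0
    for term in query.split():
        for field, weight in FIELD_WEIGHTS.items():
            words = str(asset.get(field, "")).split()
            best = max((term_similarity(term, w) for w in words), default=0.0)
            if best >= fuzzy_threshold:
                score += weight * best
    if asset.get("domain") in user_domains:
        score *= PROFILE_BOOST
    return score


def search(query, assets, user_domains=frozenset()):
    """Return the assets with a positive score, best first."""
    scored = [(score_asset(query, a, user_domains), a) for a in assets]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [asset for score, asset in ranked if score > 0]
```

With this scoring, a misspelled query such as "custmer" still reaches assets containing "customer", and an asset that matches on its name ranks above one that matches only on its description.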

Smart Filtering to Contextualize and Limit Search Results

To complete the search engine, we also provide what we call a smart filtering system. Smart filtering is something we often find on e-commerce websites (such as Amazon, booking.com, etc.) and it consists in providing contextual filters to limit the search result. These filters work in the following way:

  • Only those properties that help reduce the list of results are offered in the list of filters – non-discriminating properties do not show up.
  • Each filter shows its impact – meaning the number of residual results once the filter has been applied.
  • Applying a filter refreshes the list of results instantaneously.
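The three rules above can be sketched in a few lines. This is an illustrative version under simple assumptions: results are plain dictionaries with hashable values, and the function and property names are hypothetical.

```python
from collections import Counter


def contextual_filters(results, properties):
    """Build the filters to offer for the current result list.

    A property value is offered only if applying it would shrink the list,
    and each offered value carries the number of results that would remain.
    """
    total = len(results)
    filters = {}
    for prop in properties:
        counts = Counter(r[prop] for r in results if prop in r)
        # Keep only values that narrow the list (fewer results than we have now).
        useful = {value: n for value, n in counts.items() if n < total}
        if useful:
            filters[prop] = useful
    return filters


def apply_filter(results, prop, value):
    """Return the results that match the chosen filter value."""
    return [r for r in results if r.get(prop) == value]
```

If every result shares the same owner, the owner filter is not offered at all, since selecting it would not reduce the list. Calling `contextual_filters` again on the output of `apply_filter` gives the refreshed set of filters.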

With this combination of multi-dimensional search and smart filtering, we feel that we offer a superior search experience to any of our competitors. And our decoupled architecture enables us to explore new approaches continuously, and rapidly integrate those that seem efficient.

For more information on how a Smart search engine enhances a Data Catalog, download our eBook: “What is a Smart Data Catalog?”

Data Intelligence

What Makes a Data Catalog “Smart”? #3 – Metadata Management

Actian Corporation

February 16, 2022

A data catalog harnesses enormous amounts of very diverse information, and its volume will grow exponentially. This will raise 2 major challenges:

  • How to feed and maintain the volume of information without tripling (or more) the cost of metadata management?
  • How to find the most relevant datasets for any specific use case?

A data catalog should be Smart to answer these 2 questions, with smart technological and conceptual features that go wider than the sole integration of AI algorithms.

In this respect, we have identified 5 areas in which a data catalog can be “Smart” – most of which do not involve machine learning:

  1. Metamodeling
  2. The data inventory
  3. Metadata management
  4. The search engine
  5. User experience

It is in the field of metadata management that the notion of the Smart Data Catalog is most commonly associated with algorithms, machine learning, and AI.

How is Metadata Management Automated?

Metadata management is the discipline of assigning values to the metamodel attributes of the inventoried assets. The workload required is usually proportional to the number of attributes in the metamodel and the number of assets in the catalog.

The role of the Smart Data Catalog is to automate this activity as much as possible, or at the very least to help the human operators (Data Stewards) do so in order to ensure greater productivity and reliability.

As seen in our last article, a smart connectivity layer enables the automation of part of the metadata, but this automation is very much restricted to a limited subset of the metamodel – mostly technical metadata. A complete metamodel, even a modest one, also has dozens of metadata attributes that cannot be extracted from the source systems’ registries (because they are not there to begin with).

To solve this equation, several approaches are possible:

Pattern Recognition

The most direct approach consists in looking to identify patterns in the catalog in order to suggest metadata values for new assets.

Put simply, a pattern will include all the metadata of an asset and the metadata of its relations with other assets or other catalog entities. Pattern recognition is typically done with the help of machine learning algorithms.

The difficulty with this approach lies in representing the information assets in a numerical form that can feed the algorithms and select the relevant patterns. A simple structural analysis is not enough: two datasets can contain identical data but in different structures. Relying on the identity of the data isn’t efficient either: two datasets can contain the same kind of information but with different values. For example, 2020 client invoicing in one dataset, 2021 client invoicing in the other.

In order to solve this problem, the Actian Data Intelligence Platform relies on a technology called fingerprinting. In order to build the fingerprint, we extract two types of features from our clients’ data:

  • A group of features adapted to the numerical data (mostly statistical indicators).
  • Data emanating from word embedding models (word vectorization) for the textual data.
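As a simplified illustration of the first group of features, the sketch below summarizes a numeric column with a handful of statistical indicators and compares two such summaries. The choice of indicators and the use of cosine similarity are assumptions made for the example, not a description of the actual fingerprint. The text side (word embeddings) is omitted because it would require a trained model.

```python
import math
import statistics


def numeric_fingerprint(values):
    """Summarize a non-empty numeric column with a few scale-tolerant indicators."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    lo, hi = min(values), max(values)
    spread = hi - lo
    return [
        stdev / abs(mean) if mean else 0.0,        # coefficient of variation
        (mean - lo) / spread if spread else 0.0,   # where the mean sits within the range
        len(set(values)) / len(values),            # share of distinct values
        sum(float(v).is_integer() for v in values) / len(values),  # share of whole numbers
    ]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Because the indicators describe the shape of the distribution rather than the values themselves, two invoicing columns from different years can produce close fingerprints even though no individual value is shared.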

Fingerprinting is at the heart of our intelligent algorithms.

The Other Embedded Approaches in a Suggestion Engine

While pattern recognition is indeed an efficient approach for suggesting the metadata of a new asset in a catalog, it rests on an important prerequisite: in order to recognize a pattern, there has to be one to recognize. In other words, this only works if there are a number of assets in the catalog (which is obviously not the case at the start of a project).

And it’s precisely in these initial phases of a catalog project that the metadata management load is the highest. It is, therefore, crucial to include other approaches likely to help the Data Stewards in these initial phases, when a catalog is more or less empty.

The Actian Data Intelligence Platform suggestion engine, which provides intelligent algorithms to assist the management of the metadata, also provides other approaches (which we enrich regularly).

Here are some of these approaches:

  • Structural similarity detection.
  • Fingerprint similarity detection.
  • Name approximation.
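Two of these approaches are simple enough to sketch without any training data, which is what makes them useful when the catalog is nearly empty. The version below is a hedged illustration: structural similarity is approximated with a Jaccard index over (column, type) pairs, name approximation with Python's `difflib`, and the asset layout and threshold are hypothetical.

```python
from difflib import SequenceMatcher


def _normalize(name):
    """Lowercase a name and drop common separators before comparing."""
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")


def name_similarity(a, b):
    """Approximate name matching: a ratio in [0, 1]."""
    return SequenceMatcher(None, _normalize(a), _normalize(b)).ratio()


def structural_similarity(schema_a, schema_b):
    """Jaccard similarity between two schemas given as {column_name: type} dicts."""
    cols_a, cols_b = set(schema_a.items()), set(schema_b.items())
    union = cols_a | cols_b
    return len(cols_a & cols_b) / len(union) if union else 0.0


def suggest_metadata(new_asset, catalog, threshold=0.7):
    """Suggest the metadata of the most similar documented asset, if similar enough."""
    best, best_score = None, 0.0
    for asset in catalog:
        score = max(
            name_similarity(new_asset["name"], asset["name"]),
            structural_similarity(new_asset["schema"], asset["schema"]),
        )
        if score > best_score:
            best, best_score = asset, score
    return best["metadata"] if best is not None and best_score >= threshold else None
```

The result is only a suggestion: in this sketch it would be presented to a Data Steward for confirmation rather than written to the catalog automatically.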

This suggestion engine, which analyzes the catalog content in order to determine the probable values of the metadata from the assets that have been integrated, is an ongoing subject of experimentation. We regularly add new approaches, sometimes very simple and sometimes much more sophisticated. In our architecture, it is a dedicated service whose performance improves as the catalog grows and as we enrich our algorithms.

For the Actian Data Intelligence Platform, we have chosen lead time as our main metric for the productivity of the Data Stewards (which is the ultimate objective of smart metadata management). Lead time is a notion that stems from lean management and which measures, in a data catalog context, the time elapsed between the moment an asset is inventoried and the moment all of its metadata has been given a value.
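Lead time as defined here is straightforward to compute once both timestamps are recorded. The sketch below assumes each asset carries an `inventoried_at` timestamp and, once documented, a `completed_at` timestamp. Both field names are hypothetical, and summarizing with the median is our choice for the example rather than something the metric prescribes.

```python
from datetime import datetime
from statistics import median


def lead_time_days(inventoried_at, completed_at):
    """Days elapsed between inventory and the moment all metadata has a value."""
    return (completed_at - inventoried_at).total_seconds() / 86400


def median_lead_time(assets):
    """Median lead time over fully documented assets, or None if there are none."""
    times = [
        lead_time_days(a["inventoried_at"], a["completed_at"])
        for a in assets
        if a.get("completed_at") is not None
    ]
    return median(times) if times else None
```

Assets whose metadata is still incomplete are excluded from the median. Tracking how many of them there are, and for how long, is a useful companion figure.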

For more information on how Smart metadata management enhances a Data Catalog, download our eBook: “What is a Smart Data Catalog?”

Data Intelligence

What Makes a Data Catalog “Smart”? #2 – The Data Inventory

Actian Corporation

February 16, 2022

A data catalog harnesses enormous amounts of very diverse information, and its volume will grow exponentially. This will raise 2 major challenges:

  • How to feed and maintain the volume of information without tripling (or more) the cost of metadata management?
  • How to find the most relevant datasets for any specific use case?

A data catalog should be Smart to answer these 2 questions, with smart technological and conceptual features that go wider than the sole integration of AI algorithms.

In this respect, we have identified 5 areas in which a data catalog can be “Smart” – most of which do not involve machine learning:

  1. Metamodeling
  2. The data inventory
  3. Metadata management
  4. The search engine
  5. User experience

The second way to make a data catalog “smart” is through its inventory. A data catalog is essentially a thorough inventory of information assets, each described by a set of metadata that helps harness the information as efficiently as possible. Setting up a data catalog therefore depends first of all on an inventory of the assets from the different systems.

Automating the Inventory: The Challenges

A declarative approach to building the inventory doesn’t strike us as particularly smart, however well thought out it may be. It involves a lot of work to launch and maintain the catalog, and in a fast-changing digital landscape the initial effort quickly becomes obsolete.

The first step in creating a smart inventory is of course to automate it. With a few exceptions, enterprise datasets are managed by specialized systems (distributed file systems, ERPs, relational databases, software packages, data warehouses, etc.). These systems maintain all the metadata they need to work properly. There is no need to recreate this information manually: you just need to connect to the different registries and synchronize the catalog content with the source systems.

In theory, this should be straightforward, but putting it into practice is rather difficult. The fact is, there is no universal standard that the different technologies follow to expose their metadata.

The Essential Role of Connectivity to the System Sources

A smart connectivity layer is a key part of the Smart Data Catalog. For a more detailed description of the Actian Data Intelligence Platform’s connectivity technology, I recommend reading our previous eBook, The 5 Technological Breakthroughs of a Next-Generation Catalog, but its main characteristics are:

  • Proprietary – We do not rely on third parties so as to maintain a highly specialized extraction of the metadata.
  • Distributed – In order to maximize the reach of the catalog.
  • Open – Anyone looking to enrich the catalog can develop their own connectors with ease.
  • Universal – It can synchronize any source of metadata.

This connectivity can not only read and synchronize the metadata contained in the source registries, it can also produce metadata.

This production of metadata requires more than simple access to the source system registries. It also requires access to the data itself, which will be analyzed by our scanners in order to enrich the catalog automatically.

To date, we produce 2 types of metadata:

  • Statistical analysis: To build a profile of the data – value distribution, rate of null values, top values, etc. (the nature of the metadata depends obviously on the native type of the data being analyzed).
  • Structural analysis: To determine the operational type of specific textual data (email, postal address, social security number, client code, etc. – the system is scalable and customizable).
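As an illustration of these two kinds of produced metadata, the sketch below profiles a column of sampled values and guesses an operational type for textual data. It is a simplified example under stated assumptions: the function names, the two regular expressions, and the 90% match threshold are all hypothetical and are not the platform's scanners.

```python
import re
from collections import Counter


def profile_column(values, top_n=3):
    """Statistical analysis: basic profile of a sampled column."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "row_count": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "distinct_count": len(counts),
        "top_values": counts.most_common(top_n),
    }


# Illustrative patterns only; real detectors would be far more thorough.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_social_security_number": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def detect_operational_type(values, threshold=0.9):
    """Structural analysis: return the first pattern matched by most non-null values."""
    non_null = [str(v) for v in values if v is not None]
    if not non_null:
        return None
    for name, pattern in PATTERNS.items():
        matches = sum(1 for v in non_null if pattern.fullmatch(v))
        if matches / len(non_null) >= threshold:
            return name
    return None


sample = ["ann@example.com", "bob@example.org", None, "ann@example.com"]
print(profile_column(sample))
print(detect_operational_type(sample))
```

Because the patterns live in a plain dictionary, adding a new operational type (a client code format, for instance) only requires registering another expression, which is one simple way to keep such a detector customizable.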

The Inventory Mechanism Must Also be Smart

Our inventory mechanism is also smart in several ways:

  • Dataset detection relies on extensive knowledge of the storage structures, particularly in a Big Data context. For example, an IoT dataset made up of thousands of files of time series measures can be identified as a single dataset (the number of files and their location being only metadata).
  • The inventory is not integrated into the catalog by default, to prevent the import of technical or temporary datasets that would be of little use (either because the data is unusable, or because it is duplicated data).
  • The selection process for the assets that should be imported into the catalog also benefits from some assistance – we strive to identify the most appropriate objects for integration in the catalog (with a variety of additional approaches to make this selection).
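The dataset detection idea in the first bullet can be sketched with a deliberately naive heuristic: collapse every run of digits in a file path into a wildcard, then treat all files that share the resulting pattern as one dataset. The paths and function names below are hypothetical, and a real detector would rely on much richer knowledge of storage layouts than this.

```python
import re
from collections import defaultdict

_DIGITS = re.compile(r"\d+")


def dataset_key(path):
    """Collapse variable numeric parts (dates, partition numbers) into a wildcard."""
    return _DIGITS.sub("*", path)


def group_into_datasets(paths):
    """Group file paths so that each path pattern is reported as a single dataset."""
    groups = defaultdict(list)
    for path in paths:
        groups[dataset_key(path)].append(path)
    # The file count and the file locations become metadata of the dataset.
    return {
        key: {"file_count": len(files), "files": sorted(files)}
        for key, files in groups.items()
    }


paths = [
    "/lake/iot/sensors/2022/01/15/part-0001.csv",
    "/lake/iot/sensors/2022/01/16/part-0001.csv",
    "/lake/crm/customers.parquet",
]
datasets = group_into_datasets(paths)
for pattern, info in datasets.items():
    print(pattern, info["file_count"])
```

This heuristic would also rewrite digits that are part of a meaningful name, so it is best read as an illustration of the principle rather than a usable detector.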

For more information on how Smart Data Inventorying enhances a Data Catalog, download our eBook, “What is a Smart Data Catalog?”

Data Intelligence

What Makes a Data Catalog “Smart”? #1 – Metamodeling

Actian Corporation

February 16, 2022

A data catalog harnesses enormous amounts of very diverse information, and its volume will grow exponentially. This will raise 2 major challenges:

  • How to feed and maintain the volume of information without tripling (or more) the cost of metadata management?
  • How to find the most relevant datasets for any specific use case?

We think that a data catalog should be Smart to answer these 2 questions, with smart technological and conceptual features that go beyond the mere integration of AI algorithms.

In this respect, we have identified 5 areas in which a data catalog can be “Smart” – most of which do not involve machine learning:

  1. Metamodeling
  2. The data inventory
  3. Metadata management
  4. The search engine
  5. User experience

A Universal and Static Metamodel Cannot be Smart

At an enterprise scale, the metadata required to harness information assets in any meaningful way can be considerable. Besides, metadata is specific to each organization, and sometimes even to different populations within an organization. For example, a business analyst won’t necessarily seek the same information as an engineer or a product manager might.

Attempting to create a universal metamodel, therefore, does not seem very smart to us. Indeed, such a metamodel would have to adapt to a plethora of different situations, and will inevitably fall victim to one of the 3 pitfalls below:

  • Excessive simplicity which won’t cover all the use cases needed.
  • Excessive levels of abstraction with the potential to adapt to a number of contexts at the cost of arduous and time-consuming training – not an ideal situation for an enterprise-wide catalog deployment.
  • Levels of abstraction lacking depth, ultimately leading to a multiplicity of concrete concepts born of a combination of notions from a variety of different contexts – many of which will be useless in any specific context, rendering the metamodel needlessly complicated and potentially incomprehensible.

In our view, smart metamodeling should ensure a metamodel that adapts to any context and can be enriched as use cases or maturity levels develop over time.

The Organic Approach to a Metamodel

A metamodel is a field of knowledge and the formal structure of a knowledge model is referred to as an ontology.

An ontology defines a range of object classes, their attributes, and the relationships between them. In a universal model, the ontology is static – the classes, the attributes, and the relations are predefined, with varying levels of abstraction and complexity.

Actian Data Intelligence Platform chose not to rely on a static ontology but rather on a scalable knowledge graph.

The metamodel is therefore deliberately simple at the start – there are only a handful of types, representing the different classes of information assets (data sources, datasets, fields, dashboards), each with a few essential attributes (name, description, contacts).

This metamodel is fed automatically by the technical metadata extracted from the data sources, which varies depending on the technology in question (the technical metadata of a table in a data warehouse differs from the technical metadata of a file in a data lake).

This organic metamodeling is the smartest way to handle the ontology issue in a data catalog. Indeed, it offers several advantages:

  • The metamodel can adapt to each context, often relying on a pre-existing model, integrating the in-house nomenclature and terminology without the need for a long and costly learning curve;
  • The metamodel does not need to be fully defined before using the data catalog – you will only need to focus on a few classes of objects and the few necessary attributes to cover the initial use cases. You can then enrich the model as catalog adoption progresses over time;
  • User feedback can be integrated progressively, improving catalog adoption, and as a result, ensuring return on investment for the metadata management.

Adding Functional Attributes to the Metamodel in Order to Facilitate Searching

There are considerable advantages to this metamodeling approach, but also one major inconvenience: since the metamodel is completely dynamic, it is difficult for the engine to understand the structure, and therefore difficult for it to help users feed the catalog and use the data (two core components of a Smart Data Catalog).

Part of the solution relates to the metamodel and the ontology attributes. Usually, metamodel attributes are defined by their technical types (date, number, character string, list of values, etc.). In the Actian Data Intelligence Platform, the type library does of course include these technical types.

But it also includes functional types – quality levels, confidentiality levels, personal data flags, etc. These functional types enable the platform engine to better understand the ontology, refine the algorithms and adapt the representation of the information.
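The sketch below illustrates how an attribute can carry both a technical type and an optional functional type, so that an engine can reason about a metamodel that grows over time. All class names, enum members, and attribute names are assumptions made for this example; they do not describe the platform's actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class TechnicalType(Enum):
    DATE = "date"
    NUMBER = "number"
    STRING = "string"
    VALUE_LIST = "list of values"


class FunctionalType(Enum):
    QUALITY_LEVEL = "quality level"
    CONFIDENTIALITY_LEVEL = "confidentiality level"
    PERSONAL_DATA = "personal data"


@dataclass
class AttributeDef:
    name: str
    technical_type: TechnicalType
    functional_type: Optional[FunctionalType] = None  # None: purely technical


@dataclass
class AssetClass:
    """One class of information asset (e.g. dataset, field, dashboard)."""
    name: str
    attributes: Dict[str, AttributeDef] = field(default_factory=dict)

    def add_attribute(self, attr):
        # The metamodel is dynamic: attributes can be added at any time.
        self.attributes[attr.name] = attr

    def attributes_with(self, functional_type):
        """Let the engine find attributes by meaning rather than by name."""
        return [a.name for a in self.attributes.values()
                if a.functional_type is functional_type]


# Start simple, with only a few essential attributes...
dataset = AssetClass("dataset")
for attr_name in ("name", "description", "contacts"):
    dataset.add_attribute(AttributeDef(attr_name, TechnicalType.STRING))

# ...then enrich the model as use cases mature.
dataset.add_attribute(
    AttributeDef("sensitivity", TechnicalType.VALUE_LIST,
                 FunctionalType.CONFIDENTIALITY_LEVEL)
)
print(dataset.attributes_with(FunctionalType.CONFIDENTIALITY_LEVEL))
```

Whatever an organization decides to call its confidentiality attribute, an engine can still locate it through the functional type, which is the point of pairing the two kinds of types.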

For more information on how Smart Metamodeling enhances a Data Catalog, download our eBook, “What is a Smart Data Catalog?”

Data Management

Data Democratization: Promise vs. Reality

Teresa Wingfield

February 10, 2022

Enabling universal access to data can create opportunities to generate new revenue and drive operational efficiencies throughout an organization. Even more importantly, data democratization, as it’s known, is crucial to business transformation. For that reason, vendors have made a lot of promises about enabling data democratization—and not all have panned out. For instance, various vendors have touted data for the masses through self-service analytics for many years. The objective has been to make information accessible to non-technical users without requiring IT involvement. Vendors have focused their efforts on shielding users from underlying data complexities, making analytics tools easier to use, and expanding reach to users in any location throughout the world via the cloud.

However, even with simplified access to data, organizations still haven’t made the progress they would like to when it comes to democratizing data. While it has become more common for non-technical users to access data on their own, for the most part, they can only do so in certain situations. Barriers still stand in the way, making it difficult for users to access all the data they need for decision-making.

Here are the top four barriers to data democratization that organizations must overcome in 2022 as they adopt new data platform approaches to help reduce cost and complexity.

1. Users Can’t Access Data in Silos

Organizations typically store data for analytics and decision-making in a centralized data warehouse or similar repository optimized for analytics. But that’s only a subset of all the data that might be useful. Much of it remains sequestered in disparate data silos that most users cannot access. To run the analytics they want and gain insights to inform new programs and processes, users need access to transactional databases, IoT databases, data lakes, streaming data, and more—data that may be spread across multiple data centers and multiple clouds. Several use cases come to mind, including automated personalized e-commerce offers, supply chain optimization, real-time quotes for insurance, credit approval and portfolio management.

2. Today’s Semantic Layers Aren’t Enough

A semantic layer is a business representation of data that helps users access data without IT assistance. Although semantic layers are great at shielding users from the underlying complexities of data, they are designed to represent the data in only one database at a time. Today’s users need a semantic layer that is more ubiquitous to connect to and interact with multiple data sources across multiple locations. As Gartner puts it, users need frictionless access to data—from any source located on-premises and in the cloud.

Data fabrics and data meshes are emerging data architecture designs that can make data more accessible, available, discoverable, and interoperable than a singularly-focused semantic layer can. A data fabric acts as a distributed semantic layer connecting multiple sources of data across multiple locations. A data mesh goes a step further, treating data as a product that is owned by teams who best understand the data and its uses.

3. Lack of Shared Services

Indirectly impacting data democratization is a lack of shared services. The absence of such services means that too much time and resources are spent on separate efforts to manage, maintain, and secure data, which leaves less time to focus on enabling data access and delivering business value to end users. Plus, inconsistencies in security, controls, upgrades, patches, and more—across multiple deployments—often result in time-consuming and costly consequences.

4. Weak Tool Support

The purpose of and value delivered by different types of analytical tools vary greatly, so different users—including data engineers, data scientists, business analysts, and business users—need different tools. Many data warehouse vendors, though, fail to provide flexible analytic and development tool integration, which limits the utility of the tools to users and limits the variety of use cases that a data warehouse can serve.

How to Progress Data Democratization Efforts

To overcome these data democratization challenges, organizations must ensure that business-critical systems can analyze, transact, and connect at their very best using the right tool for the right job. As we head into 2022, now is the time to consider if your data democratization platform is exceeding your expectations and fulfilling your business needs. Actian is leading the way with our data platform approach. The data platform must bring together a wide range of data processing and analytic capabilities that focus on easier access to data and less management overhead. As organizations tackle these challenges, they will be able to generate new revenue and drive operational efficiencies to truly transform their business.

This article was originally published on vmBlog.

About Teresa Wingfield

Teresa Wingfield is Director of Product Marketing at Actian, driving awareness of the Actian Data Platform's integration, management, and analytics capabilities. She brings 20+ years in analytics, security, and cloud solutions marketing at industry leaders such as Cisco, McAfee, and VMware. Teresa focuses on helping customers achieve new levels of innovation and revenue with data. On the Actian blog, Teresa highlights the value of analytics-driven solutions in multiple verticals. Check her posts for real-world transformation stories.
Data Intelligence

Data Fragmentation and How to Overcome It

Actian Corporation

February 10, 2022

Data-driven companies do everything they can to efficiently collect and exploit data. But if they are not careful, they expose themselves to a major risk: data fragmentation. In this article, we look at this threat and how to overcome it.

What is Data Fragmentation?

Data fragmentation refers to the dispersion of an organization’s data assets.

This is mainly due to the creation of technological silos and the scattering of data. The more data you have from different sources and stored in different spaces, the more likely it is to be scattered. When data is scattered, it is particularly difficult to get a comprehensive view of the available data assets, especially to reconcile them.

To meet the challenges of digital transformation, companies have to gradually evolve their strategy. And because the volume of data that businesses generate is exploding, most organizations have opted for private, public, or hybrid clouds. This diversification of information storage has an unwanted side effect: data siloing. Siloing may prevent companies from having global visibility of their information and may lead them to make wrong decisions.

Challenges Related to Data Fragmentation

Fighting against data fragmentation must be a priority for several reasons.

First of all, data fragmentation undermines efforts to develop a true data culture in a company.

Secondly, data fragmentation indirectly distorts the knowledge enterprises have of their customers, products, or ecosystems because it limits their field of vision. Moreover, data fragmentation strongly impacts storage costs: keeping large volumes of data that are poorly exploited, or not exploited at all, is quite costly.

Finally, data fragmentation exposes companies to another major risk: with the proliferation of data from various sources, fragmented and unstructured data multiplies.

If left unchecked, the management of this data can affect business operations, slow down data processes, or worse, increase the risks associated with sensitive data.

Fragmented data can sometimes escape data governance and security strategies, which in turn increases exposure to data breaches. But data fragmentation can be avoided.

What are the Key Steps to Avoid Data Fragmentation?

Is your company ready to start the fight against data fragmentation? 

To start, it is essential to have precise knowledge of all the data available in the organization. To do this, you need to map all of your data assets. Then, you will have to rely on data backup, archiving, and exploration solutions gathered within a single platform. These solutions will give you a global view of all your data, wherever it is stored.

With this view and the vision of a Data Architect, you can then put your data in order and, at the same time, restructure your data storage in the cloud.

Finally, to combat data fragmentation, you’ll need to ensure continuous vigilance over all your data.

 
Data Analytics

Engineered Decision Intelligence: The Best Way Forward Part 2

Teresa Wingfield

February 3, 2022

Part 2: You Need Composable Data and Analytics

In my first blog on engineered decision intelligence, I shared information about what this concept means and why you need it, and then I elaborated on the Gartner recommendation for pairing decision intelligence tools with a common data fabric. But there was a second piece of advice from Gartner: You will need composable data and analytics. That’s the subject I’m covering this time around.

What are Composable Data and Analytics?

Composability is all about using components that work together even though they come from a variety of data, analytics, and AI solutions.* By combining components, according to Gartner, you can create a flexible, user-friendly, and user-tailored experience. Many types of analytical tools exist, and the purpose and value delivered by each vary greatly. Composability enables you to assemble their results to gain new, powerful insights.

4 Ways a Modern Data Warehouse Can Better Support Composable Data and Analytics

A modern data warehouse should provide a platform that can empower all the different users in the enterprise to analyze anything, anywhere, anytime, using whatever combination of components they want to use. Here are a few “arrangement” tips.

1. Extend the Data Warehouse With Transactional and Edge Data Processing Capabilities

Historically, there was a clear distinction between a transactional database and a data warehouse. A transactional database tracks and processes business transactions. A data warehouse, in contrast, analyzes historical data. However, modern needs for real-time insights have brought these formerly distinct worlds ever closer together, to the point where, today, there is a strong demand for mixed workloads that combine transactional processing and analytics. You see this in a range of use cases, from automated personalized e-commerce offers and real-time quotes for insurance to credit approval and portfolio management, to name just a few.

Likewise, decision makers are looking for ways to act faster using data from their billions of connected mobile and Internet of Things (IoT) devices. Predictive maintenance, real-time inventory management, production efficiency, and service delivery are just a few of the many areas where real-time analytics on IoT data can help a company cut costs and drive additional revenues.

Real-time transactional analytics and artificial intelligence-enabled insights from IoT data are likely to play increasingly important roles in many organizations. What we’re seeing today is just the beginning of benefit streams to come. Realizing greater benefits will depend upon an organization’s ability to deliver varied data to decision intelligence solutions.

2. Bring in Any Data Source, Anytime

The real-time needs of engineered decision intelligence mean that analytic tools can no longer rely solely on historical data for insights. Decision makers still want on-demand access to data from traditional batch processing sources, but they also want the ability to act on current trends and real-time behaviors. This requires seamless orchestration, scheduling, and management of real-time streaming data from systems throughout the organization and the Internet that are continuously generating it.

In a world that is evolving, data must be available for analysis regardless of where it lives. Since most companies have some combination of cloud and on-premises applications, the data warehouse needs to integrate with systems in both environments. It also needs to be able to work with any type of data in the environment. Business decision-makers that can gain insights from the real-time analysis of both semi-structured and unstructured data, for example, may be able to seize opportunities more efficiently and increase the probability that strategic initiatives will be successful.**

3. Take Advantage of the Efficiencies Enabled by Containerization

A containerized approach makes analytics capabilities more composable so that they can be more flexibly combined into applications. However, this is more advantageous if the data warehouse architecture itself supports containers. Support is key to enabling an organization to meet the resource demands associated with artificial intelligence, machine learning, streaming analytics, and other resource-intensive decision intelligence processing. These workloads strain legacy data warehouse architectures.

Container deployment represents a more portable and resource-efficient way to virtualize compute infrastructure than virtual machine-based deployment. Because containers virtualize the operating system rather than the underlying hardware, applications require fewer virtual machines and operating systems to run them.

4. Accommodate Any Tool

It’s all well and good if a data warehouse offers its own analytical tools—as long as it can easily accommodate any other tool you might want to use. As I mentioned at the start, the purpose and value delivered by different types of analytical tools vary greatly, and different users—including data engineers, data scientists, business analysts, and business users—need different tools. Look for the flexibility to integrate decision intelligence easily with the data warehouse. Or, if you have unique requirements that require you to build custom applications, look at the development tools the platform supports so that you can achieve the composability that a modern analytics environment requires.

Learn More

If you have found this subject interesting, you may want to check out some of these blogs related to the benefits you can derive from broader decision intelligence composability:

  

* Gartner Top 10 Data and Analytics Trends for 2021
** Semi-structured data is information that does not reside in a relational database but that has some organizational properties that make it easier to analyze (such as XML data). Unstructured data either is not organized in a predefined manner or does not have a predefined data model (examples include Word, PDF, and text files, as well as media logs).

This article was co-authored by Lewis Carr.

Senior strategic vertical industries, horizontal solutions, product marketing, product management, and business development professional focused on Enterprise software, including Data Management and Analytics, Mobile and IoT, and distributed Cloud computing.

Data Intelligence

What is the Difference Between a Data Architect and a Data Engineer?

Actian Corporation

January 31, 2022

The growing importance of data in organizations undergoing digital transformation is redefining the roles and missions of data-driven people within the organization. Among these key profiles are the Data Architect and the Data Engineer. For most people, both of these functions are unclear: although their roles can seem quite similar, their purposes and missions are quite different. 

Because enhancing data is a complex task, organizations must work with the right people: specialists who can create a data-driven culture. It is recommended to hire a Data Architect and a Data Engineer within the data department. Although these two key roles overlap and often lead to confusion, they each fulfill different missions. To know whether or not you should hire a Data Architect or a Data Engineer (or both), it is important to understand their scopes of work to create data synergy.

The Wide Range of Skills of a Data Architect

A Data Architect’s main mission is to organize all the data available within the organization. To do so, they must be able to not only identify and map the data but also prioritize it according to its value, volume, and criticality. Researching, identifying, mapping, prioritizing, segmenting data…the work of a Data Architect is complex and these profiles are particularly sought after. And for good reason. Once this inventory of data has been completed, the Data Architect can define a master plan to rationalize the organization of the data.

A Data Architect intervenes in the first phases of a data project and must therefore lay the foundations for exploiting data in a company. As such, they are an essential link in the value chain of your data teams. Their work is then used by data analysts, data scientists, and, ultimately, by all your employees.

What are the Essential Skills of a Data Engineer?

A Data Engineer follows a Data Architect in this vast task of creating the framework for researching and retrieving data. How do they do this? With their ability to understand and decipher the strengths and weaknesses of the organization’s data sources. As a true field player, they are a key to identifying enterprise-wide data assets. Highly qualified, a Data Engineer is an essential part of a data-driven project.

If a Data Architect designs the organization of the data, the Data Engineer manages it day to day, ensuring that good practices are followed in data processing, modeling, and storage. As part of their mission, a Data Engineer must constantly ensure that all of the processes linked to the exploitation of data in an organization run smoothly. In other words, a Data Engineer guarantees the quality and relevance of the data, while using the framework defined by the Data Architect with whom they must act in concert.

Data Architect vs. Data Engineer: Similar…but Above All, Complementary

A Data Architect and a Data Engineer often follow similar training and have comparable skills in IT development and data exploitation. However, a Data Architect, with their experience in database technology, brings a different value to your data project. With more conceptual contributions, a Data Architect needs to rely on the concrete vision of a Data Engineer. The combination of these key profiles will allow you to fully exploit enterprise data. Indeed, a Data Architect and a Data Engineer work together to conceptualize, visualize, and build a framework for managing data.

This perfect duo will allow any organization to maximize the success of its data projects and, above all, create the conditions for a sustainable, rational, and ROI-driven exploitation of its data.

Data Management

What’s an Edge Data Fabric?

Actian Corporation

January 31, 2022

What’s an Edge Data Fabric?

A data fabric is a data architecture, management practices, and policies to deliver a set of data services that span all these domains and endpoints. Data fabrics provide that framework. They essentially serve as both the translator and the plumbing for data in all its forms, wherever it sits and wherever it needs to go, regardless of whether the data consumer is a human or machine.

Data fabrics aren’t brand new, but they are suddenly getting a lot of attention in IT as companies move to multi-cloud and the edge. That’s because organizations desperately need a framework to manage their data: to move it, secure it, prepare it, govern it, and integrate it into IT systems.

Data fabrics got their start back in the mid-2000s when computing started to spread from data centers into the cloud. They became more popular as organizations embraced hybrid clouds, and today data fabrics are helping to reduce complexities involving data streams moving to and from the network’s edge. But the goalposts have moved: the network’s edge now extends to mobile and IoT devices, collectively labeled “the edge.”

What’s different is where the data will emanate from and how fluid it will be. In other words, mobile and IoT (the edge) will drive data creation. Further, the processing and analysis will happen at various points: on the device, at the gateways, and across the cloud. Perhaps a better term would be Fluid Distributed Data instead of Big Data?

Regardless, more data ultimately translates to more viable business opportunities – particularly given that this new data is generated at the point of action from humans and machines. To take full advantage of the growing amounts of data available to them, enterprises need a way to manage it more efficiently across platforms, from the edge to the cloud and back. They need to process, store, and optimize different types of data that come from different sources with different levels of cleanliness and validity so they can connect it to internal applications and apply business process logic, increasingly aided by artificial intelligence and machine learning models.

It’s a big challenge. One solution enterprises are pursuing now is the adoption of a data fabric. And, as data volumes continue to grow at the network’s edge, that solution will evolve further into what will more commonly be referred to as an edge data fabric.

How Data Fabric Applies to the Edge

Edge computing presents a unique set of challenges for data being generated and processed outside the network core. The devices operating at the edge are themselves getting more complex. Smart devices include networked PLCs that manage solenoids, which in turn control process flows in a chemical plant, as well as pressure sensors that determine the weight of a cargo container and active RFID tags that determine its location. The vast majority of the processing used to take place in the data center, but that has shifted to the point where a larger portion of the processing takes place in the cloud. In both cases, the processing happens on one side of a gateway. The data center was fixed, not virtual, but the cloud is fluid. If you consider the definition of cloud, you can see why a data fabric would be needed in it. Cloud is about fluidity and removing locality, but, like the data center, it’s about processing data associated with applications. We may not care where the Salesforce cloud or Oracle cloud or any other cloud is actually located, but we do care that our data must transit between various clouds and persist in each of them for use in different operations.

Because of all that complexity, organizations have to determine which pieces of the processing are done at which level. There’s an application for each, and for each application there’s a manipulation. And for each manipulation, there’s processing of data and memory management.
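
As a rough illustration of splitting processing across levels, the following Python sketch filters readings on the device, summarizes them at a gateway, and leaves a final check to the cloud. The function names, thresholds, and sample values are all hypothetical and chosen only for this example; a real deployment would place each step according to its own latency, bandwidth, and memory constraints.

```python
from statistics import mean

# Hypothetical raw pressure readings (kPa) captured on an edge device.
RAW_READINGS = [101.2, 101.4, -1.0, 250.9, 101.3, 101.5]


def device_filter(readings, low=0.0, high=200.0):
    """Device level: drop readings outside a plausible range (assumed limits)."""
    return [r for r in readings if low <= r <= high]


def gateway_aggregate(readings):
    """Gateway level: reduce many readings to one compact summary."""
    return {
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
    }


def cloud_analyze(summary, alert_above=150.0):
    """Cloud level: apply business logic to the summary (assumed threshold)."""
    return {**summary, "alert": summary["max"] > alert_above}


# Each level only sees what the level before it chose to pass along.
result = cloud_analyze(gateway_aggregate(device_filter(RAW_READINGS)))
print(result)
```

In this sketch only the small summary dictionary would cross the gateway, which is one reason to push filtering and aggregation toward the edge.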

The point of a data fabric is to handle all the complexity. Spark, for example, would be a key element of a data fabric in the cloud, as it has quickly become the easiest way to support streaming data between various cloud platforms from different vendors. The edge is quickly becoming a new cloud, leveraging the same cloud technologies and standards in combination with new, edge-specific networks such as 5G and Wi-Fi 6. And, like the core cloud, there are richer, more intelligent applications running on each device, on gateways, and on what would have been the equivalent of a data center running in a coat closet on the factory floor, in an airplane, on a cargo ship, and so forth. It stands to reason you will need an edge data fabric analogous to the one that is solidifying in the core cloud.

Edge Data Fabric’s Common Elements

To handle the growing number of data requirements edge devices pose, an edge data fabric has to perform several important functions. It has to be able to:

  • Access many different interfaces: HTTP, MQTT, radio networks, and manufacturing networks.
  • Run in multiple operating environments: most importantly, POSIX-compliant ones.
  • Work with key protocols and APIs: including more recent ones such as REST APIs.
  • Provide JDBC/ODBC database connectivity: for legacy applications and quick-and-dirty connections between databases.
  • Handle streaming data: through widely used technologies such as Spark and Kafka.
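
To make a few of these functions concrete, here is a minimal Python sketch that normalizes messages from two different interfaces into one record shape, streams them one at a time, and writes them to a SQL database. Everything in it is an assumption made for illustration: the payload formats, the function names, and the table layout are hypothetical, and the standard library's sqlite3 module merely stands in for a database that would be reached over JDBC/ODBC in practice.

```python
import json
import sqlite3
from typing import Iterable, Iterator

# Hypothetical payloads as they might arrive over two different interfaces.
HTTP_BODY = '{"device": "plc-7", "pressure_kpa": 101.3}'
MQTT_PAYLOAD = b"rfid-12;container=C-0042"


def from_http(body: str) -> dict:
    """Normalize a JSON body into the common record shape."""
    data = json.loads(body)
    return {"source": "http", "device": data["device"], "value": str(data["pressure_kpa"])}


def from_mqtt(payload: bytes) -> dict:
    """Normalize a delimited binary payload (format assumed for this sketch)."""
    device, _, field = payload.decode("utf-8").partition(";")
    return {"source": "mqtt", "device": device, "value": field.split("=", 1)[1]}


def stream(records: Iterable[dict]) -> Iterator[tuple]:
    """Yield rows one at a time, as a streaming consumer would receive them."""
    for rec in records:
        yield (rec["source"], rec["device"], rec["value"])


# sqlite3 is a stand-in for the JDBC/ODBC-accessible database in the list above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (source TEXT, device TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    stream([from_http(HTTP_BODY), from_mqtt(MQTT_PAYLOAD)]),
)
conn.commit()

rows = conn.execute("SELECT source, device, value FROM readings ORDER BY source").fetchall()
print(rows)
```

The design point is that each interface gets its own small adapter, so the rest of the pipeline only ever deals with one record shape.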

Conclusion

Data fabric is not a single product, platform, or set of services, and neither is edge data fabric. Edge data fabric is an extension of data fabric, but the differences in resources and requirements at the edge mean that managing edge data calls for significant changes. In the next blog we’ll discuss why edge data fabric matters and why now.
