Building a Marketplace for Data Mesh: Facilitating Data Product – Part 1
Actian Germany GmbH
May 28, 2024
Over the past decade, data catalogs have emerged as important pillars in the landscape of data-driven initiatives. However, many vendors on the market fall short of expectations: lengthy timelines, complex and costly projects, bureaucratic data governance models, poor user adoption rates, and low value creation. This discrepancy extends beyond metadata management projects, reflecting a broader failure at the data management level.
Given these shortcomings, a new concept is gaining popularity: the internal marketplace, or what we call the Enterprise Data Marketplace (EDM).
This series of articles presents an excerpt from our Practical Guide to Data Mesh. It explains the value of internal data marketplaces for data product production and consumption, how an EDM supports data mesh exploitation on a larger scale, and how it goes hand-in-hand with a data catalog solution:
- Facilitating data product consumption through metadata
- Setting up an enterprise-level marketplace
- Feeding the marketplace via domain-specific data catalogs
Before diving into the internal marketplace, let’s quickly go back to the notion of a data product, which we believe is the cornerstone of the data mesh and the first step in transforming data management.
Sharing and Exploiting Data Products Through Metadata
As mentioned in our previous series on data mesh, a data product is a governed, reusable, scalable dataset that guarantees data quality and compliance with various regulations and internal rules. Note that this definition is quite restrictive: it excludes other types of products such as machine learning algorithms, models, or dashboards.
While these artifacts should be managed as products, they are not data products. There are other types of products, which could be very generally termed “Analytics Products”, of which data products are one subset.
In practice, an operational data product consists of two things:
- Data – Materialized on a centralized or decentralized data platform, guaranteeing data addressing, interoperability, and access security.
- Metadata – Providing all the necessary information for sharing and using the data.
Metadata ensures consumers have all the information they need to use the product.
It typically covers the following aspects:
- Schema – Providing the technical structure of the data product, data classification, samples, and their origin (lineage).
- Governance – Identifying the product owner(s), its successive versions, its possible deprecation, etc.
- Semantics – Providing a clear definition of the exposed information, ideally linked to the organization’s business glossary and comprehensive documentation of the data product.
- Contract – Defining quality guarantees, consumption modalities (protocols and security), potential usage restrictions, redistribution rules, etc.
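To make the four aspects above more concrete, here is a minimal sketch of how a product team might represent this metadata in Python. All class names, field names, and the `validate` helper are hypothetical and chosen for illustration only; they do not correspond to any particular catalog or marketplace API.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    columns: dict[str, str]                           # column name -> data type
    classification: str                               # e.g. "internal", "confidential"
    lineage: list[str] = field(default_factory=list)  # upstream sources


@dataclass
class Governance:
    owners: list[str]          # product owner(s)
    version: str               # current version of the product
    deprecated: bool = False   # possible deprecation flag


@dataclass
class Semantics:
    definition: str                                          # business definition
    glossary_terms: list[str] = field(default_factory=list)  # links to the glossary


@dataclass
class Contract:
    quality_guarantees: dict[str, str]                           # e.g. freshness, completeness
    protocols: list[str]                                         # consumption modalities
    usage_restrictions: list[str] = field(default_factory=list)  # usage and redistribution limits


@dataclass
class DataProductMetadata:
    name: str
    domain: str
    schema: Schema
    governance: Governance
    semantics: Semantics
    contract: Contract


def validate(product: DataProductMetadata) -> list[str]:
    """Return a list of problems that would make the product hard to consume."""
    problems = []
    if not product.schema.columns:
        problems.append("schema: no columns declared")
    if not product.governance.owners:
        problems.append("governance: no product owner")
    if not product.semantics.definition.strip():
        problems.append("semantics: empty definition")
    if not product.contract.protocols:
        problems.append("contract: no consumption protocol")
    return problems


# Illustrative instance; every value here is invented for the example.
example = DataProductMetadata(
    name="customer_churn_scores",
    domain="marketing",
    schema=Schema(
        columns={"customer_id": "string", "churn_probability": "float"},
        classification="internal",
        lineage=["crm.customers", "billing.invoices"],
    ),
    governance=Governance(owners=["marketing-data@example.com"], version="1.2.0"),
    semantics=Semantics(
        definition="Monthly churn probability per customer.",
        glossary_terms=["Customer", "Churn"],
    ),
    contract=Contract(
        quality_guarantees={"freshness": "refreshed monthly"},
        protocols=["sql"],
        usage_restrictions=["no redistribution outside the company"],
    ),
)

print(validate(example))
```

Keeping the metadata in a typed structure like this is one way to let it be versioned and checked alongside the data and pipelines it describes.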
In the data mesh logic, these metadata are managed by the product team and are deployed according to the same lifecycle as data and pipelines. There remains a fundamental question: where can metadata be deployed?
Using a Data Marketplace to Deploy Metadata
Most organizations already have a metadata management system, usually in the form of a Data Catalog.
But data catalogs, in their current form, have major drawbacks:
- They don’t always support the notion of a data product – it must be more or less emulated with other concepts.
- They are complex to use – designed to catalog a large number of assets with sometimes very fine granularity, they often suffer from a lack of adoption beyond centralized data management teams.
- They mostly impose a single, rigid organization of data, decided and designed centrally – which fails to reflect the variety of different domains or the organization’s evolution as the data mesh expands.
- Their search capabilities are often limited, particularly for exploratory aspects – it’s often necessary to know what you’re looking for to be able to find it.
- The experience they offer sometimes lacks the simplicity users aspire to – search with a few keywords, identify the appropriate data product, and then trigger the operational process of an access request or data delivery.
The internal marketplace, or Enterprise Data Marketplace (EDM), is therefore a new concept gaining popularity in data mesh circles. Like a general-purpose marketplace, the EDM aims to provide a shopping experience for data consumers. It is thus an essential component for exploiting the data mesh on a larger scale – it gives data consumers a simple and effective system to search for and access data products from various domains.
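The shopping experience described above (search with a few keywords, identify the appropriate data product, then trigger an access request) can be sketched as a toy in-memory marketplace. This is only an illustration under simplifying assumptions: the `Marketplace`, `Listing`, and `AccessRequest` names are hypothetical, search is a naive substring match, and a real EDM would sit on top of a proper search index and an approval workflow.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    domain: str
    description: str
    owner: str


@dataclass
class AccessRequest:
    product: str
    requester: str
    status: str = "pending"   # a real workflow would move this to approved/rejected


class Marketplace:
    def __init__(self) -> None:
        self._listings: dict[str, Listing] = {}
        self.requests: list[AccessRequest] = []

    def publish(self, listing: Listing) -> None:
        """Make a data product from any domain discoverable."""
        self._listings[listing.name] = listing

    def search(self, query: str) -> list[Listing]:
        """Rank listings by how many query keywords appear in their text."""
        terms = query.lower().split()
        scored = []
        for listing in self._listings.values():
            text = f"{listing.name} {listing.domain} {listing.description}".lower()
            score = sum(1 for term in terms if term in text)  # naive substring match
            if score:
                scored.append((score, listing.name, listing))
        # Highest score first; ties broken alphabetically by product name.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [listing for _, _, listing in scored]

    def request_access(self, product: str, requester: str) -> AccessRequest:
        """Record an access request for a published product."""
        if product not in self._listings:
            raise KeyError(f"unknown data product: {product}")
        request = AccessRequest(product=product, requester=requester)
        self.requests.append(request)
        return request


# Illustrative usage; all products and addresses are invented.
market = Marketplace()
market.publish(Listing("customer_churn_scores", "marketing",
                       "Monthly churn probability per customer",
                       "marketing-data@example.com"))
market.publish(Listing("orders_daily", "sales",
                       "Daily order totals per customer segment",
                       "sales-data@example.com"))

results = market.search("customer churn")
request = market.request_access(results[0].name, requester="analyst@example.com")
print(results[0].name, request.status)
```

The point of the sketch is the shape of the interaction: the consumer never needs to know which domain owns a product or where it is stored before finding it and asking for access.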
In our next article, learn the different ways to set up an internal data marketplace, and how it is essential for data mesh exploitation.