A Smart Data Catalog, a Must-Have for Data Leaders

Actian Corporation

August 26, 2020

The term “smart data catalog” has become a buzzword over the past few months. When something is described as “smart”, most people automatically think, and rightly so, of machine learning, and so picture a data catalog whose only distinguishing feature is its ML capabilities.

We do not believe that a smart data catalog can be reduced to ML features alone.

In fact, there are many different ways to be “smart”. This article summarizes the talk that Guillaume Bodet, co-founder and CEO of Zeenea, gave at the Data Innovation Summit 2020: “Smart Data Catalogs, A Must-Have for Leaders”.

A Quick Definition of Data Catalog

We define a data catalog as being:

A detailed inventory of all data assets in an organization and their metadata, designed to help data professionals quickly find the most appropriate data for any analytical business purpose.

A data catalog is meant to serve many kinds of end-users, each with different expectations, needs, profiles, and ways of understanding data. These end-users include data analysts, data stewards, data scientists, business analysts, and many more. As more and more people use and work with data, a data catalog must be smart for all of them.

What Does a “Data Asset” Refer to?

An asset, financially speaking, typically appears on the balance sheet with an estimated value. Data assets are just as important as other enterprise assets, and in some cases even more so. The issue is that the value of data assets isn’t always known.

However, there are many ways to tap the value of your data. Enterprises can use their data’s value directly, for example by selling or trading it. Many organizations do this: they clean the data, structure it, and then sell it.

Enterprises can also derive value indirectly from their data. Data assets enable organizations to:

  • Innovate for new products/services.
  • Improve overall performance.
  • Improve product positioning.
  • Better understand markets/customers.
  • Increase operational efficiency.

High-performing enterprises are those that master their data landscape and exploit their data assets in every aspect of their activity.

The Hard Things About Data Catalogs

When your enterprise deals with thousands of data assets, you are usually dealing with:

  • 100s of systems that store internal data (data warehouses, applications, data lakes, datastores, APIs, etc.) as well as external data from partners.
  • 1,000s of datasets, models, and visualizations (data assets) that are composed of thousands of fields.
  • And these fields contain millions of attributes (or metadata)!

Not to mention the hundreds of users using them…

This raises two different questions:

How can I build, maintain, and enforce the quality of my information for my end-users to trust in my catalog?

How can I quickly find data assets for specific use cases?

The answer is in smart data catalogs

We believe that there are five core areas of “smartness” for a data catalog. It must be smart in its:

  • Design: The way users explore the catalog and consume information.
  • User Experience: How it adapts to different profiles.
  • Inventories: Provides a smart and automatic way of inventorying.
  • Search Engine: Supports the different expectations and gives smart suggestions.
  • Metadata management: A catalog that tags and links data together through ML features.

Let’s go into detail for each of these areas:

A Smart Design

Knowledge Graph

A data catalog with smart design uses knowledge graphs rather than static ontologies (a way to classify information, most of the time built as a hierarchy). The problem with ontologies is that they are very hard to build and maintain, and usually only certain types of profiles truly understand the various classifications.

A knowledge graph, on the other hand, represents the different concepts in a data catalog and links objects together through semantic or static links. The idea of a knowledge graph is to build a network of objects and, more importantly, to create semantic or functional relationships between the different assets in your catalog.

Basically, a smart data catalog provides users with a way to find and understand related objects.
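To make the idea concrete, here is a minimal Python sketch of a catalog graph in which objects are nodes and typed links are edges. The class name, the link types, and the asset names are all hypothetical, chosen only for illustration; they do not describe how Zeenea or any other product stores its graph.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy catalog graph: nodes are catalog objects, edges are typed links."""

    def __init__(self):
        # node -> list of (link_type, neighbor)
        self._edges = defaultdict(list)

    def link(self, source, link_type, target):
        # Store the link in both directions so related objects can be
        # discovered from either end.
        self._edges[source].append((link_type, target))
        self._edges[target].append((link_type, source))

    def related(self, node, max_hops=1):
        """Return the objects reachable from `node` within `max_hops` links."""
        seen = {node}
        frontier = [node]
        for _ in range(max_hops):
            next_frontier = []
            for current in frontier:
                for _link_type, neighbor in self._edges.get(current, []):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        seen.discard(node)
        return sorted(seen)


graph = KnowledgeGraph()
graph.link("dataset:orders", "described_by", "glossary:Order")
graph.link("dataset:orders", "feeds", "dashboard:sales")
graph.link("dataset:order_lines", "described_by", "glossary:Order")

print(graph.related("dataset:orders"))
# ['dashboard:sales', 'glossary:Order']
print(graph.related("dataset:orders", max_hops=2))
# ['dashboard:sales', 'dataset:order_lines', 'glossary:Order']
```

Widening `max_hops` is what lets a user who starts from one dataset discover a second dataset that shares the same glossary term, without anyone having to maintain a hierarchy that places the two side by side.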

Adaptive Metamodels

In a data catalog, users will find hundreds of different properties, many of which aren’t relevant to every user. Typically, two types of information are managed:

  • Entities: Plain objects, glossary entries, definitions, models, policies, descriptions, etc.
  • Properties: The attributes that you put on the entities (any additional information such as create date, last updated date, etc.)

The design of the metamodel must serve the data consumer. It needs to adapt to new business cases and must be simple enough for users to maintain and understand. Bonus points if it is easy to create new types of objects and sets of attributes!
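As a rough illustration of what “adaptive” can mean in practice, the Python sketch below lets entity types and their properties be declared at runtime rather than hard-coded. The `Metamodel` and `EntityType` classes and the example properties are assumptions made for this example, not an actual catalog API.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    name: str
    # property name -> expected Python type
    properties: dict = field(default_factory=dict)

    def add_property(self, prop_name, prop_type):
        self.properties[prop_name] = prop_type


class Metamodel:
    def __init__(self):
        self.entity_types = {}

    def define(self, name, **properties):
        """Declare a new entity type with its initial set of properties."""
        entity_type = EntityType(name, dict(properties))
        self.entity_types[name] = entity_type
        return entity_type

    def create(self, type_name, **values):
        """Build an entity, rejecting properties the type does not declare."""
        entity_type = self.entity_types[type_name]
        for key, value in values.items():
            if key not in entity_type.properties:
                raise KeyError(f"{type_name} has no property {key!r}")
            expected = entity_type.properties[key]
            if not isinstance(value, expected):
                raise TypeError(f"{key!r} must be of type {expected.__name__}")
        return {"type": type_name, **values}


model = Metamodel()
model.define("Dataset", owner=str, row_count=int)

# A new business case arrives: add an entity type, then extend it later.
policy = model.define("Policy", regulation=str)
policy.add_property("review_year", int)

entity = model.create("Policy", regulation="GDPR", review_year=2020)
print(entity)
# {'type': 'Policy', 'regulation': 'GDPR', 'review_year': 2020}
```

Because types are plain data, supporting a new kind of object is a matter of one `define` call rather than a schema migration.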

Semantic Attributes

Most of the time, in a data catalog, the metamodel’s attributes are technical properties. The attributes on an object usually have generic types such as text, number, date, list of values, and so on. While this information is necessary, it is not sufficient, because it says nothing about the semantics, or meaning, of the attribute. Semantics matter because, with this information, the catalog can adapt the visualization of the attribute and improve its suggestions to users.
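A small Python sketch shows the difference a semantic type makes. The `Attribute` class, the semantic type names, and the rendering rules are hypothetical; the point is only that two attributes sharing the same technical type can be displayed differently once their meaning is known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    technical_type: str      # e.g. "text", "number", "date"
    semantic_type: str = ""  # e.g. "email", "percentage" (hypothetical labels)


def render(attribute, value):
    """Choose a display form from the semantic type, falling back to plain text."""
    if attribute.semantic_type == "email":
        return f"mailto:{value}"
    if attribute.semantic_type == "percentage":
        return f"{value * 100:.1f}%"
    return str(value)


completeness = Attribute("completeness", "number", "percentage")
row_count = Attribute("row_count", "number")
contact = Attribute("owner_contact", "text", "email")

print(render(completeness, 0.875))          # 87.5%
print(render(row_count, 1200))              # 1200
print(render(contact, "jane@example.com"))  # mailto:jane@example.com
```

Both `completeness` and `row_count` are numbers, yet only the semantic type tells the catalog that one of them should be shown as a percentage.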

In conclusion, there is no one-size-fits-all design for a data catalog, and the design must evolve over time to support new data areas and use cases.

A Smart User Experience

As stated above, a data catalog holds a lot of information and end-users often struggle to find the information of interest to them. Expectations differ between profiles. A data scientist will expect statistical information, whereas a compliance officer expects information on various regulatory policies.

With a smart and adaptive user experience, a data catalog presents the most relevant information to each end-user. The information hierarchy and adjusted search results in a smart data catalog are based on:

  • Static Preferences: Already known in the data catalog if the profile is more focused on data science, IT, etc.
  • Dynamic Profiling: To learn what the end-user usually searches, their interests, and how they’ve used the catalog in the past.
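One simple way to combine the two signals is to score each section of an asset page and sort by that score. The Python sketch below does exactly that; the profile names, section names, and the weight given to static preferences are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical static preferences per profile.
STATIC_PREFERENCES = {
    "data_scientist": {"statistics", "schema"},
    "compliance_officer": {"policies", "retention"},
}


def rank_sections(sections, profile, view_history, static_weight=10):
    """Order catalog sections: static preferences first, then past usage."""
    preferred = STATIC_PREFERENCES.get(profile, set())
    views = Counter(view_history)  # dynamic profiling: what the user opened before

    def score(section):
        return (static_weight if section in preferred else 0) + views[section]

    # sorted() is stable, so sections with equal scores keep their original order.
    return sorted(sections, key=score, reverse=True)


sections = ["description", "statistics", "policies", "lineage", "schema"]

print(rank_sections(sections, "data_scientist", ["lineage", "lineage", "statistics"]))
# ['statistics', 'schema', 'lineage', 'description', 'policies']
print(rank_sections(sections, "compliance_officer", []))
# ['policies', 'description', 'statistics', 'lineage', 'schema']
```

A real catalog would learn far richer signals, but the shape is the same: a known profile sets the starting order, and observed behavior refines it.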

A Smart Inventory System

A data catalog’s adoption is built on trust, and trust can only come if its content is accurate. Because the data landscape moves at a fast pace, the catalog must be connected to operational systems to maintain the first level of metadata on your data assets.

The catalog must synchronize its content with the actual content of the operational systems.

A catalog’s typical architecture relies on scanners that scan your operational systems and bring in and synchronize information from various sources (Big Data, NoSQL, Cloud, Data Warehouse, etc.). The idea is to have universal connectivity so enterprises can scan any type of system automatically and place its assets in the knowledge graph.

In Zeenea, there is an automation layer to bring back the information from the systems to the catalog. It can:

  • Update assets to reflect physical changes.
  • Detect deleted or moved assets.
  • Resolve links between objects.
  • Apply rules to select the appropriate set of attributes and define attribute values.
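The first two capabilities in this list amount to comparing what the catalog knows with what a fresh scan finds. The Python sketch below is one possible way to do that, assuming each asset is identified by a path and summarized by a schema fingerprint; both the data shape and the rule “same fingerprint under a new path means moved” are assumptions for this example, not a description of Zeenea’s automation layer.

```python
def diff_inventory(catalog, scanned):
    """Compare catalog entries with a fresh scan.

    Both arguments map an asset path to a schema fingerprint (hypothetical).
    """
    added = set(scanned) - set(catalog)
    missing = set(catalog) - set(scanned)
    updated = sorted(
        path for path in set(catalog) & set(scanned)
        if catalog[path] != scanned[path]
    )

    # A missing asset whose fingerprint reappears under a new path is
    # treated as moved rather than deleted.
    moved = []
    for old_path in sorted(missing):
        match = next(
            (p for p in sorted(added) if scanned[p] == catalog[old_path]),
            None,
        )
        if match is not None:
            moved.append((old_path, match))
            added.discard(match)

    moved_sources = {old for old, _new in moved}
    return {
        "added": sorted(added),
        "updated": updated,
        "moved": moved,
        "deleted": sorted(missing - moved_sources),
    }


catalog = {
    "db.sales.orders": "a1",
    "db.sales.customers": "b2",
    "db.tmp.export": "c3",
    "db.old.invoices": "d4",
}
scanned = {
    "db.sales.orders": "a9",       # schema changed
    "db.sales.customers": "b2",    # unchanged
    "db.archive.invoices": "d4",   # same fingerprint, new location
    "db.sales.returns": "e5",      # new asset
}

report = diff_inventory(catalog, scanned)
print(report)
# {'added': ['db.sales.returns'], 'updated': ['db.sales.orders'],
#  'moved': [('db.old.invoices', 'db.archive.invoices')], 'deleted': ['db.tmp.export']}
```

Treating a move as a move matters for trust: the documentation attached to the old path can follow the asset instead of being lost and rewritten.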

A Smart Search Engine

In a data catalog, the search engine is one of the most important features. We distinguish between two kinds of searches:

  • High Intent Search: The end-user already knows what they are looking for and has precise information for their query. They either already have the name of the dataset or already know where it is found. High intent searches are commonly used by more data-savvy people.
  • Low Intent Search: The end-user isn’t exactly sure what they are looking for, but wants to discover what they could use in their context. Searches are made through keywords, and users expect the most relevant results to appear.

A smart data catalog must support both types of searches.

It must also provide smart filtering. Filtering is a necessary complement to the user’s search experience (especially for low intent searches), allowing them to narrow their results by excluding attributes that aren’t relevant. As on major platforms such as Google, Booking.com, and Amazon, the filtering options must be adapted to the content of the search and to the user’s profile so that the most pertinent results appear.
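The Python sketch below puts the three ideas side by side: an exact-name lookup for high intent queries, a keyword ranking for low intent queries, and filter options computed from the results themselves. The asset records and the scoring rule (count of matching keywords) are assumptions for illustration; a production search engine would use a proper index and relevance model.

```python
from collections import Counter

# Hypothetical catalog content.
ASSETS = [
    {"name": "sales_orders", "description": "daily customer orders", "type": "dataset"},
    {"name": "customer_churn", "description": "churn model for customer retention", "type": "model"},
    {"name": "orders_dashboard", "description": "weekly orders overview", "type": "visualization"},
]


def search(query, assets):
    # High intent: the query is the exact name of an asset.
    exact = [a for a in assets if a["name"] == query]
    if exact:
        return exact

    # Low intent: rank by how many query keywords appear in name or description.
    keywords = query.lower().split()

    def score(asset):
        text = f'{asset["name"]} {asset["description"]}'.lower()
        return sum(keyword in text for keyword in keywords)

    hits = [(score(a), a) for a in assets]
    hits = [pair for pair in hits if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable for equal scores
    return [asset for _score, asset in hits]


def facets(results, attribute="type"):
    """Count results per attribute value so only relevant filters are offered."""
    return Counter(r[attribute] for r in results)


print([a["name"] for a in search("sales_orders", ASSETS)])
# ['sales_orders']
results = search("customer orders", ASSETS)
print([a["name"] for a in results])
# ['sales_orders', 'customer_churn', 'orders_dashboard']
print(dict(facets(results)))
# {'dataset': 1, 'model': 1, 'visualization': 1}
```

Computing the filters from the result set, rather than from the whole catalog, is what keeps the filtering options adapted to the content of the search.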

Smart Metadata Management

Smart metadata management is usually what we call the “augmented data catalog”, the catalog that has machine learning capabilities that will enable it to detect certain types of data, apply tags, or statistical rules on data.

A way to make metadata management smart is to apply data pattern recognition. Data pattern recognition refers to being able to identify similar assets and rely on statistical algorithms and ML capabilities that are derived from other pattern recognition systems.

This data pattern recognition system helps data stewards set their metadata:

  • Identify duplicates and copy metadata.
  • Detect logical data types (emails, cities, addresses, and so on).
  • Suggest attribute values (recognize documentation patterns to apply to a similar object or a new one).
  • Suggest links – semantic or lineage links.
  • Detect potential errors to help improve the catalog’s quality and relevance.
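As a deliberately simple stand-in for this kind of pattern recognition, the Python sketch below detects a column’s logical data type by testing sampled values against regular expressions. It is rule-based rather than ML-based, and the patterns, type names, and 90% threshold are assumptions chosen for the example.

```python
import re

# Hypothetical, intentionally loose patterns for two logical types.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def detect_logical_type(values, threshold=0.9):
    """Return the first logical type matched by at least `threshold`
    of the non-empty sampled values, or None if nothing qualifies."""
    sample = [v.strip() for v in values if v and v.strip()]
    if not sample:
        return None
    for logical_type, pattern in PATTERNS.items():
        matches = sum(1 for v in sample if pattern.match(v))
        if matches / len(sample) >= threshold:
            return logical_type
    return None


print(detect_logical_type(["ann@example.com", "bob@example.org", ""]))  # email
print(detect_logical_type(["2020-08-26", "2020-09-01"]))                # date
print(detect_logical_type(["Paris", "Lyon"]))                           # None
```

The threshold lets a column with a few dirty values still be recognized, while a column where only half the values match stays untagged and is left to the data steward.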

It also helps data consumers find their assets. The idea is to use some techniques that are derived from content-based recommendations found in general-purpose catalogs. When the user has found something, the catalog will suggest alternatives based both on their profile and pattern recognition.
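A content-based suggestion of this kind can be sketched with a similarity measure over asset metadata. The Python example below uses Jaccard similarity on tags; the asset names and tags are made up, and the sketch covers only the content side of the recommendation, leaving out the user’s profile.

```python
def jaccard(a, b):
    """Share of tags two assets have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_alternatives(selected, catalog, top_n=2):
    """Rank the other assets by tag overlap with the selected one."""
    scored = [
        (jaccard(catalog[selected], tags), name)
        for name, tags in catalog.items()
        if name != selected
    ]
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest similarity first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _score, name in scored[:top_n]]


# Hypothetical catalog: asset name -> set of tags.
tagged_assets = {
    "orders_2020": {"sales", "orders", "emea"},
    "orders_2019": {"sales", "orders", "emea", "archive"},
    "returns_2020": {"sales", "returns"},
    "hr_headcount": {"hr"},
}

print(suggest_alternatives("orders_2020", tagged_assets))
# ['orders_2019', 'returns_2020']
```

Assets with no overlap at all, such as `hr_headcount` here, are dropped rather than shown as weak suggestions.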

Start Your Data Catalog Journey With Zeenea

Zeenea is a 100% cloud-based solution, available anywhere in the world with just a few clicks. By choosing Zeenea Data Catalog, you control the costs associated with implementing and maintaining a data catalog while simplifying access for your teams.

The automatic feeding mechanisms, as well as the suggestion and correction algorithms, reduce the overall costs of a catalog and provide your data teams with quality information in record time.

About Actian Corporation

Actian makes data easy. Our data platform simplifies how people connect, manage, and analyze data across cloud, hybrid, and on-premises environments. With decades of experience in data management and analytics, Actian delivers high-performance solutions that empower businesses to make data-driven decisions. Actian is recognized by leading analysts and has received industry awards for performance and innovation. Our teams share proven use cases at conferences (e.g., Strata Data) and contribute to open-source projects. On the Actian blog, we cover topics ranging from real-time data ingestion, data analytics, data governance, data management, data quality, data intelligence to AI-driven analytics.