As businesses work to reopen and students return, in person, to schools nationwide, track and trace is a critical component to mitigate potential resurgences in COVID-19 cases. The more densely populated a city, the harder track and trace will be. However, getting urban and suburban areas moving again is critical to jumpstarting the US economy.

Track and trace is a tall order for several reasons, including public resistance to everything from wearing masks and social distancing to allowing third-party surveillance of location and identity through mobile devices. Furthermore, officials must decide where track and trace efforts should be concentrated, and how and when to leverage automation versus human investigators.

A combination of IoT, community-oriented, psychology-based messaging, and big data analytics could be the key to successfully reopening the economy.

With a Little Help From My Friends

No matter how many tracers are pressed into service, the task will be insurmountable without a behavior change, driven by a change in mindset. There are current and past programs that provide lessons, guidance and proof that behavior can be changed. For example, anti-smoking campaigns were very segmented, with different messaging for underage smokers versus adults. In both cases, the point was to use social expectations as a means of changing behavior. In the case of anti-drunk driving, ad campaigns were focused on peer pressure and provided a positive behavior recommendation – a designated driver – alongside the consequences of driving drunk.

These programs have been successful, but it has taken years – and in some cases, decades – for society to adapt to new norms and expectations. The key is to leverage as many different channels as possible, using as many unique messages and mechanisms as possible, combined with big data analytics to determine what is and isn’t working.

Mayors, city councils, county commissions and other government institutions will need to look to local organizations, such as travel and tourism, public health information, licensing and inspection, 311 and city services portals to help spread messaging on social distancing and mask-wearing. They’ll also need to leverage existing and new communication channels, such as anonymous tip lines through designated Snapchat and Instagram profiles, online chatbots, and toll-free phone calls, to ensure the public’s cooperation, as well as their willingness to opt-in to automated programs for track and trace.

This is exactly where big data analytics can step in, if the proper data science and underlying data warehouse can be put into place quickly. In most cases, the infrastructure needed to rapidly ingest data from multiple city services – web click streams, transactions, conversations, and other communications requiring analysis – is best stood up on a cloud platform.

IoT: The Front-End Piece to the Puzzle

For those individuals who opt in to having their mobile devices tracked, local governments will need to work with wireless service providers to track location and (masked) identities as devices move from one cell tower to the next, while also using cell towers to triangulate position. Many wireless service providers already have network analytics around per-call measurement data that can be repurposed for this, but cell tower information is only part of the equation.

Identifying locations, location conditions, and who is at a given location based on their cell phone is the other, larger part of the equation. Fortunately, cities already have some IoT infrastructure in place that can be leveraged in support of track and trace programs. For instance, existing video surveillance cameras can be used to evaluate social distancing and mask wearing through facial and movement detection algorithms. This data can also be combined with network analytics data to review footage of those who have come in contact with someone who has COVID-19, confirming whether they had a mask on, for how long, and how far inside the 6-foot perimeter they came.
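The distance half of such a review could be sketched as follows; the trace format (pre-aligned, timestamped coordinates per person) and the helper names are illustrative assumptions, and real tower or camera data would be far noisier.

```python
import math

FEET_PER_METER = 3.28084

def haversine_feet(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in feet."""
    r = 6371000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a)) * FEET_PER_METER

def flag_contacts(trace_a, trace_b, threshold_feet=6.0):
    """Return (timestamp, distance) pairs where two location traces
    were inside the 6-foot perimeter. Each trace maps a timestamp to
    a (lat, lon) sample; only shared timestamps are compared."""
    contacts = []
    for ts in sorted(set(trace_a) & set(trace_b)):
        d = haversine_feet(*trace_a[ts], *trace_b[ts])
        if d < threshold_feet:
            contacts.append((ts, round(d, 1)))
    return contacts
```

Mask detection would come from the video-analytics side; this sketch covers only the proximity check.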

Adding inexpensive IoT solutions can further improve outcome analysis for opt-ins to track and trace and stop-the-spread compliance programs. For example, on trains and buses, seat separation can be monitored either by pressure monitors or LEDs with RF signaling to a local Raspberry Pi in the bus or train car. This then maps out how densely packed seating is and whether or not people are adhering to seating guidelines. This could also be applied to classrooms, movie theaters, and taxi/ridesharing services.
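On the Raspberry Pi side, the density-mapping logic could be as simple as the following sketch; the grid layout, the set-of-occupied-seats input, and the one-empty-seat rule are illustrative assumptions rather than any particular transit agency's guideline.

```python
def seating_compliance(occupied, rows, seats_per_row, min_gap=1):
    """Given occupied seats reported by pressure sensors (a set of
    (row, seat) tuples), compute overall seating density and flag
    pairs of neighbors that violate the minimum empty-seat gap."""
    violations = []
    for row in range(rows):
        seats = sorted(s for r, s in occupied if r == row)
        for a, b in zip(seats, seats[1:]):
            if b - a <= min_gap:  # occupied neighbors too close together
                violations.append(((row, a), (row, b)))
    density = len(occupied) / (rows * seats_per_row)
    return density, violations
```

The same check applies unchanged to classroom, theater, or rideshare seat maps.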

The problem is pressing, but the use of IoT devices and data gives us options and a way to act quickly. The combination of big data analytics and IoT to support comprehensive smart programs gives local governments the chance to get their cities, and their local economies, moving again while better guarding against a potential resurgence of COVID-19 cases this fall.



A Smart Data Catalog, a Must-Have for Data Leaders


The term “smart data catalog” has become a buzzword over the past few months. However, when something is described as “smart,” most people automatically think, understandably, of machine learning capabilities.

We do not believe that a smart data catalog is reduced to only having ML features.

There are many different ways to be “smart.” This article is based on the talk Guillaume Bodet gave at the Data Innovation Summit 2020: “Smart Data Catalogs, A Must-Have for Leaders.”

A Quick Definition of Data Catalog

We define a data catalog as being:

A detailed inventory of all data assets in an organization and their metadata, designed to help data professionals quickly find the most appropriate data for any analytical business purpose.

A data catalog is meant to serve different end-users, each with different expectations, needs, profiles, and ways of understanding data: data analysts, data stewards, data scientists, business analysts, and many others. As more and more people use and work with data, a data catalog must be smart for all of them.

What Does a “Data Asset” Refer to?

An asset, financially speaking, typically appears on the balance sheet with an estimate of its value. Data assets are just as important as, and in some cases more important than, other enterprise assets. The issue is that the value of data assets isn’t always known.

However, there are many ways to tap the value of your data. Enterprises can monetize their data directly, for example by selling or trading it. Many organizations do this: they clean the data, structure it, and then sell it.

Enterprises can also make value indirectly from their data. Data assets enable organizations to:

  • Innovate for new products/services.
  • Improve overall performance.
  • Improve product positioning.
  • Better understand markets/customers.
  • Increase operational efficiency.

High performing enterprises are those that master their data landscape and exploit their data assets in every aspect of their activity.

The Hard Things About Data Catalogs

When your enterprise deals with data at scale, you are likely dealing with:

  • 100s of systems that store internal data (data warehouses, applications, data lakes, datastores, APIs, etc) as well as external data from partners.
  • 1,000s of datasets, models, and visualizations (data assets) that are composed of thousands of fields.
  • And these fields contain millions of attributes (or metadata)!

Not to mention the hundreds of users using them…

This raises two different questions:

How can I build, maintain, and enforce the quality of my information for my end-users to trust in my catalog?

How can I quickly find data assets for specific use cases?

The answer is in smart data catalogs.

We believe there are five core areas of “smartness” for a data catalog. It must be smart in its:

  • Design: The way users explore the catalog and consume information.
  • User Experience: How it adapts to different profiles.
  • Inventories: Provides a smart and automatic way of inventorying.
  • Search Engine: Supports the different expectations and gives smart suggestions.
  • Metadata management: A catalog that tags and links data together through ML features.

Let’s go into detail for each of these areas:

A Smart Design

Knowledge Graph

A data catalog with smart design uses knowledge graphs rather than static ontologies (a way to classify information, most of the time built as a hierarchy).  The problem with ontologies is that they are very hard to build and maintain, and usually only certain types of profiles truly understand the various classifications.

A knowledge graph, on the other hand, represents the different concepts in a data catalog and links objects together through semantic or static links. The idea is to build a network of objects and, more importantly, to create semantic or functional relationships between the different assets in your catalog.

Basically, a smart data catalog provides users with a way to find and understand related objects.
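As a toy illustration of that idea, the sketch below models catalog objects as nodes joined by typed links; the class, relation names, and object names are hypothetical, not any product's API.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal knowledge graph: catalog objects are nodes, and typed
    semantic links (e.g. 'defined_by', 'feeds') are directed edges."""

    def __init__(self):
        self.edges = defaultdict(list)

    def link(self, source, relation, target):
        self.edges[source].append((relation, target))

    def related(self, obj):
        """All objects directly linked to `obj`, with the link type."""
        return self.edges.get(obj, [])

kg = KnowledgeGraph()
kg.link("customer_table", "defined_by", "customer_glossary_entry")
kg.link("customer_table", "feeds", "churn_dashboard")
```

Navigating from a dataset to its glossary definition or downstream dashboards then becomes a simple edge lookup rather than a walk through a rigid hierarchy.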

Adaptive Metamodels

In a data catalog, users will find hundreds of different properties, many of which aren’t relevant to every user. Typically, two types of information are managed:

  • Entities: Plain objects, glossary entries, definitions, models, policies, descriptions, etc.
  • Properties: The attributes that you put on the entities (any additional information such as create date, last updated date, etc.)

The design of the metamodel must serve the data consumer. It needs to be adapted to new business cases and must be simple enough to manage for users to maintain and understand it. Bonus points if it is easy to create new types of objects and sets of attributes!
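A minimal sketch of such an adaptive metamodel, assuming a simple entity-type/property split; the type names and properties are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class EntityType:
    """A metamodel entity type (dataset, glossary entry, policy, ...)
    whose property set can be extended as business cases evolve."""
    name: str
    properties: dict = field(default_factory=dict)  # property -> value type

    def add_property(self, prop, value_type):
        self.properties[prop] = value_type

dataset = EntityType("dataset", {"description": "text", "created": "date"})
dataset.add_property("retention_period", "number")  # a new business need
```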

Semantic Attributes

Most of the time, in a data catalog, the metamodel’s attributes are technical properties: generic types such as text, number, date, list of values, and so on. While this information is necessary, it is not sufficient, because generic types carry no information about the semantics, or meaning, of an attribute. Semantics matter because, with that information, the catalog can adapt how an attribute is visualized and improve suggestions to users.

In conclusion, there is no one-size-fits-all data catalog design; it must evolve over time to support new data areas and use cases.

A Smart User Experience

As stated above, a data catalog holds a lot of information and end-users often struggle to find the information of interest to them. Expectations differ between profiles. A data scientist will expect statistical information, whereas a compliance officer expects information on various regulatory policies.

With a smart, adaptive user experience, a data catalog presents the most relevant information to each end-user. Information hierarchy and adjusted search results in a smart data catalog are based on:

  • Static Preferences: Already known in the data catalog if the profile is more focused on data science, IT, etc.
  • Dynamic Profiling: To learn what the end-user usually searches, their interests, and how they’ve used the catalog in the past.
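Combining the two signals could look like this sketch, where a static domain preference and a dynamic usage history both boost a base relevance score; the weights and field names are illustrative assumptions.

```python
def rank_results(results, profile, history):
    """Re-rank search hits for one user. `results` is a list of
    (asset, base_score, domain) tuples; `profile` holds static
    preferences and `history` counts past uses of each asset."""
    def score(item):
        asset, base, domain = item
        boost = 0.0
        if domain in profile.get("preferred_domains", []):
            boost += 0.3                      # static preference
        boost += 0.1 * history.get(asset, 0)  # dynamic profiling
        return base + boost
    return sorted(results, key=score, reverse=True)
```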

A Smart Inventory System

A data catalog’s adoption is built on trust, and trust can only come if its content is accurate. As the data landscape moves at a fast pace, the catalog must be connected to operational systems to maintain a first level of metadata on your data assets.

The catalog must synchronize its content with the actual content of the operational systems.

A catalog’s typical architecture includes scanners that scan your operational systems and synchronize information from various sources (big data, NoSQL, cloud, data warehouse, etc.). The idea is to have universal connectivity so enterprises can scan any type of system automatically and place the results in the knowledge graph.

In the Actian Data Intelligence Platform, there is an automation layer to bring back the information from the systems to the catalog. It can:

  • Update assets to reflect physical changes.
  • Detect deleted or moved assets.
  • Resolve links between objects.
  • Apply rules to select the appropriate set of attributes and define attribute values.
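The core of such a synchronization pass is a diff between the catalog's view and a fresh scan; a minimal sketch, assuming each asset is summarized by a fingerprint such as a schema hash:

```python
def synchronize(catalog, scanned):
    """Diff the catalog against a fresh scan of an operational system.
    Both arguments map asset name -> fingerprint (e.g. a schema hash).
    Returns assets to add, to update, and to mark as deleted."""
    added   = [a for a in scanned if a not in catalog]
    deleted = [a for a in catalog if a not in scanned]
    updated = [a for a in scanned
               if a in catalog and catalog[a] != scanned[a]]
    return added, updated, deleted
```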

A Smart Search Engine

In a data catalog, the search engine is one of the most important features. We distinguish between two kinds of searches:

  • High Intent Search: The end-user already knows what they are looking for and has precise information for their query. They either already have the name of the dataset or already know where it is found. High intent searches are most common among more data-savvy people.
  • Low Intent Search: The end-user isn’t exactly sure what they are looking for, but want to discover what they could use for their context. Searches are made through keywords and users expect the most relevant results to appear.

A smart data catalog must support both types of searches.

It must also provide smart filtering, a necessary complement to the search experience (especially for low intent searches) that lets users narrow their results by excluding attributes that aren’t relevant. Just as with Google, Booking.com, and Amazon, the filtering options must adapt to the content of the search and the user’s profile so that the most pertinent results appear.
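Adapting facet order to both the result set and the user's profile can be sketched as follows; the single 'domain' facet and the promotion rule are simplifying assumptions.

```python
from collections import Counter

def facets_for(results, user_domains=None):
    """Build a 'domain' filter facet from the current result set,
    promoting values that match the user's preferred domains, then
    ordering the rest by result count."""
    counts = Counter(r["domain"] for r in results)
    return sorted(counts.items(),
                  key=lambda kv: (kv[0] not in (user_domains or []), -kv[1]))
```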

Smart Metadata Management

Smart metadata management is usually what we call the “augmented data catalog”: a catalog with machine learning capabilities that enable it to detect certain types of data, apply tags, or apply statistical rules to data.

A way to make metadata management smart is to apply data pattern recognition. Data pattern recognition refers to being able to identify similar assets and rely on statistical algorithms and ML capabilities that are derived from other pattern recognition systems.

This data pattern recognition system helps data stewards set their metadata:

  • Identify duplicates and copy metadata.
  • Detect logical data types (emails, city, addresses, and so on).
  • Suggest attribute values (recognize documentation patterns to apply to a similar object or a new one).
  • Suggest links – semantic or lineage links.
  • Detect potential errors to help improve the catalog’s quality and relevance.
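The "detect logical data types" step above can be approximated by running simple per-type detectors over a column sample; the regexes and the 90% threshold below are illustrative, and production systems would use richer statistical models.

```python
import re

# Illustrative detectors mapping a logical type to a validation regex.
DETECTORS = {
    "email":    re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "date_iso": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "us_zip":   re.compile(r"^\d{5}(-\d{4})?$"),
}

def detect_logical_type(sample, threshold=0.9):
    """Guess a column's logical type when at least `threshold` of the
    sampled values match one detector; return None if nothing fits."""
    for name, rx in DETECTORS.items():
        hits = sum(1 for value in sample if rx.match(value))
        if sample and hits / len(sample) >= threshold:
            return name
    return None
```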

It also helps data consumers find their assets. The idea is to use some techniques that are derived from content-based recommendations found in general-purpose catalogs. When the user has found something, the catalog will suggest alternatives based both on their profile and pattern recognition.

Start Your Data Catalog Journey

The Actian Data Intelligence Platform is a 100% cloud-based solution, available anywhere in the world with just a few clicks. By choosing the Actian Data Intelligence Platform Data Catalog, you can control the costs associated with implementing and maintaining a data catalog while simplifying access for your teams.

The automatic feeding mechanisms, as well as the suggestion and correction algorithms, reduce the overall cost of a catalog and provide your data teams with quality information in record time.


Summary

This blog explains enterprise data integration (EDI) – the strategic process of merging data across business units, with partners, or during mergers – to create unified, scalable, secure, and insightful data ecosystems that support digital transformation and executive decision-making.

  • Unify siloed data for better insights: By integrating data from disparate sources—on‑premises, cloud, mobile, IoT—an organization gains a centralized, real-time view that enhances analytics, reporting, and C-level dashboards.
  • Boost agility, security & scalability with IPaaS: Modern iPaaS platforms (like Actian DataConnect) support flexible, scalable, and secure integration across hybrid environments—avoiding the high maintenance costs of point-to-point solutions.
  • Enable AI, BI & digital transformation: A robust EDI foundation ensures data availability and quality, empowering AI systems, enterprise BI tools, and digital workflows to drive faster, data‑driven decisions.

Enterprise data integration is the merging of data across two or more organizations. This scenario is most commonly found when companies are going through mergers or acquisitions, and data from the two companies needs to be brought together. Other scenarios for enterprise data integration are joint partnerships (where two or more companies work together under the umbrella of a shared business entity) and integration across different business units within an enterprise conglomerate. In any of these scenarios, effective enterprise data integration requires a combination of navigating organizational politics, effective data architecture, and scalable data integration techniques.

This has become a very important topic for many business and IT executives over the past few years as companies go through the digital transformation of their business processes, leverage virtualized supply chains for delivery, and executive decision-makers become more comfortable leveraging analytics and data to inform decision-making. Siloed organizational structures are being collapsed, siloed business processes are being rationalized and modernized, fragmented IT systems are being consolidated, and siloed data needs to be integrated to provide an all-up view of the enterprise.

Why Enterprise Data Integration is Important

Organizations that fail to implement an enterprise data integration strategy effectively will find it difficult to reap sustainable value from digital transformation initiatives.  Modernized business processes rely on integrated data for efficiencies and as an anchor point for rich user experiences.  Artificial intelligence systems that are being deployed to provide next-generation customer experiences are driven by data.  If artificial intelligence (AI) does not have access to data from across the enterprise, it will be severely limited in the value it can provide.

Executive decision-makers are increasingly looking to enterprise dashboarding solutions like Tableau and Microsoft Power BI to provide them with all-up views across their organizations.  These enterprise business intelligence platforms are intended to provide integrated dashboards that span different source systems – obscuring the complexity introduced by IT system implementations and enabling the executive to view data in business terms.  Executives don’t want to go to multiple dashboards and reporting systems; they want all their data in one place, integrated, organized, and curated into views that they can easily understand.   If they have these views, they can quickly identify issues needing their attention with supporting information so they can make responsive decisions.  Without integrated enterprise data, the executive’s ability to act is compromised, and enterprise business agility suffers.

Why You Need an IPaaS Solution to Enable Enterprise Data Integration

Yes, you can achieve enterprise data integration through point-to-point connections – but you are going to spend more time and money than you need to achieve the desired outcomes, and it is going to cost you more to operate and maintain over time. If you want to be agile, if you want your data to be secure, if you want your solution to be scalable and flexible – you need to build your enterprise data integration on an IPaaS (Integration Platform as a Service) foundation.

An IPaaS solution provides three key characteristics that are critical for achieving a successful, sustainable enterprise data integration capability.

  1. Flexibility – Ability to connect anything, anytime, anywhere. Your enterprise is diverse, and if you are connecting data across organizations, flexibility is even more important. On-prem, in the cloud, mobile devices, deployed infrastructure, IoT – IPaaS supports it all.  You need the ability to integrate anything because, eventually, you will want to integrate everything.
  2. Scalability – As your business grows and you modernize your IT systems, your data footprint will grow exponentially. Your IPaaS solution needs to be able to scale to support this growth without sacrificing performance or overburdening you with overhead costs.
  3. Secure Management – Information Security is top of mind for modern companies. An IPaaS solution provides a centralized place to manage integrations, source system credentials, and monitor the flow of data across your organization.  IPaaS can help you lower your infosec risk.

Actian DataConnect is a leading IPaaS solution that provides the technology platform you’ll need to achieve your enterprise data integration objectives.  Through a highly scalable hybrid deployment model, robust integration design capabilities, and automated deployment capabilities – DataConnect can help you deliver more effectively and faster than other solutions. To learn more, visit DataConnect.