Data Intelligence

The Journey to Data Mesh – Part 3 – Creating Your First Data Products

Actian Corporation

April 22, 2024

While the literature on data mesh is extensive, it often describes a final state, rarely how to achieve it in practice. The question then arises:

What approach should be adopted to transform data management and implement a data mesh?

In this series of articles, get an excerpt from our Practical Guide to Data Mesh where we propose an approach to kick off a data mesh journey in your organization, structured around the four principles of data mesh (domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance) and leveraging existing human and technological resources.

Throughout this series of articles, and in order to illustrate this approach for building the foundations of a successful data mesh, we will rely on an example: that of the fictional company Premium Offices – a commercial real estate company whose business involves acquiring properties to lease to businesses.

In the initial articles of the series, we’ve identified the domains, defined an initial use case, and assembled the team responsible for its development. Now, it’s time to move on to the second data mesh principle, “data as a product,” by developing the first data products.

The Product-Thinking Approach of the Mesh

Over the past decade, domains have often developed a product culture around their operational capabilities. They offer their products to the rest of the organization as APIs that can be consumed and composed to develop new services and applications. In some organizations, teams strive to provide the best possible experience to developers using their domain APIs: search in a global catalog, comprehensive documentation, code examples, sandbox environments, guaranteed and monitored service levels, etc.

These APIs are then managed as products that are born, evolve over time (without breaking compatibility), are enriched, and are eventually deprecated, usually replaced by a newer, more modern, more performant version.

The data mesh proposes to apply this same product-thinking approach to the data shared by the domains.

Data Products Characteristics

In some organizations, this product-oriented culture is already well established. In others, it will need to be developed or introduced. But let’s not be mistaken:

A data product is not a new digital artifact requiring new technical capabilities (like an API Product). It is simply the result of a particular data management approach exposed by a domain to the rest of the organization.

Managing APIs as a product did not require a technological breakthrough: existing middleware did the job just fine. Similarly, data products can be deployed on existing data infrastructures, whatever they may be. Technically, a data product can be a simple file in a data lake with an SQL interface; a small star schema, complemented by a few views facilitating querying, instantiated in a relational database; or even an API, a Kafka stream, an Excel file, etc.
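
For instance, the "simple file in a data lake with an SQL interface" case can be sketched with BigQuery's external tables. A minimal sketch, assuming the google-cloud-bigquery client, with a hypothetical project, bucket, and table:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-data-platform")  # hypothetical project

# Expose Parquet files sitting in a data lake through an SQL interface,
# without moving or duplicating the data.
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = ["gs://my-lake/sales/2024/*.parquet"]  # illustrative bucket

table = bigquery.Table("my-data-platform.products.sales_2024")
table.external_data_configuration = external_config
client.create_table(table, exists_ok=True)

# Consumers now query the file like any other table.
rows = client.query(
    "SELECT COUNT(*) AS n FROM `my-data-platform.products.sales_2024`"
).result()
```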

A data product is not defined by how it is materialized but by how it is designed, managed, and governed; and by a set of characteristics allowing its large-scale exploitation within the organization.

These characteristics are often condensed into the acronym DATSIS (Discoverable, Addressable, Trustworthy, Self-describing, Interoperable, Secure).

In addition, obtaining a DATSIS data product does not require significant investments. It involves defining a set of global conventions that domains must follow (naming, supported protocols, access and permission management, quality controls, metadata, etc.). The operational implementation of these conventions usually does not require new technological capabilities – existing solutions are generally sufficient to get started.

An exception, however, is the catalog. It plays a central role in the deployment of the data mesh by allowing domains to publish information about their data products, and consumers to explore, search, understand, and exploit these data products.

Best Practices for Data Product Design

Designing a data product is certainly not an exact science – a given use case might call for a single product, or for three or four. To guide this choice, it is once again useful to leverage some best practices from distributed architectures – a data product must:

  • Have a single and well-defined responsibility.
  • Have stable interfaces and ensure backward compatibility (see the sketch after this list).
  • Be usable in several different contexts and therefore support polyglotism.
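
To make the second practice concrete, here is a minimal sketch of a stable consumption interface, assuming BigQuery as in the Premium Offices example later in this article; all project, dataset, and column names are illustrative:

```python
from google.cloud import bigquery

client = bigquery.Client(project="po-dp-brokerage-tenancy")  # hypothetical project

# The internal star schema can evolve freely; only the view is public.
# Renaming an internal column becomes a simple aliasing change in the view,
# so consumers keep a stable, backward-compatible contract.
client.query("""
CREATE OR REPLACE VIEW `po-dp-brokerage-tenancy.public.monthly_rent_by_tenant` AS
SELECT
  t.tenant_id,
  t.tenant_name,
  d.month,
  SUM(f.rent_amount) AS total_rent
FROM `po-dp-brokerage-tenancy.internal.fact_lease` AS f
JOIN `po-dp-brokerage-tenancy.internal.dim_tenant` AS t USING (tenant_key)
JOIN `po-dp-brokerage-tenancy.internal.dim_date` AS d USING (date_key)
GROUP BY t.tenant_id, t.tenant_name, d.month
""").result()
```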

Data Products Developer Experience

Developer experience is also a fundamental aspect of the data mesh, with the ambition to converge the development of data products and the development of services or software components. It’s not just about being friendly to engineers but also about responding to a certain economic rationality:

The decentralization of data management implies that domains have their own resources to develop data products. In many organizations, the centralized data team is not large enough to support distributed teams. To ensure the success of the data mesh, it is essential to be able to draw from the pool of software engineers, which is often larger.

The state of the art in software development relies on a high level of automation: declarative allocation of infrastructure resources, automated unit and integration testing, orchestrated build and deployment via CI/CD tools, Git workflows for source and version management, automatic documentation publishing, etc.
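
As an illustration of the testing part, a unit test for a pipeline step can be as lightweight as the following pytest-style sketch; the transformation function and its rules are invented for the example:

```python
import pandas as pd

def normalize_tenants(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pipeline step: drop rows without an ID, tidy up names."""
    out = raw.dropna(subset=["tenant_id"]).copy()
    out["tenant_name"] = out["tenant_name"].str.strip().str.title()
    return out

def test_normalize_tenants_drops_rows_without_id():
    raw = pd.DataFrame(
        {"tenant_id": [1, None], "tenant_name": [" acme corp ", "ghost"]}
    )
    result = normalize_tenants(raw)
    assert list(result["tenant_name"]) == ["Acme Corp"]
    assert result["tenant_id"].notna().all()
```

Run with pytest in CI so every change to the pipeline is checked automatically.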

The development of data products should converge toward this state of the art – and depending on the organization’s maturity, its teams, and its technological stack, this convergence will take more or less time. The right approach is to automate as much as possible using existing and mastered tools, then identify operations that are not automated to gradually integrate additional tooling.

In practice, here is what constitutes a data product (a descriptor sketch follows the list):

  1. Code, first – For the pipelines that feed the data product with data from different sources or other data products; for any consumption APIs of the data product; for testing pipelines and controlling data quality; etc.
  2. Data, of course – But most often, the data already exists in other systems and is simply extracted and transformed by pipelines. It is therefore not present in the source code (with rare exceptions).
  3. Metadata – Some of which document the data product: schema, semantics, syntax, quality, lineage, etc. Others are intended to ensure product governance at the mesh scale – contracts, responsibilities, access policies, usage restrictions, etc.
  4. Infrastructure – Or more precisely, the declaration of the physical resources necessary to instantiate the data product: deployment and execution of code, deployment of metadata, resource allocation for storage, etc.
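
To make these four components tangible, here is a sketch of what a declarative data product descriptor might contain, expressed as a plain Python structure; data mesh prescribes no standard format, so every field name below is an assumption:

```python
# Hypothetical descriptor, versioned alongside the product's source code.
# It bundles the product's identity, public interfaces, governance metadata,
# and declarative infrastructure needs in one place.
TENANCY_ANALYTICS = {
    "name": "tenancy-analytics",
    "domain": "brokerage",
    "owner": "dpo.brokerage@premium-offices.example",
    "interfaces": [  # the public, versioned consumption points
        {"type": "bigquery-view", "name": "public.monthly_rent_by_tenant", "version": "1"},
    ],
    "contract": {  # metadata for governance at the mesh scale
        "update_frequency": "daily",
        "quality_checks": ["no_null_tenant_id", "positive_rent_amount"],
        "classification": "internal",
    },
    "infrastructure": {  # resources the platform should provision
        "storage": "bigquery",
        "pipeline": "dbt",
        "schedule": "0 4 * * *",  # daily at 04:00
    },
}
```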

On the infrastructure side, the data mesh does not require new capabilities – the vast majority of organizations already have a data platform. Nor does implementing the data mesh require a centralized platform. Some companies have already invested in a common platform, and it seems logical to leverage its capabilities to develop the mesh. But others have several platforms, with some entities or domains running their own infrastructure. It is entirely possible to deploy the data mesh on these hybrid infrastructures: as long as the data products respect common standards for addressability, interoperability, and access control, the technical modalities of their execution matter little.

Premium Offices Example:

To establish an initial framework for the governance of its data mesh, Premium Offices has set the following rules:

  • A data product materializes as a dedicated project in BigQuery – this allows setting access rules at the project level, or more finely if necessary. These projects will be placed in a “data products” directory and a sub-directory bearing the name of the domain to which they belong (in our example, “Brokerage”).
  • Data products must offer views to access data – these views provide a stable consumption interface and potentially allow evolving the internal model of the product without impacting its consumers.
  • All data products must identify data using common references for common data (Clients, Products, Suppliers, Employees, etc.) – this simplifies cross-referencing data from different data products (LEI, product code, UPC, EAN, email address, etc.).
  • Access to data products requires strong authentication based on GCP’s IAM capabilities – using a service account is possible, but each user of a data product must then have a dedicated service account. When access policies depend on users, the end user’s identity must be used via OAuth2 authentication.
  • The norm is to grant access only to views – and not to the internal model (see the sketch after this list).
  • Access requests are processed by the Data Product Owner through workflows established in ServiceNow.
  • DBT is the preferred ETL for implementing pipelines – each data product has a dedicated repository for its pipeline.
  • A data product can be consumed either via the JDBC protocol or via BigQuery APIs (read-only).
  • A data product must define its contract – data update frequency, quality levels, information classification, access policies, and usage restrictions.
  • The data product must publish its metadata and documentation in a marketplace – in the absence of an existing system, Premium Offices decides to document its first data products in a dedicated space on its company’s wiki.
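
Here is a sketch of how two of these rules – exposing only views, and IAM-based access granted after approval – might be automated with the google-cloud-bigquery client; project, dataset, and user names are illustrative:

```python
from google.cloud import bigquery

client = bigquery.Client(project="po-dp-brokerage-tenancy")  # hypothetical project

# Rule: consumers only ever see views, never the internal model.
client.query("""
CREATE OR REPLACE VIEW `po-dp-brokerage-tenancy.public.tenant_list` AS
SELECT tenant_id, tenant_name, lei  -- LEI as the common entity reference
FROM `po-dp-brokerage-tenancy.internal.dim_tenant`
WHERE is_current
""").result()

# Rule: IAM-based access, granted on the public dataset only, once the
# Data Product Owner has approved the request in ServiceNow.
dataset = client.get_dataset("po-dp-brokerage-tenancy.public")
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id="risk-analyst@premium-offices.example",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```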

This initial set of rules will of course evolve, but it sets a pragmatic framework to ensure the DATSIS characteristics of data products by exclusively leveraging existing technologies and skills. For its pilot, Premium Offices has chosen to decompose the architecture into two data products:

  • Tenancy Analytics – This first data product offers analytical capabilities on lease contracts – entity, parent company, property location, lease start date, lease end date, lease type, rent amount, etc. It is modeled as a small star schema allowing analysis along two dimensions, time and tenant – the analysis dimensions needed to build the first version of the dashboard. It also includes one or two views that leverage the star schema to provide pre-aggregated data – these views constitute the public interface of the data product. Finally, it includes a view to obtain the most recent list of tenants.
  • Entity Ratings – This second data product provides historical ratings of entities in the form of a simple dataset, plus a mirror view serving as its interface, in line with the common rules. The ratings are obtained from a specialized provider, which distributes them through an API. Invoking this API requires a list of entities, obtained by consuming the appropriate interface of the Tenancy Analytics product – a dependency sketched below.
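
The dependency between the two products could be implemented roughly as follows; the ratings provider's API, its URL, and the table layouts are all hypothetical:

```python
import requests
from google.cloud import bigquery

client = bigquery.Client(project="po-dp-brokerage-ratings")  # hypothetical project

# 1. Consume the public interface of Tenancy Analytics to get current tenants.
tenants = [
    row["lei"]
    for row in client.query(
        "SELECT lei FROM `po-dp-brokerage-tenancy.public.tenant_list`"
    ).result()
]

# 2. Call the (hypothetical) ratings provider with the list of entities.
resp = requests.post(
    "https://api.ratings-provider.example/v1/ratings",  # illustrative URL
    json={"entities": tenants},
    timeout=30,
)
resp.raise_for_status()

# 3. Load the ratings into the internal dataset; a mirror view over this
#    table serves as the Entity Ratings public interface.
rows = [
    {"lei": r["lei"], "rating": r["rating"], "rated_at": r["timestamp"]}
    for r in resp.json()["ratings"]
]
errors = client.insert_rows_json("po-dp-brokerage-ratings.internal.entity_ratings", rows)
assert not errors, errors
```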

In conclusion, adopting the mindset of treating data as a product is essential for organizations undergoing data management decentralization. This approach cultivates a culture of accountability, standardization, and efficiency in handling data across different domains. By viewing data as a valuable asset and implementing structured management frameworks, organizations can ensure consistency, reliability, and seamless integration of data throughout their operations.

In our final article, we will go over the fourth and last principle of data mesh: federated computational governance.

The Practical Guide to Data Mesh: Setting up and Supervising an Enterprise-Wide Data Mesh

Written by Guillaume Bodet, our guide was designed to arm you with practical strategies for implementing data mesh in your organization, helping you:

  • Start your data mesh journey with a focused pilot project.
  • Discover efficient methods for scaling up your data mesh.
  • Acknowledge the pivotal role an internal marketplace plays in facilitating the effective consumption of data products.
  • Learn how the Actian Data Intelligence Platform emerges as a robust supervision system, orchestrating an enterprise-wide data mesh.

Get the eBook.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Intelligence

The Journey to Data Mesh – Part 2 – Building a Team & Data Platform

Actian Corporation

April 15, 2024


While the literature on data mesh is extensive, it often describes a final state, rarely how to achieve it in practice. The question then arises:

What approach should be adopted to transform data management and implement a data mesh?

In this series of articles, get an excerpt from our Practical Guide to Data Mesh where we propose an approach to kick off a data mesh journey in your organization, structured around the four principles of data mesh (domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance) and leveraging existing human and technological resources.

Throughout this series of articles, and in order to illustrate this approach for building the foundations of a successful data mesh, we will rely on an example: that of the fictional company Premium Offices – a commercial real estate company whose business involves acquiring properties to lease to businesses.

In the previous article, we discussed the essential prerequisites for defining the scope of your data management decentralization pilot project, by identifying domains and selecting a use case. In this article, we will explain how to establish its development team and data platform.

Building the Pilot Development Team

As mentioned, the first step in our approach is to identify an initial use case and, more importantly, to develop it by implementing the four principles of data mesh with existing resources. Forming the team responsible for developing the pilot project will help implement the first principle of data mesh, domain-oriented decentralized data ownership.

PREMIUM OFFICES EXAMPLE

The data required for the pilot belongs to the Brokerage domain, where the team responsible for developing the pilot will be created. This multidisciplinary team includes:

  • A Data Product Owner
    • Should have both a good understanding of the business and a strong data culture to fulfill the following responsibilities: designing data products and managing their lifecycle, defining and enforcing usage policies, ensuring compliance with internal standards and regulations, and measuring and overseeing the economic performance and compliance of their product portfolio.
  • Two Engineers
    • One from the Brokerage domain teams – bringing knowledge of operational systems and domain software engineering practices, and the other from the data team – familiar with DBT, GCP, and BigQuery.
  • A visualization developer
    • Who can design and build the dashboard.

Domain Tooling: The Data Platform of the Data Mesh

One of the main barriers to decentralization is the risk of multiplying the efforts and skills required to operate pipelines and infrastructures in each domain. But in this regard, there is also a solid state-of-the-art inherited from distributed architectures.

The solution is to structure a team responsible for providing domains with the technological primitives and tools needed to extract, process, store, and serve data from their domain.

This model has existed for several years for application infrastructures and has gradually become generalized and automated through virtualization, containerization, DevOps tools, and cloud platforms. Although data infrastructure tooling is not as mature as software infrastructure, especially in terms of automation, most solutions are transferable, and capabilities are already present in organizations as a result of past investments. Therefore, nothing prevents establishing a data infrastructure team, setting its roadmap, and gradually improving its service offering, with simplification and automation as the main axes of this progression.

The Three Planes of the Data Mesh Platform

The data platform for data mesh covers a wide range of capabilities, broader than infrastructure services. This platform is divided into three planes:

  1. The Data infrastructure provisioning plane – Provides low-level services to allocate the physical resources needed for big data extraction, processing, storage, and distribution (in real time or not), as well as encryption, caching, access control, networking, co-location, etc.
  2. The Data product developer experience plane – Provides the tools needed to develop data products: declaration of data products, continuous build and deployment, testing, quality controls, monitoring, securing, etc. The idea is to provide abstractions above the infrastructure to hide its complexity and automate the conventions adopted on the mesh scale.
  3. The Data mesh supervision plane – Provides a set of global capabilities for discovering data products, lineage, governance, compliance, global reporting, policy control, etc.

On the infrastructure side, the data mesh does not require new capabilities – the vast majority of organizations already have a data platform. Nor does implementing the data mesh require a centralized platform. Some companies have already invested in a common platform, and it seems logical to leverage its capabilities to develop the mesh. But others have several platforms, with some entities or domains running their own infrastructure. It is entirely possible to deploy the data mesh on these hybrid infrastructures: as long as the data products respect common standards for addressability, interoperability, and access control, the technical modalities of their execution matter little.

PREMIUM OFFICES EXAMPLE

Premium Offices has invested in a shared cloud platform – specifically, GCP (Google Cloud Platform). The platform is supported by a central team of experts who understand its intricacies. For its pilot project, Premium Offices simply chose to integrate one of these experts into the project team. This individual will be responsible for finding solutions to automate the deployment of data products as much as possible, and for identifying manual steps that could be automated later, as well as any missing tools.

In conclusion, establishing a dedicated development team is essential for the success of your data management decentralization pilot project. By bringing together individuals with diverse skills and expertise, organizations can effectively implement the principles of data mesh and drive meaningful insights from their data. Moreover, leveraging existing platforms and investing in automation facilitates the development process, paving the way for scalability and long-term success.

In our next article, learn how to execute your data mesh pilot project through the design and development of your first data products.

Data Integration

Actian Data Platform Receives Data Breakthrough Award

Actian Corporation

April 11, 2024


Data integration is a critical capability for any organization looking to connect their data—in an era when there’s more data from more sources than ever before. In fact, data integration is the key to unlocking and sustaining business growth. A modern approach to data integration elevates analytics and enables richer, more contextual insights by bringing together large data sets from new and existing sources.

That’s why you need a data platform that makes integration easy. And the Actian Data Platform does exactly that. It’s why the platform was recently honored with the prestigious “Data Integration Solution of the Year” award from Data Breakthrough. The Data Breakthrough Award program recognizes the top companies, technologies, and products in the global data technology market.

Whether you want to connect data from cloud-based sources or use data that’s on-premises, the integration process should be simple, even for those without advanced coding or data engineering skill sets. Ease of integration allows business analysts, other data users, and data-driven applications to quickly access the data they need, which reduces time to value and promotes a data-driven culture.

Access the Autonomy of Self-Service Data Integration

Being recognized by Data Breakthrough, an independent market intelligence organization, at its 5th annual awards program highlights the platform’s innovative capabilities for data integration and our comprehensive approach to data management. With the platform’s modern API-first integration capabilities, organizations in any industry can connect and leverage data from diverse sources to build a more cohesive and efficient data ecosystem.

The platform provides a unified experience for ingesting, transforming, analyzing, and storing data. It meets the demands of your modern business, whether you operate across cloud, on-premises, or in hybrid environments, while giving you full confidence in your data.

With the platform, you can leverage a self-service data integration solution that addresses multiple use cases without requiring multiple products—one of the benefits that Data Breakthrough called out when giving us the award. The platform makes data easy to use for analysts and others across your organization, allowing you to unlock the full value of your data.

Making Data Integration Easy

Actian Data Platform offers integration as a service while making data integration, data quality, and data preparation easier than you may have ever thought possible. The recently enhanced platform also assists in lowering costs and actively contributes to better decision-making across the business.

The platform is unique in its ability to collect, manage, and analyze data in real time with its transactional database, data integration, data quality, and data warehouse capabilities. It manages data from any public cloud, multi or hybrid cloud, and on-premises environments through a single pane of glass.

All of this innovation will be increasingly needed as more organizations—more than 75% of enterprises by 2025—store their data in data centers across multiple cloud providers and on-premises. Having data in various places requires a strategic investment in data management products that can span multiple locations and bring the data together.

This is another area where the Actian Data Platform delivers value. It lets you connect data from all your sources and from any environment to break through data silos and streamline data workflows, making trusted data more accessible for all users and applications.

Try the Award-Winning Platform With a Guided Experience

Actian Data Platform also enables you to prep your data to ensure it’s ready for AI and helps you use your data to train AI models effectively. The platform can automate time-consuming data preparation tasks, such as aggregating data, handling missing values, and standardizing data from various sources.
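
As a generic illustration of those preparation steps – handling missing values, standardizing formats, aggregating – here is a minimal pandas sketch; the file and column names are invented for the example:

```python
import pandas as pd

df = pd.read_csv("customer_events.csv")  # hypothetical source file

# Handle missing values: fill numeric gaps, drop rows missing the key.
df["purchase_amount"] = df["purchase_amount"].fillna(0.0)
df = df.dropna(subset=["customer_id"])

# Standardize data from various sources: one date format, one text casing.
df["event_date"] = pd.to_datetime(df["event_date"], errors="coerce")
df["country"] = df["country"].str.strip().str.upper()

# Aggregate to the grain the model or analysis actually needs.
monthly = (
    df.groupby(["customer_id", df["event_date"].dt.to_period("M")])
      .agg(total_spend=("purchase_amount", "sum"), events=("event_date", "count"))
      .reset_index()
)
```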

One of our platform’s greatest strengths is its extreme performance. It offers nine times faster speed and 16 times greater cost savings than alternative platforms. We’ve also made recent updates to improve user friendliness. In addition to using pre-built connectors, you can easily connect data and applications using REST- and SOAP-based APIs that can be configured with just a few clicks.

Data Management

Transform Your Approach to Data: How Cloud Migration Drives Modernization

Actian Corporation

April 9, 2024


Whether you’re looking for cloud migration or have already completed the move, you’re probably looking to the cloud as a way to modernize your approach to data and analytics. It’s a common reason to migrate—many organizations view their cloud journey as a prime opportunity to implement new technologies that support their business and IT goals.

Like data-driven companies across all industries, you’re undoubtedly experiencing an explosion in data volumes while the number of data sources is also quickly expanding. That’s why you need the ability to immediately scale to manage large data volumes and handle changing workload demands. The cloud meets these requirements.

If you’re curious about how your peers plan to leverage the cloud for data and analytics, the technologies they’re using, and how they’re choosing cloud vendors, our survey results deliver the insights. The results, gleaned from 450 businesses, are featured in our eBook “The Top Data and Analytics Capabilities Every Modern Business Should Have.” We’ll cover some of the highlights in this blog.

Turn Your Migration into a Modernization Opportunity

As organizations realize the myriad of benefits that come with digital transformation, they want to accelerate their use of digital tools and implement technologies that advance their modernization goals. This naturally leads to company mandates to move data and analytics technologies from on-premises environments to cloud or hybrid infrastructures. Cloud migration enables instant scalability, business agility, more efficient operations, and other benefits.

Some of the top use cases in the cloud, according to our eBook, are:

  • Financial risk management at 48%.
  • Supply chain and inventory optimization at 46%.
  • Product 360 and product analytics at 43%.

The shift to the cloud for use cases and data optimization is not surprising. Organizations like yours need the ability to ingest and manage large data sets and get answers fast—even in real-time—using modern data and analytics capabilities. The cloud supports this. And if you want to optimize on-premises investments, you can leverage a hybrid data platform that spans the cloud and on-prem.

For example, the Actian Data Platform offers ease of use, scalability, and robust data integration and management capabilities across any environment, including hybrid and multi-cloud. As the need for data simplicity grows, with more data sources and analytics use cases constantly emerging, you need a data platform that enables users at all skill levels to access and utilize data without IT bottlenecks. Empowering more business analysts and others with data accelerates the speed and efficiency of business decisions, operations, and data-driven processes.

Make a Deliberate Decision When Migrating

Selecting a cloud provider for your organization involves careful considerations that go beyond the technology. For instance, it’s essential to look at deployment flexibility, support, and cost-effectiveness, among other factors. You also need to be aware of potential challenges, such as vendor lock-in and unexpected costs.

Likewise, the right cloud migration strategy is critical. Lift-and-shift migrations are common because they’re usually fast, but they can have a downside too—they may not leverage the full potential of cloud capabilities, and they can move on-prem problems with data to the cloud. Your business may benefit from a cloud-native approach, which can offer resiliency and cost-effectiveness.

According to our eBook, the top reasons companies transition to the cloud are:

  • 57% want to make managing data privacy, security, and compliance easier.
  • 51% want to improve scalability and performance, and remove capacity constraints.
  • 48% view the cloud as a better option for applications.

The eBook also notes that 72% of companies are using cloud platforms for all new analytics projects. 47% of organizations are taking a lift-and-shift approach to migrating data and analytics capabilities to the cloud, and 40% are evaluating each project independently to decide how to migrate.

Embrace Modern Data Capabilities

Having access to data in real-time is becoming essential for making the most informed decisions, identifying trends, and predicting problems before they happen, among other benefits. A modern cloud data platform like the one from Actian can deliver real-time data to the analysts and applications that need it.

The Actian Data Platform offers real-time data capabilities, superior price performance, and data integration capabilities that make it an ideal choice for businesses. You can use it to support and advance the data processes you need. As highlighted in our eBook, the top cloud technologies organizations are using include:

  • Cloud data security and data privacy at 54%.
  • Cloud data integration and data operations also at 54%.
  • Cloud data quality and data mastering at 52%.
  • Cloud data streaming and real-time analytics at 52%.

To find out more about the data and analytics capabilities your peers want in the cloud, the challenges they’re experiencing, what they look for in a cloud vendor, and other insights, read “The Top Data and Analytics Capabilities Every Modern Business Should Have” to get more value from your cloud journey.

Data Intelligence

The Journey to Data Mesh – Part 1 – Scoping Your Pilot Project

Actian Corporation

April 9, 2024

While the literature on data mesh is extensive, it often describes a final state, rarely how to achieve it in practice. The question then arises:

What approach should be adopted to transform data management and implement a data mesh?

In this series of articles, get an excerpt from our Practical Guide to Data Mesh where we propose an approach to kick off a data mesh journey in your organization, structured around the four principles of data mesh (domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance) and leveraging existing human and technological resources.

Throughout this series of articles, and in order to illustrate this approach for building the foundations of a successful data mesh, we will rely on an example: that of the fictional company Premium Offices – a commercial real estate company whose business involves acquiring properties to lease to businesses.

The initial step to transforming data management and implementing data mesh within your organization involves building a pilot project – an embryo of the mesh. It will be developed based on the four principles of data mesh, using existing resources – that is, without disrupting the organization.

To ensure a successful start to your data management decentralization journey, you must focus on two essential prerequisites: well-defined domains and the choice of an initial use case.

Domain Identification

The primary prerequisite for launching the pilot project is the identification of domains – the federation of autonomous domains being at the core of the data mesh.

This step generally poses no difficulty. Indeed, the concept of domains is already widely understood, and the division into domains is often stable – whether structured according to value chains, major business processes, or organizational operational capabilities. Domains sometimes have their own technical teams and operational systems that generate the majority of the data. The transition often involves reallocating data ownership according to an existing structure.

PREMIUM OFFICES EXAMPLE

Premium Offices is already structured around domains that reflect its major capabilities. Here are three examples of domains:

  • Asset
    • A domain responsible for acquiring and managing real estate assets. It primarily relies on asset management software.
  • Brokerage
    • A domain that manages the commercialization of properties for rent and tenant management. It utilizes Tenant Management software and is responsible for the commercial website and posting offers on specialized marketplaces.
  • Capital Markets
    • A domain responsible for loans to finance purchases and optimize the loan portfolio. It uses other specialized software.

Premium Offices already has a modern data platform, based on DBT, Google BigQuery, and Tableau. It is managed by a centralized team supported by a centralized Data Office.

Organization into domains, based on the main capabilities of Premium Offices

Choosing an Initial Use Case

The choice of a use case for the pilot project is relatively arbitrary – it could involve revamping an existing dashboard, creating a new dashboard, adding AI capabilities to an application, or even commercializing certain data. However, this first use case must possess specific characteristics to facilitate optimal learning conditions:

  • It must focus on usage, not just one or more data products – the intrinsic value of a data product is zero; its value is realized through its uses.
  • It should not be overly cross-cutting and should consume data from one or two domains at most – ideally, just one.
  • It should not be overly simplistic and should consume more than one data product; two or three are sufficient.
  • It should not be overly experimental – the goal is to achieve concrete results quickly.

PREMIUM OFFICES EXAMPLE

For the pilot project, Premium Offices has chosen to build a credit risk dashboard for its tenants to better anticipate and prevent potential defaults. This dashboard must combine tenant data from its software with credit data acquired from a specialized provider. This data is already used operationally in the process of evaluating a new tenant.

In conclusion, initiating a data mesh transformation and launching a pilot project begins with key prerequisites: identifying domains and choosing an initial use case. By defining a scope upfront, organizations can lay a solid foundation for decentralized data management, all without impacting the organization.

In the next article, we delve into the establishment of a development team and a robust data platform to support the data mesh pilot project.

Data Architecture

Strategies for Midsize Enterprises to Overcome Cloud Adoption Challenges

Dee Radh

March 22, 2024


While moving to the cloud is transformative for businesses, the reality is that midsize enterprise CIOs and CDOs must consider a number of challenges associated with cloud adoption. Here are the three most pressing challenges we hear about – and how you can work to solve them.

  • Leveraging existing data infrastructure investments.
  • Closing the technical skills gap.
  • Cloud cost visibility and control.

Recommendations

  • Innovate with secure hybrid cloud solutions.
  • Choose managed services that align with the technical ability of your data team.
  • Maintain cost control with a more streamlined data stack.

Innovate With Secure Hybrid Cloud Solutions for Cloud Adoption

There is no denying that cloud adoption is cheaper in the long run. The elimination of CapEx costs enables CIOs to allocate resources strategically, enhance financial predictability, and align IT spending with business goals. This shift toward OpEx-based models is integral to modernizing IT operations and supporting organizational growth and agility in today’s digital economy.


But migrating all workloads to the cloud in a single step carries inherent risks including potential disruptions. Moreover, companies with strict data sovereignty requirements or regulatory obligations may need to retain certain data on-premises due to legal, security, or privacy considerations. Hybrid cloud mitigates these risks by enabling companies to migrate gradually, validate deployments, and address issues iteratively, without impacting critical business operations. It offers a pragmatic approach for midsize enterprises seeking to migrate to the cloud while leveraging their existing data infrastructure investments.

How Actian Hybrid Data Integration Can Help

The Actian Data Platform combines the benefits of on-premises infrastructure with the scalability and elasticity of the cloud for analytic workloads. Facilitating seamless integration between on-premises data sources and the cloud data warehouse, the platform enables companies to build hybrid cloud data pipelines that span both environments. This integration simplifies data movement, storage and analysis, enabling organizations to extend the lifespan of existing assets and deliver a cohesive, unified and resilient data infrastructure.

Choose Managed Services that Align With the Technical Ability of Your Data Team

Cloud adoption brings an array of new opportunities to the table, but the cloud skills gap remains a problem. High demand means there’s fierce market competition for skilled technical workers. Midsize enterprises across industries and geographies are struggling to hire and retain top talent in the areas of cloud architecture, operations, security, and governance, which in turn severely delays their cloud adoption, migration, and maturity – and risks leaving them behind competitors.


Bridging this skills gap requires strategic investments in HR and Learning and Development (L&D), but the long-term solution has to go beyond simply upskilling employees. One such answer is managed services that are typically low- or no-code, enabling even non-IT users to automate key BI, reporting, and analytic workloads with proper oversight and accountability. Managed solutions are typically designed to handle large volumes of data and scale seamlessly as data volumes grow—perfect for midsize enterprises. They often leverage distributed processing frameworks and cloud infrastructure to ensure high performance and reliability, even with complex data pipelines.

Actian’s Low-Code Solutions

The Actian Data Platform was built for the collaboration and governance midsize enterprises demand. The platform comes with more than 200 fully managed pre-built connectors to popular data sources such as databases, cloud storage, APIs, and applications. These connectors eliminate the need for manual coding to interact with different data systems, speeding up the integration process and reducing the likelihood of errors. The platform also includes built-in tools for data transformation, cleansing, and enrichment. Citizen integrators and business analysts can apply various transformations to the data as it flows through the pipeline, such as filtering, aggregating, and cleansing, ensuring data quality and reliability—all without code.

Maintain Cost Control With a More Streamlined Data Stack

Midsize enterprises are rethinking their data landscape to reduce cloud modernization complexity and drive clear accountability for costs across their technology stack. This complexity arises due to various factors, including the need to refactor legacy applications, integrate with existing on-premises systems, manage hybrid cloud environments, address security and compliance requirements, and ensure minimal disruption to business operations.

Point solutions, while helpful for specific problems, can lead to increased operational overhead, reduced data quality, and potential points of failure, increasing the risk of data breaches and regulatory violations. Although the cost of entry is low, the ongoing support, maintenance, and interoperability cost of these solutions are almost always high.


A successful journey to the cloud requires organizations to adopt a more holistic approach to data management, with a focus on leveraging data across the entire organization’s ecosystem. Data platforms can simplify data infrastructure, enabling organizations to migrate and modernize their data systems faster and more effectively in cloud-native environments, all while reducing licensing costs and streamlining maintenance and support.

How Actian’s Unified Platform Can Help

The Actian Data Platform can unlock the full potential of the cloud and offers several advantages over multiple point solutions with its centralized and unified environment for managing all aspects of the data journey from collection through to analysis. The platform reduces the learning curve for users, enabling them to derive greater value from their data assets while reducing complexity, improving governance, and driving efficiency and cost savings.

Getting Started

Book a demo to see how Actian can accelerate your journey to the cloud in a governed, scalable, and price-performant way.


About Dee Radh

As Senior Director of Product Marketing, Dee Radh heads product marketing for Actian. Prior to that, she held senior PMM roles at Talend and Formstack. Dee has spent 100% of her career bringing technology products to market. Her expertise lies in developing strategic narratives and differentiated positioning for GTM effectiveness. In addition to a post-graduate diploma from the University of Toronto, Dee has obtained certifications from Pragmatic Institute, Product Marketing Alliance, and Reforge. Dee is based out of Toronto, Canada.
Data Management

Cloud Data Migration: Migrate Your Mission-Critical Database

Teresa Wingfield

March 20, 2024


The Path to Successful Cloud Data Migration

Is your company contemplating moving its mission-critical database through cloud data migration? If so, you may have concerns around the cloud’s ability to provide the performance, security, and privacy required to adequately support your database applications. Fortunately, it’s a new day in cloud computing that allows you to migrate to the cloud with confidence! Here are some things to keep in mind that will bring you peace of mind for cloud migration.

Optimized Performance

You may enjoy faster database performance through cloud data migration. Cloud service providers (CSPs) offer varying processing power, memory, and storage capacity options to meet your most demanding workload performance requirements. Frequently accessed data can be stored in high-speed caches closer to users, minimizing latency and improving response times. Load balancers distribute processing across servers within the cloud infrastructure to prevent server overload and bottlenecks. Some CSPs also have sophisticated monitoring tools to track resource usage and identify performance bottlenecks.

Enhanced Security

Data isn’t necessarily more secure in your on-premises data center than in the cloud. This is because CSPs invest heavily in advanced security controls to protect their infrastructure and have deep security expertise. They constantly update and patch their systems, often addressing vulnerabilities faster than on-premises deployments. Some CSPs also offer free vulnerability scanning and penetration testing.

However, it’s important to keep in mind that you are also responsible for security in the cloud. The Shared Responsibility Model (SRM) is a cloud security approach that states that CSPs are responsible for securing their service infrastructure and customers are responsible for securing their data and applications within the cloud environment. This includes tasks such as:

  • Patching and updating software.
  • Properly configuring security settings.
  • Implementing adequate access controls.
  • Managing user accounts and permissions.

Improved Compliance

Organizations with strict data privacy requirements have understandably been reluctant to operate their mission-critical databases with sensitive data in the cloud. But with the right CSP and the right approach, it is possible to implement a compliant cloud strategy. CSPs offer infrastructure and services built to comply with a wide range of global security and compliance standards such as GDPR, PCI DSS, HIPAA, and others, including data sovereignty requirements:

Data Residency Requirements

You can choose among data center locations for where to store your data to meet compliance mandates. Some CSPs can prevent data copies from being moved outside of a location.

Data Transfer Requirements

These include the legal and regulatory rules that oversee how personal data can be moved across different jurisdictions, organizations, or systems. CSPs often offer pre-approved standard contractual clauses (SCCs) and support Binding Corporate Rules (BCRs) to serve compliance purposes for data transfers. Some CSPs let their customers control and monitor their cross-border data transfers.

Sovereign Controls

Some CSPs use hardware-based enclaves to ensure complete data isolation.

Additionally, many CSPs, as well as database vendors, offer features to help customers with compliance requirements to protect sensitive data. These include:

  • Data encryption at rest and in transit protects data from unauthorized access.
  • Access controls enforce who can access and modify personal data.
  • Data masking and anonymization de-identify data while still allowing analysis (see the sketch after this list).
  • Audit logging tracks data access and activity for improved accountability.
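
As an illustration of the masking and anonymization item, here is a minimal sketch using salted hashing, which de-identifies records while keeping them joinable for analysis; the column names and salt handling are deliberately simplified assumptions:

```python
import hashlib
import pandas as pd

SALT = b"example-salt-store-in-a-secret-manager"  # illustrative placeholder

def pseudonymize(value: str) -> str:
    """Deterministic salted hash: same input -> same token, so joins still work."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:16]

patients = pd.DataFrame({
    "email": ["ann@example.com", "bob@example.com"],
    "diagnosis": ["A", "B"],
})

# De-identify before the data leaves the controlled environment.
patients["patient_token"] = patients["email"].map(pseudonymize)
patients = patients.drop(columns=["email"])
```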

Microsoft Cloud for Sovereignty provides additional layers of protection through features like Azure Confidential Computing. This technology utilizes hardware-based enclaves to ensure even Microsoft cannot access customer data in use.

Cloud Data Migration Made Easy

Ingres NeXt delivers low-risk database migration from traditional environments to modern cloud platforms with web and mobile client endpoints. Since no two journeys to the cloud are identical, Actian provides the infrastructure and tooling required to take customers to the cloud regardless of what their planned journey may look like.



About Teresa Wingfield

Teresa Wingfield is Director of Product Marketing at Actian, driving awareness of the Actian Data Platform's integration, management, and analytics capabilities. She brings 20+ years in analytics, security, and cloud solutions marketing at industry leaders such as Cisco, McAfee, and VMware. Teresa focuses on helping customers achieve new levels of innovation and revenue with data. On the Actian blog, Teresa highlights the value of analytics-driven solutions in multiple verticals. Check her posts for real-world transformation stories.
AI & ML

How to Effectively Prepare Your Data for GenAI

Actian Corporation

March 20, 2024


Many organizations are prioritizing the deployment of Generative AI for a number of mission-critical use cases. This isn’t surprising. Everyone seems to be talking about GenAI, with some companies now moving forward with various applications.

While company leaders may be ready to unleash the power of GenAI, their data may not be as ready. That’s because a lack of proper data preparation is setting up many organizations for costly and time-consuming setbacks.

However, when approached correctly, proper data prep can help accelerate and enhance GenAI deployments. That’s why preparing data for GenAI is essential, just as it is for other analytics, to avoid “garbage in, garbage out” and to prevent skewed results.

As Actian shared in our presentation at the recent Gartner Data & Analytics Summit, there are both promises and pitfalls when it comes to GenAI. That’s why you need to be skeptical about the hype and make sure your data is ready to deliver the GenAI results you’re expecting.

Data Prep is Step One

We noted in our recent news release that comprehensive data preparation is the key to ensuring generative AI applications can do their job effectively and deliver trustworthy results. This is supported by the Gartner “Hype Cycle for Artificial Intelligence, 2023” that says, “Quality data is crucial for generative AI to perform well on specific tasks.”

In addition, Gartner explains that “Many enterprises attempt to tackle AI without considering AI-specific data management issues. The importance of data management in AI is often underestimated, so data management solutions are now being adjusted for AI needs.”

A lack of adequately prepared data is certainly not a new issue. For example, 70% of digital transformation projects fail because of hidden challenges that organizations haven’t thought through, according to McKinsey. This is proving true for GenAI too—there are a range of challenges many organizations are not thinking about in their rush to deploy a GenAI solution. One challenge is data quality, which must be addressed before making data available for GenAI use cases.

What a New Survey Reveals About GenAI Readiness

To gain insights into companies’ readiness for GenAI, Actian commissioned research that surveyed 550 organizations in seven countries—70% of respondents were director level or higher. The survey found that GenAI is being increasingly used for mission-critical use cases:

  • 44% of survey respondents are implementing GenAI applications today.
  • 24% are just starting and will be implementing it soon.
  • 30% are in the planning or consideration stage.

The majority of respondents trust GenAI outcomes:

  • 75% say they have a good deal or high degree of trust in the outcomes.
  • 5% say they have little or no trust in them.

It’s important to note that 75% of those who trust GenAI outcomes developed that trust based on their use of other GenAI solutions such as ChatGPT rather than their own deployments. This level of undeserved trust has the potential to lead to problems because users do not fully understand the risk that poor data quality poses to GenAI outcomes in business.

It’s one issue if ChatGPT makes a typo. It’s quite another if business users are turning to GenAI to write code, audit financial reports, create designs for physical products, or deliver after-visit summaries for patients—these high-value use cases have no margin for error. It’s not surprising, therefore, that our survey found that 87% of respondents agree that data prep is very or extremely important to GenAI outcomes.

Use Our Checklist to Ensure Data Readiness

While organizations may have a high degree of confidence in GenAI, the reality is that their data may not be as ready as they think. As Deloitte notes in “The State of Generative AI in the Enterprise,” organizations may become less confident over time as they gain experience with the larger challenges of deploying generative AI at scale. “In other words, the more they know, the more they might realize how much they don’t know,” according to Deloitte.

This could be why only four percent of people in charge of data readiness say they were ready for GenAI, according to Gartner’s “We Shape AI, AI Shapes Us: 2023 IT Symposium/Xpo Keynote Insights.” At Actian, we realize there’s a lot of competitive pressure to implement GenAI now, which can prompt organizations to launch it without thinking through data and approaches carefully.

In our experience at Actian, there are many hidden risks related to navigating and achieving desired outcomes for GenAI. Addressing these risks requires you to:

  • Ensure data quality and cleanliness (see the sketch after this list).
  • Monitor the accuracy of training data and machine learning optimization.
  • Identify shifting data sets along with changing use case and business requirements over time.
  • Map and integrate data from outside sources, and bring in unstructured data.
  • Maintain compliance with privacy laws and security issues.
  • Address the human learning curve.
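
The first item on this list can be started with very little tooling; here is a minimal sketch of automated readiness checks, with the thresholds, file, and column names as assumptions:

```python
import pandas as pd

def readiness_report(df: pd.DataFrame, key: str) -> dict:
    """Hypothetical quick checks to run before feeding data to a GenAI pipeline."""
    return {
        "rows": len(df),
        "duplicate_keys": int(df[key].duplicated().sum()),
        "null_ratio_per_column": df.isna().mean().round(3).to_dict(),
    }

df = pd.read_parquet("training_corpus.parquet")  # hypothetical source
report = readiness_report(df, key="document_id")

# Fail fast instead of training on dirty data ("garbage in, garbage out").
assert report["duplicate_keys"] == 0, report
assert max(report["null_ratio_per_column"].values()) < 0.05, report
```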

Actian can help your organization get your data ready to optimize GenAI outcomes. Our “GenAI Data Readiness Checklist” includes the results of our survey as well as a strategic checklist to get your data prepped. You can also contact us, and our experts will help you find the fastest path to the GenAI deployment that’s right for your business.

Data Intelligence

5 Reasons Why Organizations are Shifting Towards Data Mesh

Actian Corporation

March 19, 2024


Data has become the lifeblood of organizations, driving innovation, efficiency, and competitiveness. Yet, amidst the exponential growth of data, organizations grapple with the challenge of deriving meaningful insights and value from their data assets. The complexities of data management exacerbate this struggle, as traditional approaches – often centralized, monolithic, and anchored in a data lake or warehouse – fall short in addressing the diverse needs and dynamic nature of modern data ecosystems.

Against this backdrop, a new paradigm is emerging: data mesh. This innovative approach to data management represents a departure from traditional centralized models, offering a decentralized framework that empowers organizations to harness the full potential of their data assets.

In this article, we delve into the factors explaining the enthusiasm for data mesh and the decentralization of data management: those related to economic pressure and competitiveness weighing on organizations, and others related to the very birth of data mesh itself.

Reason 1: Economic Pressures

Executive teams are under increasing pressure to justify their investments in data infrastructure and management. Despite substantial resources allocated to these initiatives over the past decade, measuring tangible economic returns remains a significant challenge.

This frustration stems from the inability to correlate data investments with concrete financial outcomes, leading to uncertainty and dissatisfaction among stakeholders.

Reason 2: Competitiveness and the Impact of AI

The fear of losing competitiveness due to the inability to take advantage of the rapidly democratizing opportunities offered by artificial intelligence also creates frustration. Until recently, developing AI models was a long and risky process with uncertain outcomes. The rapid development of highly performant, inexpensive, and easy-to-integrate off-the-shelf models has changed the game entirely.

It’s now possible to prototype an AI application in a few days by adjusting and combining off-the-shelf models. Scaling, however, requires feeding these models with data that is high-quality, traceable, secure, and compliant; in short, well-managed data. This requirement puts additional pressure on centralized data management teams.

Reason 3: Flexibility in Implementation

Other factors are more directly related to the nature of the data mesh itself: it is not an architecture, a language, a method, or even a technology – all of which are often complex, controversial, and divisive subjects.

Data mesh simply lays out a few easy-to-understand principles, and these principles aren’t prescriptive – they can be implemented in a thousand different ways.

Reason 4: Endorsement and Enthusiasm

Data mesh principles are also not purely academic: they transpose to the world of analytical data the practices that allowed large software organizations to master the complexity of their systems while continuing to innovate at a rapid pace. Data mesh rests on strong theoretical and empirical foundations, and the case Dehghani makes for it is hard to resist.

It has the rare quality of easily gaining the support, even enthusiasm, of data teams, including at the decision-making level. This unanimity limits resistance to change, ensures strong sponsorship, and partly explains the speed of its adoption worldwide.

Reason 5: Accessibility and Cost-Effectiveness

The principles of data mesh are easy to implement, without significant investments, simply by reallocating existing resources. When transforming a monolithic software platform into a plethora of loosely coupled and tightly integrated distributed services, one knows that the operation will be lengthy, costly, and risky.

For data, the situation is very different. Data is already, by nature, distributed. And all organizations already have the necessary technologies to extract, process, store, and consume their data in higher-level applications. Implementing the basics of data mesh primarily involves transforming an organization and practices, not making new massive technological investments.

In their eagerness to reform their practices, data leaders have found in data mesh a convincing and accessible framework and have massively included it in their strategic roadmap. It goes without saying, however, that the transition from a centralized data management model to an operational data mesh can only be done gradually – there is no magic wand. And each organization begins this transition in its own context – its strategic challenges, personnel, organization, processes, culture, or even its technological stack.

The Practical Guide to Data Mesh: Setting up and Supervising an Enterprise-Wide Data Mesh

Written by Guillaume Bodet, our guide was designed to arm you with practical strategies for implementing data mesh in your organization, helping you:

  • Start your data mesh journey with a focused pilot project, leveraging an initial use case.
  • Discover efficient methods for scaling up your data mesh, enhancing the creation of data products.
  • Acknowledge the pivotal role an internal marketplace plays in facilitating the effective consumption of data products.
  • Learn how the Actian Data Intelligence Platform emerges as a robust supervision system, orchestrating an enterprise-wide data mesh.

Get the eBook here!


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Integration

Effective Data Integration and Automation in Digital Landscape

Traci Curran

March 18, 2024


In today’s data-driven world, the demand for seamless data integration and automation has never been greater. Various sectors rely heavily on data and applications to drive their operations, making it crucial to have efficient methods of integrating and automating processes. However, ensuring successful implementation requires careful planning and consideration of various factors.

Data integration refers to combining data from different sources and systems into a unified, standardized view. This integration gives organizations a comprehensive and accurate understanding of their data, enabling them to make well-informed decisions. By integrating data from various systems and applications, companies can avoid inconsistencies and fragmentations often arising from siloed data. This, in turn, leads to improved efficiency and productivity across the organization.
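
As a toy illustration of that unified, standardized view, the sketch below renames mismatched columns from two hypothetical extracts, a CRM and a billing system invented for this example, onto a shared schema and joins them with pandas:

```python
import pandas as pd

# Hypothetical extracts from two siloed systems.
crm = pd.DataFrame({"CustID": [1, 2], "Full Name": ["Ada King", "Max Roy"]})
billing = pd.DataFrame({"customer_id": [1, 2], "mrr_usd": [300.0, 120.0]})

# Standardize each source onto a shared schema, then join into one view.
crm = crm.rename(columns={"CustID": "customer_id", "Full Name": "name"})
unified = crm.merge(billing, on="customer_id", how="left")
print(unified)
```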

One of the primary challenges in data integration is the complexity and high cost associated with traditional system integration methods. However, advancements in technology have led to the availability of several solutions aimed at simplifying the integration process. Whether it’s in-house development or leveraging third-party solutions, choosing the right integration approach is crucial for achieving success. IT leaders, application managers, data engineers, and data architects play vital roles in this planning process, ensuring that the chosen integration approach aligns with the organization’s goals and objectives.

Before embarking on an integration project, thorough planning and assessment are essential. Understanding the specific business problems that need to be resolved through integration is paramount. This involves identifying the stakeholders and their requirements and the anticipated benefits of the integration. Evaluating different integration options, opportunities, and limitations is also critical. Infrastructure costs, deployment, maintenance efforts, and the solution’s adaptability to future business needs should be thoroughly considered before deciding on an integration approach.

Five Foundational Areas for Initiating any Data Integration Project

Establishing the Necessity

It is essential to understand the use cases and desired business outcomes to determine the necessity for an integration solution.

Tailoring User Experience

The integration solution should provide a unique user experience tailored to all integration roles and stakeholders.

Understanding Existing Business Systems and Processes

A detailed understanding of the existing business systems, data structures, scalability, dependencies, and regulatory compliance is essential.

Assessing Available Technologies

It is important to assess the available technologies and their potential to meet the organization’s integration needs and objectives.

Data Synchronization Management

Managing data synchronization is an ongoing process that requires careful planning, ownership, management, scheduling, and control.

Effective data integration and automation are crucial for organizations to thrive in today’s data-driven world. With the increasing demand for data and applications, it is imperative to prevent inconsistencies and fragmentation. By understanding the need for integration, addressing foundational areas, and leveraging solutions like Actian, organizations can streamline their data integration processes, make informed decisions, and achieve their business objectives. Embracing the power of data integration and automation will pave the way for future success in the digital age.

A Solution for Seamless Data Integration

Actian offers a suite of solutions to address the challenges associated with integration. Its comprehensive portfolio covers the entire data journey from edge to cloud, ensuring seamless integration across platforms. The Actian platform provides the flexibility to meet diverse business needs, empowering companies to effectively overcome data integration challenges and achieve their business goals. By simplifying how individuals connect, manage, and analyze data, Actian’s data solutions facilitate data-driven decisions that accelerate business growth. The platform integrates seamlessly, performs reliably, and delivers at industry-leading speeds.


About Traci Curran

Traci Curran is Director of Product Marketing at Actian, focusing on the Actian Data Platform. With 20+ years in tech marketing, Traci has led launches at startups and established enterprises like CloudBolt Software. She specializes in communicating how digital transformation and cloud technologies drive competitive advantage. Traci's articles on the Actian blog demonstrate how to leverage the Data Platform for agile innovation. Explore her posts to accelerate your data initiatives.
Data Engineering

The Data Engineering Decision Guide to Data Integration Tools

Dee Radh

March 15, 2024


With organizations using an average of 130 apps, the problem of data fragmentation has become increasingly prevalent. As data production remains high, data engineers need a robust data integration strategy. A crucial part of this strategy is selecting the right data integration tool to unify siloed data.

Assessing Your Data Integration Needs

Before selecting a data integration tool, it’s crucial to understand your organization’s specific needs and data-driven initiatives, whether they involve improving customer experiences, optimizing operations, or generating insights for strategic decisions.

Understand Business Objectives

Begin by gaining a deep understanding of the organization’s business objectives and goals. This will provide context for the data integration requirements and help prioritize efforts accordingly. Collaborate with key stakeholders, including business analysts, data analysts, and decision-makers, to gather their input and requirements. Understand their data needs and use cases, including their specific data management rules, retention policies, and data privacy requirements.

Audit Data Sources

Next, identify all the sources of data within your organization. These may include databases, data lakes, cloud storage, SaaS applications, REST APIs, and even external data providers. Evaluate each data source based on factors such as data volume, data structure (structured, semi-structured, unstructured), data frequency (real-time, batch), data quality, and access methods (API, file transfer, direct database connection). Understanding the diversity of your data sources is essential in choosing a tool that can connect to and extract data from all of them.
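
One lightweight way to begin such an audit is to record each source against these criteria and query the inventory. The sketch below is purely illustrative; the source names, fields, and thresholds are invented:

```python
# A minimal source inventory; the fields mirror the audit criteria above.
sources = [
    {"name": "orders_db", "kind": "database", "structure": "structured",
     "frequency": "batch", "access": "direct connection", "daily_gb": 12},
    {"name": "clickstream", "kind": "REST API", "structure": "semi-structured",
     "frequency": "real-time", "access": "API", "daily_gb": 90},
]

# Flag high-volume, real-time sources that will need streaming-capable connectors.
streaming_heavy = [s["name"] for s in sources
                   if s["frequency"] == "real-time" and s["daily_gb"] > 50]
print(streaming_heavy)  # ['clickstream']
```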

Define Data Volume and Velocity

Consider the volume and velocity of data that your organization deals with. Are you handling terabytes of data per day, or is it just gigabytes? Determine the acceptable data latency for various use cases. Is the data streaming in real-time, or is it batch-oriented? Knowing this will help you select a tool to handle your specific data throughput.

Identify Transformation Requirements

Determine the extent of data transformation logic and preparation required to make the data usable for analytics or reporting. Some data integration tools offer extensive transformation capabilities, while others are more limited. Knowing your transformation needs will help you choose a tool that can provide a comprehensive set of transformation functions to clean, enrich, and structure data as needed.

Consider Integration With Data Warehouse and BI Tools

Consider the data warehouse, data lake, and analytical tools and platforms (e.g., BI tools, data visualization tools) that will consume the integrated data. Ensure that data pipelines are designed to support these tools seamlessly. Data engineers can establish a consistent and standardized way for analysts and line-of-business users to access and analyze data.

Choosing the Right Data Integration Approach

There are different approaches to data integration. Selecting the right one depends on your organization’s needs and existing infrastructure.

Batch vs. Real-Time Data Integration

Consider whether your organization requires batch processing or real-time data integration—they are two distinct approaches to moving and processing data. Batch processing is suitable for scenarios like historical data analysis where immediate insights are not critical and data updates can happen periodically, while real-time integration is essential for applications and use cases like Internet of Things (IoT) that demand up-to-the-minute data insights.
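
The contrast is easy to see in miniature. In this hedged sketch, the same records are processed once per scheduled run in the batch case and one event at a time in the streaming case; the generator merely simulates a queue or Kafka subscription:

```python
import time

def batch_job(records):
    """Batch: process the full extract at once, e.g. on a nightly schedule."""
    return sum(r["amount"] for r in records)

def stream_consumer(event_source):
    """Real-time: handle each event as soon as it arrives."""
    for event in event_source:
        print("processed", event["amount"])

records = [{"amount": 10}, {"amount": 25}]
print(batch_job(records))        # one aggregate result per scheduled run

def fake_stream():               # stand-in for a queue or Kafka subscription
    for r in records:
        time.sleep(0.1)          # events trickle in over time
        yield r

stream_consumer(fake_stream())
```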

On-Premises vs. Cloud Integration

Determine whether your data integration needs are primarily on-premises or in the cloud. On-premises data integration involves managing data and infrastructure within an organization’s own data centers or physical facilities, whereas cloud data integration relies on cloud service providers’ infrastructure to store and process data. Some tools specialize in on-premises data integration, while others are built for the cloud or hybrid environments. Base the choice of tool on factors such as data volume, scalability requirements, cost considerations, and data residency requirements.

Hybrid Integration

Many organizations have a hybrid infrastructure, with data both on-premises and in the cloud. Hybrid integration provides flexibility to scale resources as needed, using cloud resources for scalability while maintaining on-premises infrastructure for specific workloads. In such cases, consider a hybrid data integration and data quality tool like Actian’s DataConnect or the Actian Data Platform to seamlessly bridge both environments and ensure smooth data flow to support a variety of operational and analytical use cases.

Evaluating ETL Tool Features

As you evaluate ETL tools, consider the following features and capabilities:

Data Source and Destination Connectivity and Extensibility

Ensure that the tool can easily connect to your various data sources and destinations, including relational databases, SaaS applications, data warehouses, and data lakes. Native ETL connectors provide direct, seamless access to the latest version of data sources and destinations without the need for custom development. As data volumes grow, native connectors can often scale seamlessly, taking advantage of the underlying infrastructure’s capabilities. This ensures that data pipelines remain performant even with increasing data loads. If you have an outlier data source, look for a vendor that provides an import API, webhooks, or custom source development.

Scalability and Performance

Check if the tool can scale with your organization’s growing data needs. Performance is crucial, especially for large-scale data integration tasks. Inefficient data pipelines with high latency may result in underutilization of computational resources because systems may spend more time waiting for data than processing it. An ETL tool that supports parallel processing can handle large volumes of data efficiently. It can also scale easily to accommodate growing data needs. Data latency is a critical consideration for data engineers, because it directly impacts the timeliness, accuracy, and utility of data for analytics and decision-making.
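
As a rough illustration of the parallel-processing point, this sketch splits a dataset into partitions and transforms them on separate worker processes using only Python’s standard library; the chunk size and the doubling “transformation” are placeholders:

```python
from concurrent.futures import ProcessPoolExecutor

def transform_chunk(chunk):
    """Stand-in for a CPU-bound transformation applied to one partition."""
    return [x * 2 for x in chunk]

if __name__ == "__main__":
    data = list(range(1_000))
    chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

    # Each partition is transformed on its own worker process.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(transform_chunk, chunks))

    flattened = [x for chunk in results for x in chunk]
    print(len(flattened))  # 1000
```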

Data Transformation Capabilities

Evaluate the tool’s data transformation capabilities to handle unique business rules. It should provide the necessary functions for cleaning, enriching, and structuring raw data to make it suitable for analysis, reporting, and other downstream applications. The specific transformations required can include data deduplication, formatting, aggregation, and normalization, depending on the nature of the data, the objectives of the data project, and the tools and technologies used in the data engineering pipeline.
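
Here is a compact pandas sketch of those four transformation types, using invented order data; note that formatting before deduplication is what lets the near-duplicate rows collapse:

```python
import pandas as pd

orders = pd.DataFrame({
    "region": [" east", "East", "west", "west"],
    "amount": [100.0, 100.0, 40.0, 60.0],
})

orders["region"] = orders["region"].str.strip().str.title()           # formatting
orders = orders.drop_duplicates()                                     # deduplication
by_region = orders.groupby("region", as_index=False)["amount"].sum()  # aggregation
by_region["share"] = by_region["amount"] / by_region["amount"].sum()  # normalization
print(by_region)
```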

Data Quality and Validation Capabilities

A robust monitoring and error-handling system is essential for tracking data quality over time. The tool should include data quality checks and validation mechanisms to ensure that incoming data meets predefined quality standards. This safeguards data integrity and directly impacts the accuracy, reliability, and effectiveness of analytic initiatives. High-quality data builds trust in analytical findings among stakeholders; when data is trustworthy, decision-makers are more likely to rely on the insights generated from analytics. Data quality is also an integral part of data governance practices.
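
Predefined quality rules can be as simple as a function that returns a list of violations, as in this illustrative sketch; the column names and rules are invented:

```python
import pandas as pd

def validate(df: pd.DataFrame) -> list:
    """Return human-readable violations of a few predefined quality rules."""
    problems = []
    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values")
    if df["amount"].lt(0).any():
        problems.append("negative amounts")
    if df["email"].isna().any():
        problems.append("missing emails")
    return problems

incoming = pd.DataFrame({
    "order_id": [1, 1, 2],
    "amount": [50.0, -5.0, 20.0],
    "email": ["a@x.com", None, "b@x.com"],
})
print(validate(incoming))  # all three rules fire on this sample
```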

Security and Regulatory Compliance

Ensure that the tool offers robust security features to protect your data during transit and at rest. Features such as SSH tunneling and VPNs provide encrypted communication channels, ensuring the confidentiality and integrity of data during transit. It should also help you comply with data privacy regulations, such as GDPR or HIPAA.
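
For the encrypted-transit point, one common pattern is to route database traffic through an SSH tunnel. The sketch below uses the open-source sshtunnel package purely as an illustration, not as a feature of any particular integration tool; every hostname, port, and credential is a placeholder:

```python
from sshtunnel import SSHTunnelForwarder  # third-party: pip install sshtunnel

# All hostnames, ports, and credentials below are placeholders.
with SSHTunnelForwarder(
    ("bastion.example.com", 22),
    ssh_username="etl_user",
    ssh_pkey="/path/to/private_key",
    remote_bind_address=("warehouse.internal", 5432),
) as tunnel:
    # Database traffic now flows through the encrypted SSH channel;
    # point any client at 127.0.0.1 on tunnel.local_bind_port.
    print("tunnel open on local port", tunnel.local_bind_port)
```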

Ease of Use and Deployment

Consider the tool’s ease of use and deployment. A user-friendly low-code interface can boost productivity, save time, and reduce the learning curve for your team, especially for citizen integrators, who can come from anywhere within the organization. A marketing manager, for example, may want to integrate web traffic, email marketing, ad platform, and customer relationship management (CRM) data into a data warehouse for attribution analysis.

Vendor Support

Assess the level of support, response times, and service-level agreements (SLAs) provided by the vendor. Do they offer comprehensive documentation, training resources, and responsive customer support? Additionally, consider the size and activity of the tool’s user community, which can be a valuable resource for troubleshooting and sharing best practices.

A fully managed hybrid solution like Actian simplifies complex data integration challenges and gives you the flexibility to adapt to evolving data integration needs.

For a comprehensive guide to evaluating and selecting the right Data Integration tool, download the ebook Data Engineering Guide: Nine Steps to Select the Right Data Integration Tool.


About Dee Radh

As Senior Director of Product Marketing, Dee Radh heads product marketing for Actian. Prior to that, she held senior PMM roles at Talend and Formstack. Dee has spent 100% of her career bringing technology products to market. Her expertise lies in developing strategic narratives and differentiated positioning for GTM effectiveness. In addition to a post-graduate diploma from the University of Toronto, Dee has obtained certifications from Pragmatic Institute, Product Marketing Alliance, and Reforge. Dee is based out of Toronto, Canada.
ESG

How to Use Business Intelligence to Support Strategic Sustainability

Actian Corporation

March 13, 2024


In our modern business world, where new trends, demands, and innovation can happen at lightning-fast speed, sustainability has become a top focus for executives and customers. In response, forward-thinking organizations are looking for ways to minimize their global impact, reduce carbon emissions, and implement sustainability best practices to optimize efficiency without sacrificing profitability.

As noted in Harvard Business Review (HBR), consumers are viewing sustainability as a baseline requirement when making purchases. “Our research suggests we’re on the brink of a major shift in consumption patterns, where truly sustainable brands—those that make good on their promises to people and the planet—will seize the advantage from brands that make flimsy claims or that have not invested sufficiently in sustainability.”

One effective approach for strategically supporting and measuring sustainability, including environmental, social, and governance (ESG) efforts, is to use business intelligence (BI). BI is a trusted, powerful, and proven process that transforms data into actionable insights to enable confident and informed decision-making. The insights can be applied to sustainability goals.

The Crucial Role of BI in Improving Sustainability

BI can play a pivotal role in your sustainability efforts by analyzing data related to resource usage, energy consumption, and waste across business operations, supply chains, manufacturing processes, product design and lifecycle, and other areas. BI insights can uncover patterns such as spikes in energy usage, areas most prone to waste, or process inefficiencies that create barriers to achieving sustainability goals.

The power of BI stems from its ability to perform analytics on large volumes of data from a variety of sources. This capability enables you to monitor and report on sustainability efforts while identifying areas where you can improve processes to reduce your environmental impact.

For example, a manufacturer using BI can discover that a specific process run at a certain time of day is causing the company to consume significantly more energy. The process could then be altered to improve efficiency and lower energy usage. Another example is the transportation and logistics industry, where companies can use real-time BI to optimize deliveries based on traffic, weather, and other factors, finding the fastest possible routes and thereby reducing carbon emissions and energy use.
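
To make the manufacturing example concrete, a very simple version of that analysis is an hourly energy profile with a threshold rule. The meter readings and the one-standard-deviation rule below are invented for illustration:

```python
import pandas as pd

# Hypothetical hourly meter readings for one production line over two days.
readings = pd.DataFrame({
    "hour": [8, 9, 10, 11, 8, 9, 10, 11],
    "kwh":  [40, 42, 95, 41, 39, 44, 97, 40],
})

profile = readings.groupby("hour")["kwh"].mean()
threshold = profile.mean() + profile.std()   # deliberately simple spike rule
spikes = profile[profile > threshold]
print(spikes)  # hours whose average draw is well above the norm
```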

When BI is used in conjunction with data visualization tools, the insights are put into charts, graphs, or maps. This makes the insights easy to understand, even for people without a technical or analytical background. You can look at data about your organization’s waste, for example, to find out at a glance where there are opportunities for recycling or waste reduction.
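
A minimal matplotlib sketch of that kind of at-a-glance chart; the sites and waste figures are made up:

```python
import matplotlib.pyplot as plt

sites = ["Plant A", "Plant B", "Depot C"]
waste_tons = [12.4, 7.1, 19.8]

plt.bar(sites, waste_tons)
plt.ylabel("Waste (tons per month)")
plt.title("Monthly Waste by Site")
plt.savefig("waste_by_site.png")  # easy to share with non-technical teams
```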

BI Demands Efficient Data Management Processes

One common challenge many organizations face when leveraging BI for sustainability, or for any other use case, is managing the expansive data sets that are available. Data management is a necessity for any BI or analytics project, but many organizations lack this essential capability. The gap can be due to a lack of scalability, an inability to easily add data pipelines, outdated integration tools that can’t easily ingest and share data, or information stuck in silos. This limits the data that can be used for BI, which in turn can lead to inaccurate or incomplete insights.

That’s why you need modern data processing and BI capabilities. You also need to ensure that your data is accurate, reliable, and trusted in order to have full confidence in the results. A modern data management strategy is required for effective BI. The strategy should equip your organization to handle the volume, variety, and velocity of data and make it accessible and available for BI.

Data management best practices also include cleansing, enriching, and aggregating data to ensure it has the quality you need. You must also determine if your BI requires real-time or near real-time data and if so, have a platform in place to deliver data at the speed you need. Data management, BI, and sustainability are intertwined—data management provides data quality and accessibility, while BI turns the data into strategic insights to inform and refine sustainable strategies.

BI is a Key Enabler for the Future of Sustainability

If your organization is placing an increasing focus on sustainability, you’ll realize the value of BI to help with these efforts. The power of BI and the evolution of BI technology will help you better anticipate resource needs, have the insights needed to proactively minimize your environmental impact, and forecast trends that could affect sustainability. You’ll also have the insights needed to align your business goals with ESG objectives.

BI is a powerful tool in your arsenal to implement and continually improve sustainability practices. With detailed and accurate data analysis, along with the ability to drill down into issues for granular details, you can identify new opportunities to drive efficiencies, make better use of your resources, and take meaningful actions that reduce your environmental footprint.

Moving forward, integrating BI processes into business sustainability strategies will become more common and sophisticated—and more necessary. BI is positioned to play an essential role in enabling data-driven decisions that promote ESG without compromising business performance. In fact, BI can help you strike an acceptable balance that encourages growth for both sustainability and the business.

Managing data and embracing BI are two steps needed to become more sustainable in our increasingly eco-conscious world. Likewise, data and BI can be instrumental in identifying areas that can benefit from increased efficiencies, pinpointing resources that are being underutilized, and determining where sustainability efforts can make the most impact.

Actian Offers the Ideal Platform to Support Sustainability

With consumers, business partners, and business stakeholders placing a strong emphasis on sustainability and ESG responsibility, BI stands out as a proven tool to guide you toward sustainable practices while also boosting the bottom line. Actian can help you integrate and manage your data for BI and analytics.

Our high-performance technologies can bring together large volumes of data for analysis. We can integrate data from various sources, including internet of things (IoT) devices, supply chains, manufacturing processes, and energy metrics for a comprehensive view of your ESG posture.

The scalable Actian Data Platform makes data easy to integrate, manage, and analyze to support your sustainability goals, regardless of the size or complexity of your data sets. You can also use the platform for predictive modeling to determine how proposed process changes will affect sustainability.

At Actian, we’re committed to data-driven sustainability and encourage our customers to also use data to make a positive environmental impact.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.