Blog | Data Integration | 5 min read

How Data Integration Can Help Reduce Technical Debt


Whenever you create, install, or implement a new process, system, or tool in your IT environment, you generate technical debt that must be supported and maintained for the life of the component. The impact of a single connection may not seem significant, but when you consider the thousands (if not millions) of components, connections, applications, interfaces, and installations in your company’s end-to-end IT ecosystem, the technical debt that you must continuously service increases quickly.

With constant budgetary pressure from company leaders, reducing the debt load created by these connections is essential to free up resources for innovation, new business capabilities, and new projects.

Simplification is the Key to Reducing Technical Debt

Some level of technical debt is a necessary and normal part of operating IT systems. You need tools and systems for your business processes to function and to enable your leaders to make informed decisions. Don’t assume that reducing your technical debt to zero is even a possibility.

If you ever achieved that goal, your company would likely be defunct. Instead, it is a good idea to understand technical debt: how it is created, what you must do to manage it, and how you can make better solution decisions to limit the technical debt you carry. Without delving further into technical debt theory, the important concept to understand is that there is a cost to maintain each component and point-to-point connection in your IT environment. Reducing the complexity of your IT systems (simplification) also reduces your technical debt.

The Impact of Data Integration on Technical Debt

Most IT practitioners easily understand the concept of technical debt. IT asset management (ITAM) processes focus on cataloging and managing the objects in your environment and provide mechanisms for calculating and reporting on their total cost of ownership (TCO). ITAM, however, doesn’t do a good job of helping you understand what connections exist between components and how the web of connections and data integrations impact your company’s technical debt.

One service management expert who specializes in this area estimated that data connections account for as much as 40% of the TCO of a typical company’s IT systems. This includes the cost of maintaining and updating connections, resolving incidents when data connections fail, addressing data security breaches related to insecure connections, and updating data connections when an IT system is replaced.

Companies need to consider an easy-to-use integration platform. Creating and maintaining point-to-point connections is expensive, time-consuming, and unsustainable; as connections increase, so does your technical debt. Companies need an easy-to-use iPaaS that serves as a data integration platform and allows you to design, deploy, maintain, and monitor integrations quickly. This delivers lower TCO by reducing technical complexity and technical debt.

Changing Your Data Integration Approach Can Reduce Technical Debt and Save Your Company Money

The cost of maintaining and operating data integrations is directly proportional to the number of your connections and whether they are managed in a central place or in individual systems. Most IT environments and systems that have been operating for more than 10 years manage data integration by establishing point-to-point connections between components in the IT environment. Applications are connected to other applications to enable workflows. Applications are connected to databases to store and manage information. Data warehouses are then connected to many source systems. Even in small IT environments, you are likely managing a complex web of integrations, many of them redundant and with no capability for centralized management.

During the past few years, modern hybrid integration platforms, such as Actian DataConnect, have replaced this point-to-point integration approach. Actian DataConnect operates more like a data integration hub: all your IT systems connect into the integration platform, which manages authentication, workflow, and authorized access, and provides the tools to optimize how data flows throughout your IT environment. DataConnect allows you to connect, collect, transform, and syndicate/distribute data to target systems or systems of engagement. With centralized management, you can reduce the number of individual data integrations by as much as 70% compared to the legacy approach.

Do You Need to Update Your Current Data Connections?

Most organizations are using data integration platforms to transform their IT systems and legacy application infrastructure; however, they are also carrying a technical debt load from the legacy point-to-point connections between systems deployed years ago. Because data integration contributes such a high percentage of a company’s overall IT TCO, and because moving to a modern integration platform can substantially reduce these costs, most organizations have a clear return on investment to justify upgrading legacy connections rather than waiting for the components they serve to be replaced.

In addition to cost savings, centrally managed data connections are more secure, can be updated more frequently, and provide data security audit capabilities unavailable from point-to-point connections. To learn more, visit DataConnect.


Blog | Data Intelligence | 2 min read

What is the Difference Between Metadata and Data?


“Data is content, and metadata is context. Metadata can be much more revealing than data, especially when collected in the aggregate.”

— Bruce Schneier, Data and Goliath

Definitions of Data and Metadata

For most people, the distinction between metadata and data is unclear. Even though both are forms of data, their uses and specifications are completely different.

Data is a collection of observations, measurements, facts, and descriptions of certain things. It gives you the ability to discover patterns and trends in all of an enterprise’s data assets.

On the other hand, metadata, often defined as “data about data,” refers to specific details about this data. It provides granular information about a specific piece of data, such as its file type, format, origin, date, etc.
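This “content versus context” split can be made concrete with a small sketch. All names and values below are hypothetical, purely for illustration:

```python
# Data is the content; metadata is the context that describes it.

data = [21.4, 21.9, 22.1, 21.7]  # the content: temperature readings

metadata = {                      # the context: details about that content
    "file_type": "csv",
    "format": "float, degrees Celsius",
    "origin": "sensor-42",
    "created": "2019-06-01",
    "row_count": len(data),
}

def describe(data, metadata):
    """Combine content and context into a human-readable summary."""
    return (f"{metadata['row_count']} readings from {metadata['origin']} "
            f"({metadata['format']})")

print(describe(data, metadata))
# 4 readings from sensor-42 (float, degrees Celsius)
```

Note that the summary is built almost entirely from the metadata: without it, the four numbers alone carry very little meaning.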

Key Differences Between Data and Metadata

The main difference between Data and Metadata is that data is the content that provides a description, measurement, or even a report on anything relative to an enterprise’s data assets. On the other hand, metadata describes the relevant information on said data, giving it more context for data users.

Data can be processed or unprocessed, such as raw data (numbers, or non-informative characters). The difference with metadata is that it is always considered to be processed information.

Finally, some data is informative, and some may not be. However, metadata is always informative as it references other data.

Why is Metadata Important for Data Management?

When data is created, so is metadata (its origin, format, type, etc.). However, this type of information is not enough to properly manage data in this expanding digital era; data managers must invest time in making sure this business asset is properly named, tagged, stored, and archived in a taxonomy that is consistent with all of the other assets in the enterprise. This is what we call, “metadata management.”

With better metadata management comes better data value. Metadata allows enterprises to ensure greater data quality and discoverability, allowing data teams to better understand their data. Without metadata, enterprises find themselves with datasets without context, and data without context has little value.

This is why having a proper metadata management solution is critical for enterprises dealing with data. By implementing a metadata management platform, data users are able to discover, understand, and trust in their enterprise’s data assets.

Are you looking for a metadata management solution? Contact us!


Blog | Data Management | 5 min read

Actian Datacast: Is Your Enterprise Maximizing the Value of its Data?


Last week, we announced the findings of the Actian Datacast 2019: Hybrid Data Trends Snapshot, a survey of 303 IT professionals with influence or decision-making ability at companies of 250+ employees. Among other trends, the survey found that more than half of enterprises today are unable to efficiently manage or effectively use data to drive decision-making.

In order to delve into each of the four key trends identified in the survey and highlighted in the Datacast Infographic, we are releasing a series of in-depth blog posts, which you can find and follow along with on Data at Work, the official Actian blog. As the second installment of our blog series, this article explores the importance of fully maximizing all the data generated by and available to enterprises to gain valuable insights and the conundrum of unused, stale data.

Maximizing the Value of Data: IT Decision Makers (ITDMs) Said on Average They Are Only Harnessing 54% of the Data Available to Them to Gain Valuable Insights

Modern enterprises are generating data at an unprecedented rate, but aren’t taking advantage of all the data available to them in order to drive real-time, actionable insights. Today, data is constantly in motion – it’s being generated, harnessed, and analyzed in real-time. Gone are the days of data at rest and stagnant data lakes – enterprises need to consume data rather than just store it, and ensure all of the data accessible to them is leveraged. 

Businesses that leverage more of their data sooner and more frequently to generate actionable insights will outpace less agile competitors. Competition between enterprises using the data-driven insights available to them will establish new winners and losers in every business; however, many businesses are throwing away the insights they aren’t able to unlock due to time, money, or resource constraints.

The Survey Found That 84% of Enterprises Would Deploy More Data if it Were Cheaper and Easier to Do, and 50% of These Businesses Say Data Complexity Issues Due to Siloed Applications Are a Top Barrier to Entry for Accessing Data and Gaining Effective Real-time Insights

When businesses take the necessary steps to fully leverage their data, such as implementing modern IT infrastructure, they become agile, competitive and able to provide a superior experience to their customers.

Only 25% of Enterprises With Access to the Data They Need Have the Freshness or Recency of Data They Desire

In addition to fully harnessing and analyzing available data, the speed at which this is done is critical. Enterprises need to pursue a data architecture that enables all their unique data-related ambitions to be processed at the speed of business. Being able to bring analytics capabilities to all the places where their data already lives, and to enjoy the highest levels of query performance across the totality of their data (even hundreds of terabytes), is becoming a data architecture prerequisite.

As AI and machine learning become more actively involved in defining user experience, the lines are blurring between traditionally separate transactional databases and data warehouses used for analytics.

Thus, the role of “real-time” data in the enterprise goes beyond internal reporting and actionable insights and is beginning to shape user experience. User experience innovation has already become the most disruptive force in business history, with many upstart software companies devouring their incumbent competitors. In the near future, many more enterprises will leverage data to differentiate and win with superior customer experience.

Data-driven insights derived from fresh and available data are crucial to execute on this strategy.

Only 34% of Enterprises Using Data to Drive Decision-Making Are Using it to Drive Breakthrough Insights and Innovations vs. Business as Usual Operational Reporting

For many enterprises, data is being used for business-as-usual purposes, not to transform the business or provide competitive advantages, as it has the potential to do. While business-as-usual operations keep enterprises running from day-to-day, limiting data to operational reporting tasks means missing a key piece of the data puzzle – new insights that lead to awareness of products, markets, consumer trends, strategy and more.

Data is being generated in the enterprise that is not being put to good, strategic use, and missing out on these opportunities poses serious and immediate risks.

Gaps in the system mean it takes engineers weeks or even months to bring forth something actionable for a company’s wider team to pursue, rather than the real-time insight needed for the current pace of business.

Maximizing Data For a More Strategic Future

Enterprises are increasingly demonstrating a strategic business need for hybrid data-based insight, enabling a data-driven process to store, access and analyze data wherever the business need is and wherever compliance requirements demand – both on-premises and across multiple clouds.

Enterprises equipped with data management architecture that can deliver these capabilities, and help them access actionable insights from the full set of fresh data available to them in real-time, will be poised to outpace competitors and fully capitalize on their data and on opportunities in the market.

Want to learn more about the trends uncovered in the survey? Check out our infographic mentioned above and linked here.


Blog | Data Integration | 4 min read

How Often Do You Update Your Data Connections?


Modern IT environments have many connections between individual systems. Operational workflows, application-to-database links, data warehouse feeds, analytics/reporting tools, and published APIs are just some examples of these data connections.

They’re required to enable data to flow smoothly throughout your organization, driving business process effectiveness and informed decision-making. After you create a data connection, how often do you update it? For most companies, the honest answers are never, when a security breach happens, or when a system is replaced. That isn’t good enough!

If you want to keep your IT environment secure and running efficiently, and derive the most value from your data, you must have a consistent policy and process for updating and reviewing all of your data connections at least twice per year, and ideally every few months. Frequent audits of your data connections will help you identify where your IT environment is changing (even if your change-control process missed it), understand where and how data is being used throughout your business, and reduce the risk of unauthorized access to sensitive company and customer data.

The following are three key reasons why you must update your data connections regularly.

Keeping Your Data Secure

Data connections create two information security risks. The first is unauthorized access to data. Stored passwords embedded in applications and other IT systems are easy for hackers, disgruntled employees, and others to obtain and use to access the data in your source systems. This can both compromise your data integrity (potential for fraudulent behavior) and increase the risk of data breaches (like those in the news every month).

The credentials from your data connections can also be used to create new connections that may not conform to your company’s security protocols. Updating your data connections and changing credentials frequently makes it more difficult for your data connections to be hijacked and used for unauthorized purposes. If a connection is compromised, then a password change can help close the window of opportunity for the vulnerability to be exploited.

Validating Data Use

Reviewing and updating data connections periodically provides an excellent opportunity to validate and update your understanding of how data is flowing throughout your organization and its use. In addition to updating credentials, your data connection audit should include a review of what data is flowing through the connection, how often and what downstream business processes/activities are dependent on the data. This is important because it helps you understand when data connections have become obsolete and can be retired, consolidated or streamlined. Simplification of your data connections can help data flow faster, reduce the overhead cost of maintaining unused data and ensure your data governance policies are being enforced.

Leveraging New Data When it is Available

Most IT systems are continuously evolving with new versions of software platforms, applications, and analytics systems, causing changes to the data available for other systems and processes to use. Updates to your data connections should include a review of what data is available to downstream systems and informed decisions about what data to expose and/or replicate through each data connection.

For example, a new version of a CRM application may include additional attributes about a customer that could be used to improve marketing analytics and customer-segmentation reporting for product development. A review of your data connections provides an opportunity for you to identify changes to data schemas or content and decide what elements and attributes should be included in your data connections.
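One step of such a review can be sketched as a simple set comparison between the fields a source system now exposes and the fields a connection currently maps. All field names here are hypothetical:

```python
# Surface new attributes worth reviewing, and mappings that may be obsolete,
# by diffing the source's exposed fields against the connection's mapped fields.

exposed = {"customer_id", "name", "email", "segment", "lifetime_value"}
mapped = {"customer_id", "name", "email"}

new_fields = sorted(exposed - mapped)    # candidates to add to the connection
stale_fields = sorted(mapped - exposed)  # mappings that no longer resolve

print(new_fields)    # ['lifetime_value', 'segment']
print(stale_fields)  # []
```

In practice the exposed set would come from the source system’s schema or API description, but the audit logic stays this simple: diff, review, decide.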

Regular review and updates of your data connections can be a time-consuming process if you are managing many point-to-point interfaces between components in your IT environment. Leveraging a hybrid integration platform as a data hub, such as Actian DataConnect, can make this task much easier (and cheaper).

With Actian DataConnect, you’ll be able to manage all the data connections throughout your ecosystem in a centralized place. You can see your connections easily, when they were last updated and what data is flowing, and how often. With good data-stewardship practices and the right set of integration tools, you will be able to keep your data secure and flowing smoothly. To learn more, visit DataConnect.


Blog | Data Analytics | 4 min read

Actian Datacast: Are You Leveraging the Right Data for the Right Decision-Making?


Last week we announced the findings of the Actian Datacast 2019: Hybrid Data Trends Snapshot, sharing insights into the current challenges as well as opportunities for data-driven enterprises around managing hybrid data environments.

Our survey polled 303 IT professionals with influence and/or decision-making ability at companies of 500+ employees.

As the first installment of our blog series exploring the four key trends that emerged from our research, we will be deep-diving into challenges and opportunities around access to data.

Access to Data Is Limited: Half (51%) of End-Users Are Not Getting Data at the Moment They Need it

There’s an influx of data being generated, but half of the enterprises lack the resources to access it and use it in real-time. Our data shows that over 4 in 5 IT decision-makers (ITDMs) say one of the most painful parts of data analytics is how long it takes to deploy, yet businesses that can leverage more of their data sooner and more often for actionable insights outpace less agile competitors.

Almost 3 in 5 ITDMs say they have lots of data and lots of technology, but don’t believe it’s making any difference to their business. Being able to act on data at the moment is paramount to transforming business outcomes and improving the chances of business success.

Over time, data-driven advantages will establish who the key players are in every business category.

It’s imperative for long-term success to pursue the data architecture that can enable all of an enterprise’s unique data-related goals and objectives. This means being able to bring analytics capabilities to any place where a company’s data already lives, whether on-prem or in the cloud.

Organizations should be able to access the highest levels of query and ad-hoc analytics performance across the entirety of their data, and they should be able to do this while easily enforcing any required data privacy and governance policies.

Data That is Available is Not Fresh or Current: Only 26% of End-Users Are Fully Maximizing the Potential Actionable Insights From Their Data

Data is being generated in the enterprise that is not being put to good, strategic use. Gaps in the system mean engineers take weeks or even months to wade through it and bring forth something actionable. Slower decision-making is only one consequence of having to wait for data to become available for analysis, though.

Modern, aspirational analytics use cases, like customer-360 and hyper-personalization, simply don’t work with stale data.

As ML and AI become more actively involved in defining user experience, the lines are blurring between traditionally separate transactional databases and data warehouses when it comes to the need to feed data into algorithms that are making or supporting real-time decisions and automation.

Therefore, the role of “real-time” data in the enterprise goes beyond internal reporting and insights and now begins to shape the customer experience, manufacturing and logistics operations, and hosts of other mission critical use cases.

Data complexity creates a barrier to entry here, though. Over two in five (45%) say the complexity of real-time data and big data present a challenge when looking to harness their data. This is largely due to the time and expense of data processing and preparation inherent in more traditional, batch-mode siloed data collection and warehousing.

Modern analytics for the enterprise are about harnessing all data, from all sources – applications, transactions, CRM and beyond. These need to be harnessed – fused together – under a common framework that can support all the demands of reporting, insight generation, and increasingly predictive analysis and decision support a business may have.

In particular, as the type and depth of insights and predictive support become the focal point, demands stemming from the operationalization of ML, AI, and algorithms within more industries and companies will require fresh hybrid data.

Looking for a Path Forward

Enterprises have long chased the promise of big data and how to leverage it to propel their businesses forward. However, what we’re currently seeing as an outcome of this chase is a lot of companies drowning in data. With the focus zeroing in on getting the most data possible, businesses have become engulfed by the sheer amount of data and are actually getting pulled further away from their data goals and aspirations as a result.

Businesses are looking for a clear path forward around how to collect, analyze, manage and use their data in the most effective ways – stay tuned for parts 2-4 of this series as we take a deeper look at this.

View our infographic.


Blog | Data Analytics | 1 min read

Actian Datacast IT Decision-Maker Survey


We have just announced the findings of the Actian Datacast Hybrid Data Trends Survey which offers new insights into the current challenges and opportunities for data-driven organizations when it comes to managing hybrid data environments.

The survey polled 303 IT professionals who have some influence or decision-making ability and work in a company with at least 250 employees.

The infographic below outlines four key trends that emerged from this research around access to data, maximizing the value of data gathered and compliance.

In the coming weeks, we will be publishing a four-part blog series that explores each trend in greater detail, shares more data from the research and illustrates what this means for the broader industry.

You can find more information about the Actian Datacast Survey in today’s announcement here.


Blog | Data Integration | 4 min read

If ETL is Integration Hell, How Do I Avoid It and Go to Heaven?


Extract, Transform, and Load (ETL) is the process that has been used to share data between applications, transactional systems, and data warehouses for decades. It essentially works like this: you define an integration, pull the data out of the source system, use mapping and aggregation rules to transform the data into the format needed by the target system, and then load (save) the data in the target system’s database.
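The three steps can be sketched in a few lines. This is a minimal illustration with in-memory stand-ins for the source and target systems, not any particular ETL tool:

```python
# A minimal extract/transform/load pipeline over in-memory "systems".

def extract(source):
    """Pull raw rows out of the source system."""
    return list(source)

def transform(rows):
    """Apply mapping/aggregation rules to match the target's format."""
    return [{"name": r["name"].strip().title(), "total": r["qty"] * r["price"]}
            for r in rows]

def load(target, rows):
    """Save the transformed rows into the target system."""
    target.extend(rows)

source = [{"name": " alice ", "qty": 2, "price": 9.5}]
target = []
load(target, transform(extract(source)))
print(target)  # [{'name': 'Alice', 'total': 19.0}]
```

Every real ETL job, however large, is a variation on this shape; the pain described below comes from maintaining thousands of such definitions as systems change.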

ETL Hell

While this process seems simple and intuitive, it has a few problems that are leading many companies to question the sustainability of the practice. For solution and data architects, ETL can quickly become an integration hell.

  1. The need to pre-define what data needs to move between the systems and what transformations need to take place.
  2. Moving more data than you need.
  3. The complexity of tracking data through multiple systems.
  4. The effort/cost of keeping ETL processes up to date as source and target systems change.
  5. The security vulnerabilities exposed during the ETL process itself.

ETL works great in situations where you are defining a system or a set of integrations that will be stable for a long time. That isn’t the reality for most modern business-IT ecosystems. The push for business agility has caused applications and business processes to change rapidly, thereby increasing the cost of integration between applications. This application data integration churn is difficult for ETL solutions to support.

Significantly Reduce Your ETL Burden

The good news for the IT industry is that there are now ways to reduce your use of ETL and help get your staff out of ETL hell. You can do this by relying on three key principles:

  1. If You Can Use the Data Directly From the Source System, Don’t Copy It At All. Many of the system integrations and ETL setups built over the past few decades were developed as workarounds for limited compute capacity and performance in individual applications. Transactional data was moved out of source systems and into data warehouses for reporting so that analytics processes would not slow down transactional workflows. With compute now both fast and cheap, your transactional systems can often handle analytics and new transactions at the same time without a measurable performance impact.
  2. Only Move the Data You Need When You Need to Use It. Transition from pushing data downstream to pulling data at the time of consumption. This not only lowers the amount of data copied among systems, but also ensures that the data your users and business processes consume is as current as possible. When you push data through a system, you create the challenge of keeping the target data up to date with changes in the source system. By pulling data when you need it, any changes have already been applied.
  3. Plan for Change. Where ETL was designed for stability, modern IT environments are designed for agility. That means you need to move from fixed, pre-defined integrations and ETL definitions towards a solution that centralizes your connection management and makes data available across the enterprise. This may be an operational data warehouse, or it may be simply an enterprise data bus. What you are looking for is flexibility and the ability to reconfigure your flow of data whenever business needs or systems change.
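The second principle, pulling data at consumption time rather than pushing copies downstream, can be sketched as follows. The in-memory “system of record” is purely illustrative:

```python
import copy

# Hypothetical in-memory system of record.
source = {"order-1": {"status": "pending"}}

# Push style: a downstream copy made ahead of time; it goes stale
# as soon as the source changes.
downstream_copy = copy.deepcopy(source)

# Pull style: resolve against the source only when the data is consumed.
def current_status(order_id):
    return source[order_id]["status"]

source["order-1"]["status"] = "shipped"  # the source moves on

print(downstream_copy["order-1"]["status"])  # pending  (stale pushed copy)
print(current_status("order-1"))             # shipped  (pull sees the change)
```

The pushed copy requires an ongoing synchronization process to stay correct; the pull has nothing to synchronize, which is exactly the maintenance burden these principles aim to eliminate.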

Moving out of ETL hell and finding a solution that feels more like data heaven starts with developing a more agile mindset about how data flows across your organization. Don’t assume you’ll know in advance what your business will need or assume that the systems you have today will be the systems you have in your IT environment tomorrow. Look for modern data management platforms like Actian that will enable you to manage your connections in a consistent way, aggregate your data for use across the enterprise, and provide the analytics tools to develop the insights you need today and a new set of insights tomorrow.


IoT devices create plenty of data – much more than you might think. When you multiply this amount of data by the number of devices installed in your company’s IT ecosystem, it is apparent IoT is a truly big data challenge.

Two common questions IT staff and data management professionals have been asking during the past few years are: “Does it make sense to push all your IoT data to the cloud?” and “What are the pros and cons of doing so?”

Why Would You Move IoT Data to the Cloud?

There are three primary reasons why companies move their IoT data to the cloud for processing.

  1. Aggregate data from multiple devices in one place.
  2. Leverage cloud-scale compute to process the data.
  3. Enable business analytics and decision-making.

As you might imagine, these reasons are not entirely independent. Part of what makes IoT such an attractive technological component is its simplicity. IoT devices aren’t highly sophisticated, don’t contain much internal storage and typically aren’t capable of complex data processing.

Consequently, they can be less expensive to acquire and deploy and operate with low power consumption. IoT devices work best when coupled with cloud services of some sort where data from the devices can be collected, transferred somewhere for processing and then response signals are sent back to the device. As their name suggests, IoT devices are designed to be networked and interact with other technological components (they aren’t as powerful on their own as they are as part of a system).

Benefits of Moving Your IoT Data to the Cloud

Yes, it’s great that you can move IoT data to the cloud for processing, but it is never a good strategy to do something for the sake of doing it – there should be a purpose. In the case of IoT data, that purpose is to enable decision-making – either strategic or operational.

IoT devices provide a bridge between the digital world and the physical world by capturing information about the environment, so remote consumers can make decisions and initiate actions. Those actions may be as simple as turning a light on or off or part of a complex operational process (such as a manufacturing system).

IoT devices enable users to interact with physical environments remotely. To do this, the data and capabilities of the IoT device must be made available to the remote user. The best method for this in an enterprise environment is to move the data to the cloud, so whoever needs it may access it. Often, IoT devices work as components of a greater system or workflow. Once data is in the cloud, it can be merged with other data, analyzed for meaning and relevance and, in some cases, used to drive automation. There are many business problems IoT can help companies solve.

Drawbacks to Moving Your IoT Data to the Cloud

Using IoT data to solve business problems can create tremendous value for a company in pursuit of digital transformation or enterprise business agility goals. Unfortunately, not all the data IoT devices produce is useful and valuable – some of it is just noise. The two biggest issues with managing IoT data are:

  1. The large volume of data produced.
  2. Sorting the meaningful information from the data clutter.

These two issues are important when evaluating what data to move to the cloud, as you want to avoid cluttering a data warehouse or clogging your infrastructure with the transmission and processing of data you don’t intend to use. A good rule is to focus on determining what data you actually need and plan to use.

Because IoT data is most valuable in real-time, consider what data is needed to support your operational processes or real-time decision-making. Once you understand this, look at the data produced by your IoT devices and see how well it serves your needs. You may find you don’t have the right kinds of IoT devices in your environment or you only need a subset of the data being generated to make meaningful decisions.
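One common way to keep only the subset you need is to filter at the source before uploading. The sketch below assumes a simple, hypothetical rule: forward a reading immediately if it crosses an alert threshold, and otherwise send only a periodic heartbeat sample as a baseline. The threshold and field layout are illustrative, not a real device API:

```python
# Edge-side filtering: decide which readings are worth moving to the cloud.

THRESHOLD = 30.0   # degrees; anything above is worth sending immediately
HEARTBEAT = 10     # also send every 10th reading as a baseline sample

def select_for_upload(readings):
    """Return (index, value) pairs for the readings worth uploading."""
    selected = []
    for i, value in enumerate(readings):
        if value > THRESHOLD or i % HEARTBEAT == 0:
            selected.append((i, value))
    return selected

readings = [22.0] * 25   # mostly steady-state noise...
readings[7] = 31.5       # ...with one reading that matters

print(select_for_upload(readings))
# [(0, 22.0), (7, 31.5), (10, 22.0), (20, 22.0)]
```

Here 25 readings shrink to 4 uploads while the one meaningful event is preserved, which is the trade-off at the heart of the cloud-vs-edge filtering decision.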

Your company’s decision-making and operational needs will determine what IoT data to move to the cloud, how you aggregate data from disparate systems and who uses it within your organization.

The composition of your IT infrastructure, including, for example, security and cost models, may apply some additional constraints to what should be moved to the cloud vs. filtered at the source. If your company uses analytics tools that perform time-series analysis, then you may decide to move all your IoT data to the cloud for analysis and then filter it later.

IoT is positioned to play an expanded role within companies’ IT ecosystems over the next few years, enabling them to move beyond basic digital transformation towards true business agility.

Actian provides a suite of solutions to help you move data from IoT devices into operational data warehouses, with capabilities for processing large-scale data in near-real time and time-series analytics to help you understand the value of your IoT data for your company.


Blog | Data Intelligence | 5 min read

Data Revolutions: Towards a Business Vision of Data


The use of massive data by the internet giants in the 2000s was a wake-up call for enterprises: Big Data is a lever for growth and competitiveness that encourages innovation. Today, enterprises are reorganizing themselves around their data to adopt a “data-driven” approach. It’s a story of several twists and turns that is finally approaching a resolution.

This article discusses the different enterprise data revolutions undertaken in recent years up to now in an attempt to maximize the business value of data.

Siloed Architectures

In the 80s, information systems expanded rapidly. Business applications were created, advanced programming languages emerged, and relational databases appeared. All of these applications remained on their own platforms, isolated from the rest of the IT ecosystem.

For these historical and technological reasons, an enterprise’s internal data was distributed across heterogeneous technologies and formats. Beyond the organizational problems, this created a tribal effect: each IT department had its own tools and implicitly managed its data for its own use, a kind of data hoarding within the organization. Conway’s law is often cited to illustrate this: “All architecture reflects the organization that created it.” This siloed organization makes cross-referencing data from two different systems very complex and costly.

The search for a centralized and comprehensive view of an enterprise’s data would lead information systems to their next revolution.

The Concept of a Data Warehouse

By the end of the 90s, Business Intelligence was in full swing. For analytical purposes, and to answer strategic business questions, the concept of a data warehouse appeared.

To build one, data is recovered from mainframes or relational databases and passed through an ETL (Extract, Transform, Load) tool. Projected into a so-called pivot format, the data is made available to analysts and decision-makers, collected and shaped to answer pre-established questions and specific analytical cases. The data model is derived from the question.
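The extract-transform-load pattern described here can be illustrated with a minimal sketch. All table names, fields, and the report’s grain (revenue per region) are invented for the example; a real ETL tool would read from live sources and write to warehouse tables.

```python
# Minimal ETL sketch (all table and field names are hypothetical):
# extract rows from an operational source, transform them into the
# warehouse's pivot format, and load them into a reporting table.

source_rows = [  # "extract": rows as they exist in the transactional system
    {"order_id": 1, "region": "EU", "amount_cents": 1250},
    {"order_id": 2, "region": "US", "amount_cents": 300},
    {"order_id": 3, "region": "EU", "amount_cents": 700},
]

def transform(rows):
    """Aggregate to the grain the pre-established report asks for:
    revenue per region, in currency units."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount_cents"]
    return [{"region": k, "revenue": v / 100} for k, v in sorted(totals.items())]

warehouse_table = []                    # "load": append into the reporting table
warehouse_table.extend(transform(source_rows))
print(warehouse_table)
# → [{'region': 'EU', 'revenue': 19.5}, {'region': 'US', 'revenue': 3.0}]
```

Note how the output schema is fixed by the question being asked; data that doesn’t serve that question never reaches the warehouse, which is exactly the “from the question, we get a data model” logic that the data lake later reversed.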

This revolution comes with its own problems. ETL tools have a significant cost, not to mention the hardware that accompanies them. And the time elapsed between formalizing a need and receiving the report is long. It is a costly revolution with imperfect efficiency.

The New Revolution of a Data Lake

The arrival of data lakes reversed the previous reasoning. A data lake enables organizations to centralize all potentially useful data, regardless of source or format, at very low cost. An enterprise’s data is stored without presuming how it will be used in future use cases. Only when a specific use arises is the raw data selected and transformed into strategic information.

We are moving from an “a priori” to an “a posteriori” logic. The data lake revolution relies on new skills and knowledge: data scientists and data engineers are able to process the data and produce results much faster than was possible with data warehouses.

Another advantage of this promised land is its price. Often available as open source, data lakes are cheap, including the hardware they run on. We often speak of commodity hardware.

…or Rather, a Data Swamp

The data lake revolution brings real advantages, but also new challenges. The expertise needed to stand up and maintain these data lakes is rare and therefore costly for enterprises. Additionally, pouring data into a data lake day after day without efficient management or organization carries the serious risk of rendering the infrastructure unusable. Data is then inevitably lost in the mass.

This data management also raises new issues related to data regulation (GDPR, CNIL, etc.) and data security, topics that already existed in the data warehouse world. Finding the right data for the right use is still not easy.

The Settlement: Constructing Data Governance

The internet giants understood that centralizing data is the first step, but it is not sufficient. The last brick needed to become truly “data-driven” is data governance. Innovating through data requires deeper knowledge of that data: Where is my data stored? Who uses it? For what purpose? How is it being used?

To help data professionals chart and visualize the data life cycle, new tools have appeared: data catalogs. Sitting above the data infrastructure, they allow you to create a searchable metadata directory. By centralizing all of the collected information, they make it possible to acquire both a business and a technical view of the data. In the same way that Google doesn’t store web pages but rather their metadata in order to reference them, companies must store their data’s metadata to facilitate its discovery and exploitation. Gartner confirms this in their study “Data Catalogs Are the New Black”: a data lake without metadata management and governance will be considered inefficient.
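The core idea of a data catalog (index the metadata, not the data) can be sketched in a few lines. Every entry below is invented for illustration; real catalogs add lineage, ownership workflows, and connectors to the underlying infrastructure.

```python
# Sketch of the data-catalog idea: a searchable directory of metadata
# about each dataset, not the datasets themselves. All entries are invented.

catalog = [
    {"name": "orders_2023", "owner": "sales-ops",
     "location": "lake://raw/orders", "tags": ["orders", "revenue", "gdpr"]},
    {"name": "sensor_telemetry", "owner": "iot-team",
     "location": "lake://raw/telemetry", "tags": ["iot", "time-series"]},
]

def search(catalog, keyword):
    """Return datasets whose name or tags mention the keyword."""
    keyword = keyword.lower()
    return [entry for entry in catalog
            if keyword in entry["name"].lower()
            or any(keyword in tag for tag in entry["tags"])]

for hit in search(catalog, "gdpr"):
    print(hit["name"], "->", hit["location"])
# → orders_2023 -> lake://raw/orders
```

The point of the sketch: answering “where is my data, who owns it, and what is it for?” becomes a metadata query rather than a trawl through the lake itself.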

Thanks to these new tools, data becomes an asset for all employees. An easy-to-use interface that requires no technical skills makes it a simple way to know, organize, and manage data. The data catalog becomes the reference collaborative tool in the enterprise.

Acquiring an all-round view of your data, and using data governance to drive new ideas, thus becomes possible.


Over the past few years, we’ve seen an increasing trend of regional governments applying unique restrictions and controls on where data is stored and how it is managed for users and businesses in their jurisdiction. The EU and Japan have recently imposed strict rules on data export.

A cloud data warehouse can be an effective tool in helping your company remain compliant with regional regulations by keeping data within the region it was created.

Distributed Architecture of the Cloud

Cloud infrastructure is different from on-premises data centers in that it is inherently designed to support distributed systems. This could be either the replication of common capabilities to multiple regions or the localized deployment of specialized capabilities to a specific regional audience. Cloud management platforms include the tools to be able to define regions and effectively direct transactions (and data) to technical resources aligned with your unique business needs.

Separation of Applications and Data Stores

In the past, there was typically a 1:1 relationship between applications that users interact with and the databases where transactional data is stored. If you wanted to keep the data from one region localized in that region, you needed to create a new database and a new version of the application to connect to it. Modern cloud architecture changes that. You can now have a single application that is replicated globally and use routing rules to connect the application to different databases, either based on performance criteria or geopolitical rules.
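The routing rules mentioned above amount to a lookup from a user’s region to an in-region database. The sketch below is hypothetical: the region names and connection strings are invented, and real cloud platforms express this through their own traffic-management and configuration services rather than application code.

```python
# Hypothetical routing table: one globally replicated application,
# region-local databases. Regions and connection strings are invented.

REGION_DATABASES = {
    "eu-west":  "postgres://db.eu-west.example.internal/app",
    "us-east":  "postgres://db.us-east.example.internal/app",
    "ap-japan": "postgres://db.ap-japan.example.internal/app",
}

def database_for(user_region, default="us-east"):
    """Pick the in-region database so a user's transactional data
    stays within their jurisdiction; fall back to a default region
    only when no rule applies."""
    return REGION_DATABASES.get(user_region, REGION_DATABASES[default])

print(database_for("eu-west"))   # EU traffic stays on the EU database
print(database_for("unknown"))   # falls back to the default region
```

Whether the rule keys off geopolitical boundaries (as here) or latency measurements, the application code stays identical everywhere; only the routing table changes.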

The Impact of Data Localization Rules on Data Warehouses

One of the unique challenges that companies have faced when seeking conformance to data localization regulations is the need to not only store transactional data within the local region but also to control data being copied and exported outside of the source jurisdiction. This means that if companies want to perform deep analytics and data mining, they may need to do these activities in data warehouses that are in a specific region. When data warehouses were predominantly hosted on-premises, this was a costly proposition – acquiring a data center, standing up a data warehouse, operating it, and performing analytics within a specific region.

Data Warehouses in the Cloud

The cloud makes deploying localized data warehouses easier and cheaper than on-premises alternatives. The same cloud providers that are hosting localized applications can provision a data warehouse in-region with both the compute and storage capacity for performing business analytics and data mining. While the raw data is often export-controlled, there are often fewer restrictions (if any) on exporting the analytics and insights derived from data to decision-makers at a corporate headquarters in a different region. Companies can now distribute their data warehousing capabilities across the globe the same way they are distributing applications and transactional data stores, while at the same time retaining the ability to do high-performance processing and achieve centralized information insights.
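The distinction between export-controlled raw data and exportable insights can be made concrete with a small sketch. The records and fields are invented for illustration; the point is that the row-level data never leaves the regional warehouse, while the derived aggregate can travel to headquarters.

```python
# Sketch: analytics run in-region on raw (export-controlled) records;
# only the derived aggregate crosses the border. Data is invented.

eu_raw_records = [  # stays inside the EU data warehouse
    {"customer": "a", "spend": 120.0},
    {"customer": "b", "spend": 80.0},
    {"customer": "c", "spend": 100.0},
]

def regional_summary(records):
    """Compute the insight locally; no row-level data leaves the region."""
    total = sum(r["spend"] for r in records)
    return {"customers": len(records),
            "total_spend": total,
            "avg_spend": total / len(records)}

exportable = regional_summary(eu_raw_records)  # aggregate only
print(exportable)
# → {'customers': 3, 'total_spend': 300.0, 'avg_spend': 100.0}
```

Whether a given aggregate is actually exportable depends on the jurisdiction’s rules (sufficiently fine-grained aggregates can still identify individuals), so this boundary belongs in your compliance review, not just your architecture.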

Actian is a leader in hybrid data warehousing that spans on-premises and multiple cloud platforms. The Actian Data Platform provides a full-featured cloud data warehouse solution that can be deployed quickly, either globally or in a targeted region, helping your company achieve superior business value, fast time to market, and sustainable operational costs. To learn more, visit www.actian.com/data-platform.


Most modern companies have fully embraced the cloud as the preferred place to host the applications that their businesses use.  Many new applications are “cloud-native” or are designed specifically for operating in the cloud, while many of the legacy applications that your company uses today may have been designed to run on hardware in your company’s data center, i.e., on-premises.  The good news is that even if you have legacy applications, most software providers are now offering cloud-hosted options for their solutions.

Before you make the move to migrate legacy apps to the cloud, there are a few things you need to consider.  Use the following checklist as a tool to help you understand if cloud migration is right for your app, and if so, how you can increase the success of the migration process.

  1. Does the Cloud App Have All the Functions You Need?
    While most commercially available software packages that were developed for on-premises hosting now have cloud equivalents, the first thing you should look for is functional parity.  To accelerate time to market, some software providers have only implemented a subset of their tool’s functionality in their cloud offerings.  Now is a good time to review not only what features you have in your applications but also how they are being used by your users and business processes.
  2. Can Your Users Access the Cloud Application?
    One of the most significant downsides of cloud apps is that they require connectivity from the end user’s device to the cloud services in order for the app to work.  If your users need to be able to leverage the application when traveling to remote areas where reliable internet connectivity is unavailable, this can be an issue.  Connectivity can also be a challenge in corporate environments with robust network security protocols and firewalls that restrict access to external resources.  Enabling users to access cloud apps may require changes to your network’s access control configurations.
  3. Where is the App Hosted?
    Many people don’t realize that the cloud is an extensive network of data centers run by large service providers.  In this network, applications may be hosted in a specific data center or region, or they may be replicated to operate in data centers across the globe.  It is important to understand where your users are located and use this information to guide your cloud deployment.  If your users are all in one city or region, it may be sufficient to have your app hosted only in that region.  If you have users in many different regions, more of a global footprint may be needed.  What you want to avoid is having your users in one region and your app hosted in another.  Network latency can have a significant impact on your end users’ experience of app performance.
  4. Where is the Data Stored?
    In on-premises applications, the database that the application uses is typically co-located with the web or app server that brokers user transactions.  With cloud implementations, this is not always the case.  You may have application instances distributed around the globe but leverage a centralized database to record transactions.  Depending on the nature of the application, this configuration could cause performance issues if the network latency between the app and its data store is significant.  Some modern architectures afford the ability to store and process data within the application itself (wherever it is deployed).  This is an important issue for your solution architects to investigate.
  5. How Will You Manage Integration?
    Digital business processes often involve the use of multiple applications and many data sources.  Integrating your cloud apps with other systems and data stores, both on-premises and in the cloud, can be complex and difficult to maintain.  Legacy approaches that leverage point-to-point integrations are often not effective in cloud environments.  Consider the use of a hybrid data integration platform such as Actian DataConnect to help you solve this problem.  By managing your connections in one place, you can enable greater flexibility in your deployment strategies.
  6. Do You Need to Replicate Data to Your Cloud Data Warehouse?
    When looking at your data integration needs for a cloud app, consider what (if any) of the application’s data needs to be replicated into your data warehouse for data mining and detailed analytics.  A cloud data warehouse can be a powerful part of your cloud strategy, enabling less data to be retained locally (a key cost driver), improved analytics capabilities, and easier data retention should you decide to replace the cloud app in the future.

As you can see from this checklist, there are some basic things that you need to evaluate to determine whether the migration of your application to the cloud will meet your business needs.  Some considerations can help increase business value once your cloud migration is complete.  The choices you make about how your application data is managed are important to both achieving sustainable value for your business as well as enabling agility for the future.

Actian DataConnect is a hybrid data integration platform designed to help companies manage the connections between applications, platforms, and data sources across the enterprise. By managing your connections in one place, you can lower your operational costs, improve security, and enable business agility. To learn more, visit DataConnect.