

What is data integration?

Data integration brings together multiple disparate data sources into a unified target data warehouse to support business decision-making. A data integration solution typically includes many of the following components:

  • ETL capabilities to Extract, Transform and Load data from multiple source data sets to target data warehouses.
  • ELT (Extract, Load and Transform) technology to transform raw data within a data warehouse.
  • Change Data Capture for detecting changes in source data and enabling replication to target data sets.
  • Workflow process automation.
  • Job scheduling for data flows.
  • Data replication to create and maintain synchronized copies of data.
  • Data deduplication capabilities.
  • Adapters for business data formats and connectivity standards, including EDI, JSON, and ODBC.
  • Streaming data integration for sources such as Apache Kafka.
Illustration of people working on multiple networked devices, including computers and mobile devices, showing the importance of data integration from many sources.
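
As a rough illustration of how Change Data Capture can feed replication, the sketch below compares two snapshots of a source table and replays the differences on a copy. It is a minimal example under stated assumptions: the function names `capture_changes` and `apply_changes` and the sample rows are hypothetical, and production CDC tools commonly read database transaction logs rather than comparing snapshots.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key and return change events."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(target, changes):
    """Replay change events on a target copy to keep it synchronized."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = row
    return target


# Hypothetical sample data: one row updated, one inserted, one deleted.
previous = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
current = {1: {"name": "Ada L."}, 3: {"name": "Cy"}}

changes = capture_changes(previous, current)
replica = apply_changes(dict(previous), changes)
```

After the changes are replayed, `replica` holds the same rows as `current`, which is the goal of replication: only the differences had to be moved.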

How does data integration work?

Data integration provides a holistic approach to populating data warehouses with dependable data. Once a business has decided what data is needed to support its decision-making, data integration tools can be used to identify raw data sources and unload, transform, move and upload that data into the target data warehouse. This is done in a systematic way, so sources are cataloged, data flows are scheduled, and any exceptions are handled.
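
The extract, transform and load steps described above can be sketched in a few lines. This is a minimal example, not a description of any particular product: the CSV content, the `orders` table and the function names are assumptions made for illustration, and an in-memory SQLite database stands in for the target data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data; the second row contains an invalid amount.
RAW_CSV = """order_id,customer,amount
1001, alice ,19.99
1002,BOB,not-a-number
1003,carol,5.00
"""


def extract(text):
    """Read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Standardize types and names; set aside rows that fail conversion."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append((
                int(row["order_id"]),
                row["customer"].strip().title(),
                float(row["amount"]),
            ))
        except ValueError:
            rejected.append(row)  # exception handled instead of loaded
    return clean, rejected


def load(conn, rows):
    """Write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the target warehouse
clean, rejected = transform(extract(RAW_CSV))
load(conn, clean)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

The row with the invalid amount ends up in `rejected` rather than in the target table, which mirrors the idea that exceptions are handled as part of the flow.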

Benefits

The benefits of data integration include:  

  • Scalability and high performance so more data can be delivered faster to enable timely decision making.
  • Data profiling features help ensure the business is using appropriate mechanisms for the data type, data volume and cardinality.
  • For large data volumes, data transformation operations can be parallelized. 
  • Data quality can be assessed and managed.
  • Opportunities for data reuse can be identified to reduce the overall amount of data that needs to be moved. 
  • Data integration services use real-time integration techniques, which complement traditional ETL technologies.
  • Data flows can be scheduled centrally.
  • Data exceptions can be identified and handled before they negatively impact business decisions.
  • Data use can be cataloged to provide data provenance to meet regulatory requirements.
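
To make the data profiling benefit more concrete, the sketch below computes a few basic statistics for one column: row count, missing values, cardinality and observed data types. The function name `profile_column` and the sample values are hypothetical, and treating empty strings as missing is an assumption of this example.

```python
from collections import Counter


def profile_column(values):
    """Return simple profile statistics for a single column of values."""
    # Assumption: None and empty strings both count as missing values.
    non_null = [v for v in values if v is not None and v != ""]
    types = Counter(type(v).__name__ for v in non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "cardinality": len(set(non_null)),  # number of distinct values
        "types": dict(types),
    }


# Hypothetical country-code column with one None and one empty string.
sample = ["DE", "FR", "DE", None, "US", ""]
report = profile_column(sample)
```

A low cardinality relative to the row count, as in this sample, suggests a column that could be treated as a code or lookup value rather than as free text.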

Without data integration, data becomes siloed, and spreadsheet sprawl creates confusion about which data is most reliable, resulting in poor decision-making.

Illustration of a large file cabinet with a cloud and people working in the file cabinet's drawers. Data integration brings sources together so that people can access them easily.

Why is it important?

If data quality is not managed, decisions based on that data suffer, with unintended consequences for the business. Without a formal data integration initiative, a business has no common integration solution, which risks lower data quality and less confident decision-making. Operational costs include wasted data movement, longer development times and the overwhelming task of managing hundreds of ad hoc point-to-point integrations.

Data integration tools

Data integration tools have evolved to support both on-premises and cloud deployment, as well as hub-based integration, where data is staged centrally and consumers subscribe to it. Many open-source and commercial data integration tools are available.
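
The hub-based pattern can be sketched as a small publish and subscribe structure. This is an illustrative model only: the class `IntegrationHub`, its methods and the topic name `customers` are hypothetical and do not correspond to the API of any specific tool.

```python
from collections import defaultdict


class IntegrationHub:
    """Stages records centrally and forwards them to subscribed consumers."""

    def __init__(self):
        self._staged = defaultdict(list)       # topic -> staged records
        self._subscribers = defaultdict(list)  # topic -> consumer callbacks

    def subscribe(self, topic, callback):
        """Register a consumer for a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, record):
        """Stage a record once, then deliver it to every subscriber."""
        self._staged[topic].append(record)
        for callback in self._subscribers[topic]:
            callback(record)

    def staged(self, topic):
        """Return a copy of the records staged for a topic."""
        return list(self._staged[topic])


hub = IntegrationHub()
warehouse, reporting = [], []  # two hypothetical consumers
hub.subscribe("customers", warehouse.append)
hub.subscribe("customers", reporting.append)
hub.publish("customers", {"id": 1, "name": "Ada"})
```

The source publishes each record once, and adding a consumer means adding a subscription rather than building another point-to-point integration.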

The Actian DataConnect Integration Platform provides a powerful Eclipse-based IDE with hundreds of built-in connectors and a universal adapter to create custom interfaces. Strengths include the ability to manage data flows, including scripts written for other vendors' data integration tools, to ease migration.

Data integration vs application integration

Data integration is focused on combining data from multiple sources into a single data warehouse or data stage. Data integration jobs usually run periodically in batches or as streams.

Application integration is designed to orchestrate data flows between applications, serving as middleware between systems. Application integration actions happen immediately as events occur. Applications are mapped using fixed schemas that standardize column datatypes or values. 
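
As a sketch of what mapping applications through a fixed schema might look like, the example below translates an event from one application into the field names, datatypes and values expected by another, and delivers it as soon as the event arrives. All names here (`TARGET_SCHEMA`, `FIELD_MAP`, `VALUE_MAP`, `on_customer_changed`) and the event layout are assumptions made for illustration.

```python
# Hypothetical fixed schema of the receiving application.
TARGET_SCHEMA = {"customer_id": int, "email": str, "active": bool}

# Source field name -> target field name.
FIELD_MAP = {"id": "customer_id", "mail": "email", "status": "active"}

# Standardized values for fields that use codes in the source application.
VALUE_MAP = {"active": {"Y": True, "N": False}}


def map_event(event):
    """Translate one source event into the target schema."""
    out = {}
    for src, dst in FIELD_MAP.items():
        value = event[src]
        if dst in VALUE_MAP:
            value = VALUE_MAP[dst][value]  # raises KeyError on unknown codes
        out[dst] = TARGET_SCHEMA[dst](value)  # enforce the column datatype
    return out


def on_customer_changed(event, deliver):
    """Event handler: map and forward immediately, with no batching."""
    deliver(map_event(event))


received = []  # stand-in for the receiving application
on_customer_changed(
    {"id": "42", "mail": "ada@example.com", "status": "Y"},
    received.append,
)
```

Unknown status codes raise an error here instead of being passed through, which is one way to keep the fixed schema strict.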

The flow of data in data integration goes one way, from sources to an analytics database. Data integration is more straightforward than application integration because you don't need deep knowledge of the connected applications.

Visit our website to learn more about Actian data products and solutions.
