The New Focus on Data Integration


As I make the rounds at conferences, I’m often asked about the state of data integration.  Many believe that data integration is a problem solved long ago, and that it’s pretty much an automatic part of the fabric of IT these days.

Nothing could be further from the truth.  While data integration is much easier than it was 10 years ago, it’s still a complex task that demands capable technology.  Indeed, data integration technology is strategic to the success of the company, and it requires planning and consideration.

The core reasons for leveraging data integration years ago, including the need to exchange information between siloed systems and databases, still apply today.  However, a few new issues are driving many enterprises back to the data integration drawing board, such as:

  • The need to deal with more complex data, including data leveraged for deep analytical services.
  • The need to deal with the emerging use of big data systems, many of which leverage both structured and unstructured information.
  • The rise of NoSQL databases that are typically leveraged for a single purpose, such as data analytics.
  • The rise of public and private cloud-based resource usage that requires the synchronization of systems and databases with traditional IT resources.

These problems lead enterprises to take another look at their existing data integration strategy and technology.  In a few cases, data integration was never leveraged, and data moved from place to place using a hodgepodge of ad hoc approaches that led to ineffective solutions.

The bottom line is that our data environments are becoming more complex and distributed, and this trend will continue for at least the next 10 years.  Enterprises will continue to see the need for data integration solutions, and thus data integration remains a priority in most IT shops that think proactively.

There are a few things to remember as you approach data integration for the second or third time.  Think about the fact that we now live in a low-latency, real-time world, so traditional approaches to data integration, where information is moved daily or weekly, may not cut it anymore.

Real-time data integration requires that you think differently about many things.  Data is no longer transferred and manipulated in large batches.  Instead, data is transformed as it moves from place to place.  Thus, the integration technology you select has to be reliable, and it has to deal with exceptions through a very robust exception management layer.
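To make the idea of an exception management layer a bit more concrete, here is a minimal sketch in Python.  It is not taken from any particular integration product.  The names `integrate`, `to_target_format`, and `IntegrationResult`, along with the record fields, are hypothetical and used only for illustration.  The sketch assumes records are transformed one at a time as they move, and that a record that fails is set aside with its error rather than halting the flow.

```python
from dataclasses import dataclass, field


@dataclass
class IntegrationResult:
    """Outcome of one integration run (hypothetical structure)."""
    delivered: list = field(default_factory=list)
    # Each entry is a (record, error message) pair kept for later review.
    dead_letters: list = field(default_factory=list)


def integrate(records, transform):
    """Transform records one at a time, as they move between systems.

    A record that raises an exception is routed to the dead-letter list
    so that one bad record does not stop the rest of the stream.
    """
    result = IntegrationResult()
    for record in records:
        try:
            result.delivered.append(transform(record))
        except Exception as exc:  # broad on purpose: isolate any failure
            result.dead_letters.append((record, f"{type(exc).__name__}: {exc}"))
    return result


def to_target_format(record):
    """Example transformation (assumed source and target fields)."""
    return {
        "customer_id": int(record["id"]),
        "name": record["name"].strip().title(),
    }


if __name__ == "__main__":
    incoming = [
        {"id": "1", "name": " ada lovelace "},
        {"id": "x", "name": "bad id"},   # fails: id is not a number
        {"name": "missing id"},          # fails: no id field
    ]
    outcome = integrate(incoming, to_target_format)
    print("delivered:", outcome.delivered)
    print("dead letters:", outcome.dead_letters)
```

In a real deployment the dead-letter list would more likely be a durable queue or table, so that failed records can be inspected and replayed once the underlying problem is fixed.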

The volume of data continues to increase as well.  Many enterprises manage data that is well past a petabyte in size, and thus the amount of data that needs to move between systems and databases is increasing.  Data integration engines that are not set up to handle this volume of data will quickly bog down when the load increases.  You need to plan and test, ensuring that the increasing volume won’t stop your data integration solution in its tracks.
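As a rough illustration of what "plan and test" can look like, the Python sketch below measures how many records per second a transformation handles as the synthetic load grows.  The function names, the record layout, and the volumes are assumptions made for this example.  A real test would run against your actual integration engine with representative data.

```python
import time


def measure_throughput(transform, record_count):
    """Return records processed per second for a synthetic load."""
    # Generate records lazily so the test itself does not exhaust memory.
    records = ({"id": str(i), "name": f"customer {i}"} for i in range(record_count))
    start = time.perf_counter()
    processed = sum(1 for _ in map(transform, records))
    # Guard against a zero interval on very small loads.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return processed / elapsed


def sample_transform(record):
    """Stand-in for a real transformation step (hypothetical fields)."""
    return {"customer_id": int(record["id"]), "name": record["name"].title()}


if __name__ == "__main__":
    # Step the volume up and watch whether the rate holds or degrades.
    for volume in (1_000, 10_000, 100_000):
        rate = measure_throughput(sample_transform, volume)
        print(f"{volume:>7} records: {rate:,.0f} records/sec")
```

If the rate drops sharply as the volume steps up, that is an early sign that the solution may bog down under production load.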

The security and governance requirements are also changing.  They need to be built into your data integration solution.  This means that your data integration technology must be able to leverage new security approaches, such as identity management, which are more common in widely distributed environments, including those that make increasing use of cloud computing.

Most service and data governance systems operate outside of your data integration systems, but they need to work well with this technology.  This includes data quality systems, as well as master data management (MDM).

Governance plays an increasingly important role with data integration, as more resources move to public and private clouds.  As data and compute become more distributed, cloud governance systems provide a “single pane of glass” to view and manage cloud-based resources along with traditional systems.  Data integration needs to be systemic to these solutions as well, in that the data needs to move between most of these systems.

The move, or in this case the return, to data integration is healthy for most enterprises.  This technology is strategic to the foundations of enterprise IT.  Those who ignored this fact in the past will finally have a chance to solve the problem.  Those who need a refresher will have a chance to improve.  Either way, the benefit to the business is rather obvious.

About David Linthicum

Dave Linthicum is the CTO of Cloud Technology Partners, and an internationally known cloud computing and SOA expert. He is a sought-after consultant, speaker, and blogger. In his career, Dave has formed or enhanced many of the ideas behind modern distributed computing including EAI, B2B Application Integration, and SOA, approaches and technologies in wide use today. For the last 10 years, he has focused on the technology and strategies around cloud computing, including working with several cloud computing startups. His industry experience includes tenure as CTO and CEO of several successful software and cloud computing companies, and upper-level management positions in Fortune 500 companies. In addition, he was an associate professor of computer science for eight years, and continues to lecture at major technical colleges and universities, including University of Virginia and Arizona State University. He keynotes at many leading technology conferences, and has several well-read columns and blogs. Linthicum has authored 10 books, including the ground-breaking "Enterprise Application Integration" and "B2B Application Integration."
