Data Integration Dos and Don’ts


Many enterprises have deployed some sort of data integration technology within the last 20 years.  While many enterprise insiders believe they have the problem solved, most don’t.  My advice?  Keep a continued focus on what the technology does, and what value it brings to the organization.

Data integration is not something you just drop in and hope for the best.  There needs to be careful planning around its use.  IT is the typical choice to do the planning, select the technology, and run ongoing operations.

However, the need for data integration typically comes from outside of IT.  Those who understand that data should be shared between systems, as needed and when needed, in support of core business processes, are typically the ones crying for more and better data integration technology.  IT responds to those requests reactively.

Now things are changing more quickly than they have in the past, with new impacts on IT as well as end users.  Specifically, these changes include:

  • The use of public cloud resources as a place to host and operate applications and data stores.  This increases the integration challenges for enterprise IT, and requires a new way of thinking about data integration and data integration technology.
  • The rise of big data systems, both in the cloud and on-premises, where the amount of data stored could go beyond a petabyte.  These systems have very specialized data integration requirements, not to mention the need for the data integration solution to scale.
  • The rise of complex and mixed data models.  This includes NoSQL-type databases that typically serve a single purpose.  Moreover, databases are emerging that focus on high performance, and thus need a data integration solution that can keep up.

To support these newer systems, those who leverage data integration approaches and technology have more decisions to make.  Indeed, these can be boiled down to some simple dos and don’ts.

Do create a data integration plan and architecture.  Whether or not you have existing data integration solutions in place, you need to consider your data integration requirements, which typically include lists of source and target data stores, performance, security, governance, data cleansing, etc.  These need to be defined in enough detail that those in both the IT and non-IT organizations can understand and follow the plan.  The plan should also include a logical and physical data integration architecture, as well as a detailed roadmap, so that ambiguity is reduced.
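To make the idea of a plan that both IT and non-IT people can follow a bit more concrete, here is a minimal sketch in Python.  It is only an illustration under my own assumptions: the `Flow` and `IntegrationPlan` structures, the requirement names, and the example data stores (`crm_db`, `web_logs`, `warehouse`) are all hypothetical and do not come from any particular data integration product.  It records each source-to-target flow along with its stated requirements, then reports which flows still lack an answer for performance, security, governance, or data cleansing.

```python
from dataclasses import dataclass, field

# Requirement areas named in the text above; the identifiers are illustrative.
REQUIRED_AREAS = ("performance", "security", "governance", "data_cleansing")


@dataclass
class Flow:
    """One source-to-target data movement in the plan (hypothetical structure)."""
    source: str
    target: str
    requirements: dict = field(default_factory=dict)

    def missing_areas(self):
        # An area counts as missing if it is absent or left blank.
        return [area for area in REQUIRED_AREAS if not self.requirements.get(area)]


@dataclass
class IntegrationPlan:
    """A list of flows plus a simple completeness check."""
    flows: list = field(default_factory=list)

    def gaps(self):
        # Map each incomplete flow to the requirement areas it has not addressed.
        result = {}
        for flow in self.flows:
            missing = flow.missing_areas()
            if missing:
                result[f"{flow.source} -> {flow.target}"] = missing
        return result


# Example plan with made-up data stores and requirement statements.
plan = IntegrationPlan(flows=[
    Flow("crm_db", "warehouse", {
        "performance": "nightly batch, under 2 hours",
        "security": "encrypted in transit and at rest",
        "governance": "owner: sales operations",
        "data_cleansing": "deduplicate customer records",
    }),
    Flow("web_logs", "warehouse", {
        "performance": "hourly micro-batches",
    }),
])

found_gaps = plan.gaps()
for flow_name, missing in found_gaps.items():
    print(f"{flow_name}: missing {', '.join(missing)}")
```

A real plan would carry far more detail (the logical and physical architecture, the roadmap), but even a simple inventory like this makes unaddressed requirements visible before the technology is selected.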

Do allocate enough budget.  In many cases, there are just not enough resources focused on the data integration problem.  If we do develop a plan, the tasks and technology in that plan need to be funded.  Lack of funding typically means data integration efforts die the death of a thousand cuts, and the data integration solutions don’t solve the problems they should solve.  That costs far more than any money you think you’re saving.

Don’t take the technology for granted.  Many enterprises believe that most data integration solutions are the same, and don’t spend the time they should to evaluate and test data integration technology.  Available data integration technology varies a great deal, in terms of function and the problem patterns it can address.  You need to become an expert of sorts in what’s available, what it does, and how it will work and play within your infrastructure to solve your business problems.

Don’t neglect security, governance, and performance.  Many who implement data integration solutions overlook security, governance, and even performance.  They do this for a few reasons.  Typically, they lack an understanding of how these concepts relate to data integration, and/or they lack an adequate budget (see above).  The reality is that these concepts must be baked into the data integration solution, from logical architecture to physical deployment.  If you miss these items, you’ll have to retrofit them down the line.  Retrofitting is almost impossible and certainly costly, and let’s not forget the cost of the risk you’ll incur.

While some of this seems obvious, most of what’s stated here is not followed by enterprise managers when they define, design, and deploy data integration solutions and technology.  The end result is a system that misses some of the core reasons for deploying data integration in the first place, and does not deliver the huge value that this technology can bring.

The good news for most enterprises is that data integration technology continues to improve, and has adapted to emerging infrastructure changes, including the use of cloud, big data, etc.  However, a certain amount of discipline and planning must still occur.

About David Linthicum

Dave Linthicum is the CTO of Cloud Technology Partners, and an internationally known cloud computing and SOA expert. He is a sought-after consultant, speaker, and blogger. In his career, Dave has formed or enhanced many of the ideas behind modern distributed computing including EAI, B2B Application Integration, and SOA, approaches and technologies in wide use today. For the last 10 years, he has focused on the technology and strategies around cloud computing, including working with several cloud computing startups. His industry experience includes tenure as CTO and CEO of several successful software and cloud computing companies, and upper-level management positions in Fortune 500 companies. In addition, he was an associate professor of computer science for eight years, and continues to lecture at major technical colleges and universities, including University of Virginia and Arizona State University. He keynotes at many leading technology conferences, and has several well-read columns and blogs. Linthicum has authored 10 books, including the ground-breaking "Enterprise Application Integration" and "B2B Application Integration."
