Data Management

Actian Developer Tools Available on GitHub

Actian Corporation

April 28, 2016

The Actian technology teams have recently posted a number of technical tools and snippets to the Actian account on GitHub that will be of interest to customers, partners and prospects. We encourage all of you to take a look and make contributions of your own – either to enhance these tools, or to let us know about other tools you have created for yourselves so that we can mention them here. Our intention is to publish new contributions here over time, and to publish future blog entries that go into more detail on some of these tools and contributions.

Examples of the projects you can already find on GitHub include:

  • The Actian Spark Connector for Vector in Hadoop (VectorH) is maintained here.
  • A Vagrant package that will take a downloaded Vector .tgz file and automatically install it into a freshly-built CentOS virtual machine.
  • A Unit Testing framework for OpenROAD.
  • A collection of scripts for testing VectorH alongside other Hadoop data analysis engines, referenced as part of a forthcoming conference paper.
  • A Maven-based template for creating new custom operators in Dataflow, together with a couple of examples that use this template, including a Dataflow JSONpath expression parser and an XML and XPath parser.
  • A utility called MQI, designed to make it easier to run an operating system command across all of the nodes in a VectorH Hadoop cluster (a minimal sketch of the general idea follows this list).
  • A collection of small Vector Tools that do things like calculate the appropriate default number of partitions for a large table, look for data skew within a table, check whether the Vector min/max indexes are sorted (performance is better if your data is sorted on disk, and the min/max indexes will show this), and turn a collection of SQL scripts into a concurrent-user throughput test, complete with stats on overall runtime.
  • A collection of new operators for Dataflow, including one for passing runtime parameters into a Dataflow as a service, a ‘sessionize’ operator to group timestamped data into ‘sessions’, a lead/lag node for handling timestamped data, and various others.
  • A performance benchmark test suite for Actian Vector, based on the DBT3 test data and queries. This project will create test data at a scale factor you choose (defaulting to Scale Factor 1, which is around 1GB of data in total), load that test data into Vector/VectorH, and then execute a series of queries and time the results.
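
As a flavor of what the MQI idea looks like in practice, here is a minimal Python sketch – not the MQI utility itself – that runs one operating system command on every node of a cluster. It assumes passwordless SSH to each host and a hypothetical nodes.txt file listing the hostnames; see the GitHub project for the real tool.

```python
# Hypothetical illustration of the idea behind MQI: run one OS command on
# every node named in a plain-text file, assuming passwordless SSH is set up.
import subprocess
import sys

def run_on_all_nodes(command, nodes_file="nodes.txt"):
    """Run `command` on each host listed in nodes_file and print its output."""
    with open(nodes_file) as f:
        nodes = [line.strip() for line in f if line.strip()]
    for node in nodes:
        print(f"=== {node} ===")
        result = subprocess.run(["ssh", node, command],
                                capture_output=True, text=True, timeout=60)
        print(result.stdout, end="")
        if result.returncode != 0:
            print(f"(exit {result.returncode}) {result.stderr}", file=sys.stderr)

if __name__ == "__main__":
    # e.g. check free disk space on every node of the cluster
    run_on_all_nodes("df -h /")
```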

Please take a look, download, and contribute to extend and enhance them to meet your needs!

Data Management

Hadoop at 10

Actian Corporation

April 27, 2016

Wow – 10 years of Hadoop, what a ride. Actian has been working in the Hadoop ecosystem almost since the beginning, starting in 2007, and we began working with Hortonworks the moment they launched in 2011.

As a pioneer in this space, we have witnessed the whole “boom”. And while Hadoop continues to show healthy growth and is becoming a vital platform for business-critical analytic workloads, the customer mindset is clearly moving beyond the programmer-led early-adopter stage to an early-majority enterprise adoption stage.

With this inevitable maturing of the Hadoop marketplace, we see some painful “shake-out” looming, as these new customers demand the enterprise-class capabilities they have rightly come to expect. And not only will product expectations be increasingly enterprise-class, but vendors in the Hadoop space will be increasingly scrutinized for their own viability, as a proven track record of business success will turn out to be just as important as “cool products”.

This will make 2016 a very interesting year – to borrow a phrase from Warren Buffett, it is only when the tide goes out that you see who is swimming without a suit. Well, this year the tide of easy money and easy product promises is going out, and we may quickly learn who is “exposed”.

Insights

Analytics Cube and Beyond

Actian Corporation

April 14, 2016

Big Data engineering has invented a seemingly endless supply of workarounds for both scalability and performance problems. Ultimately, the approach we take to solving these problems dictates how sustainable the solutions will be. This post compares some modern best practices with pre-processing cube environments.

3 Ways a Hadoop Database Competes With Cube Analytics

  • Software design trumps pre-processing.
  • Data management capability is crucial.
  • Control data explosion and platform complexity.

Cube engines (OLAP) for analyzing data warehouses are nothing new, but applying them in today’s distributed data management architectures is. This post compares high-performance SQL analytics databases with cube analytics in today’s distributed environment.

Enterprise cube platforms help users wade through massive amounts of data easily. However, with the operating system for data analytics moving to a massively parallel computing architecture with distributed data sources, it can be unclear whether cube-based analytics will help your next project.

Hadoop itself gives little help in getting high-performance analytics out of the box. While you may gain transparent access to a wide variety of data, you often sacrifice performance and other critical features such as data management and data expansion control – three key components for mastering the data lake as it grows.

This is why Hadoop adopters likely will not have cube analytics in the future, as “fast” databases bypass the overhead and complexity of maintaining a cube-based system.

Software Design Trumps Pre-Processing

Cube analytics are the bailout for bulging data warehouses or under-equipped analytic databases when they get too big, too heavy and too slow to keep up with increasingly difficult workloads. Why have these databases worked so poorly as data volumes scaled or required advanced hardware? Because legacy approaches to software engineering have limited performance improvements.

As a result, the industry gets excited about merely linear improvements in query performance relative to hardware improvements. Are hardware improvements (à la Moore’s Law) making all our queries that much faster – do we even notice? I bet not. How does that make sense when, for example, chip vendors regularly add new ways to optimize processing?

Actian has taken advantage of several improvements in both hardware and software to power highly optimized analytic environments.

Leveraging Better Hardware

Simply put, most systems do not take advantage of the inherent power of modern computing platforms. Put in faster disks (SSDs), better RAM and improved network connections – naturally things can run faster. Add cube analytics on top of that and you still improve performance, but only relative to legacy systems running on similarly designed architectures.

Modern databases that utilize the latest processor improvements (chip cache, disk management, memory sizes, columnar storage, etc.) all deliver a performance gain over legacy approaches. These gains are better than linear, often exponential, compared with other popular solutions on the market. This is where Actian hangs its hat in the Big Data space (see Actian’s Vector on Hadoop platform).
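
To make the point concrete, here is a tiny Python/NumPy sketch – illustrative only, not Actian code – contrasting row-at-a-time processing with a vectorized operation over contiguous columns; the columnar form lets the processor’s caches and wide execution units do the heavy lifting.

```python
# Toy illustration (not Actian code): one pass over contiguous columns with a
# vectorized call vs. per-row work in an interpreted loop over the same data.
import time
import numpy as np

N = 2_000_000
prices = np.random.rand(N)                 # one "column", stored contiguously
quantities = np.random.randint(1, 10, N)   # another column

# Row-at-a-time style: visit every row individually.
start = time.perf_counter()
total_rows = 0.0
for i in range(N):
    total_rows += prices[i] * quantities[i]
row_time = time.perf_counter() - start

# Columnar, vectorized style: operate on whole columns in one call.
start = time.perf_counter()
total_cols = float(np.dot(prices, quantities))
col_time = time.perf_counter() - start

print(f"row-at-a-time: {row_time:.2f}s   columnar/vectorized: {col_time:.4f}s")
print("same result:", np.isclose(total_rows, total_cols))
```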

We should all come to expect significant improvement between generations of both hardware and software. After all, today’s engineers are better educated than ever before and are using the best hardware ever developed.

If you aren’t leveraging these advantages then you’ve already lost the opportunity for a “moon shot” via big data analytics. You won’t be able to plug the dam with old concrete – you need to pour it afresh.

High performance databases are removing the limits that have strangled legacy databases and data warehouses over the past decades. While cube engines can still process on top of these new analytic database platforms, they are often so fast that they do not need the help. Instead, common Business Intelligence (BI) tools can plug into them directly and maintain excellent query performance.

Data Management Capability is Crucial

Back-end database management capabilities are crucial to any sustainable database. Front-end users merely need SQL access, but DBAs always need tools for modifying tables, optimizing storage, backing up data and cleaning up after one another. This is another area that differentiates next-generation analytical databases from cube engines.

Many tools in the Hadoop ecosystem do one thing well – i.e. read various types of data or run analytical processes. This means they cannot do all the things that an enterprise database usually requires. Cube engines are no exception here – their strength is in summarizing data and building queries against it.

When your data is handled by a cube system you no longer have an enterprise SQL database. Granted, you may have SQL access, but you have likely lost insert, update, and rollback capabilities, among others. These are things you should expect your analytic database to bring to the table – ACID compliance, full SQL compliance, a vector-based columnar approach running natively in Hadoop, along with other data management necessities.

Closely related to data management is the ability to get at raw data from the source. With a fast relational database there is no separation of summary data from detailed records. You are always just one query away from a higher granularity of data – from the same database, from the same table. When the database is updated, all the new data points are readily available for your next query, regardless of whether they’ve been pre-processed or not.
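
As a trivial illustration of that “one query away” point, here is a small sketch using SQLite purely as a stand-in, with a hypothetical sales table – the idea being demonstrated, not Actian syntax:

```python
# Hypothetical sketch (SQLite as a stand-in): the summary and the detail live
# in the same table, so drilling down is just another query -- there is no
# separate cube to rebuild when the data changes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EMEA", "widget", 120.0), ("EMEA", "gadget", 80.0), ("APAC", "widget", 200.0)],
)

# A summary-level question...
for row in conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"):
    print("summary:", row)

# ...and one query later, the detailed records behind it, from the same table.
for row in conn.execute("SELECT product, amount FROM sales WHERE region = 'EMEA'"):
    print("detail: ", row)
```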

Data Management Matters!  

Data changes as new data is ingested but also because users want to modify it, clean it or aggregate it in different ways than before. We all need this flexibility and power to continue leveraging our skills and expertise in data handling.

Control Data Explosion and Platform Complexity

Data explosion is real. The adage that we are continually multiplying the rate at which data grows raises the question: how can we keep that data as manageable as possible?

This is where I have a fundamental issue with cube approaches to analytics. We should strive to avoid tools that duplicate and explode our exponentially growing data volumes even further. Needless to say, we should also not be moving data out of Hadoop as some products do.

Wouldn’t it make more sense to engineer a solution that directly deals with the performance bottlenecks in the software rather than Band-Aid a slow analytic database by pre-processing dimensions that might not even get used in the future?

Unfortunately, cube approaches inherently grow your data volumes. For example, the Kylin project leaders have said that they see “6-10x data expansion” by using cubes. That also assumes adequately trained personnel who can build and clean up cubes over time. It quickly becomes impossible to estimate future storage needs if you cannot be sure how much room your analytics will require.
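
A quick back-of-the-envelope calculation shows why that expansion range matters for capacity planning; the raw-data size used below is purely illustrative.

```python
# Illustrative arithmetic only: what the quoted "6-10x data expansion" means
# for storage planning on a hypothetical 50 TB data lake.
raw_tb = 50
low_factor, high_factor = 6, 10

low_tb, high_tb = raw_tb * low_factor, raw_tb * high_factor
print(f"raw data kept in the lake:   {raw_tb} TB")
print(f"extra storage for the cubes: {low_tb}-{high_tb} TB")
print(f"planning uncertainty alone:  {high_tb - low_tb} TB")
```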

Avoid Juggling Complex Platforms

Many platforms require juggling more technological pieces than a modern analytic database does. Keeping data sources loaded and processed in a database is hard enough, so adding layers on top of it for normalization, cube generation, querying, cube-regeneration and so on makes the system even harder to maintain.

Apache’s Kylin project, as just one example, requires quite a lot of moving pieces: Hive for aggregating the source data, Hive for storing a denormalized copy of the data to be analyzed, HBase for storing the resulting cube data, a query engine on top to make it SQL compliant, etc. You can start to imagine that you might need additional nodes to handle various parts of this design.

That’s a lot of baggage; let’s hope that if you are using it, you really need to!

Consider the alternative, like Actian Vector on Hadoop. You compile your data together from operational sources. You create your queries in SQL. Done. Just because many Hadoop options run slowly does not mean they have to, and we don’t need to engineer more complexity into the platform to make up for it.

With an optimized platform you won’t have to batch up your queries to run in the background to get good performance and you don’t need to worry about resource contention between products in your stack. It’s all one system. Everything from block management to query optimization is within the same underlying system and that’s the way it should be.

SQL Analysts vs. Jugglers

The final thing you should consider is your human resources. People can only be experts at a limited number of things. Not all platforms are easy to manage and support over the lifetime of your investment.

We work with a lot of open source projects, but at the end of the day we know our own product inside and out the best. We can improve and optimize parts of the stack at any level. When you use a system with many sub-components that are developed and managed by different teams, different companies, even different volunteer communities, you sacrifice the ability to leverage the power of a tightly coupled solution. In the long term you will want those solutions to be professionally supported and maintained with your needs in mind.

From a practical standpoint, I have aimed to show how many of the problems that cubes seek to solve are less of an issue when better relational databases are available. Equally, careful consideration is needed when deciding whether the additional overhead of maintaining such a solution is wise. Obviously this varies by situation, but I hope these general comparisons are useful when qualifying technology for a given requirement.

Data Architecture

A Smart Interview With Peter Boncz on Vector Processing

Actian Corporation

September 5, 2014

Known as the Father of Vector Database Processing, Peter Boncz is a Senior Research Scientist at Centrum Wiskunde & Informatica (CWI), Professor at VU University Amsterdam, and Chief Technical Advisor to Actian Corporation, the first company to assemble an end-to-end big data analytics platform that runs natively on Hadoop. He architected two breakthrough database systems, MonetDB and X100 (aka VectorWise or Vector), the latter now available as the world’s highest-performing native SQL-in-Hadoop offering executing in YARN.

In the linked Q&A interview (and in an in-depth podcast), Boncz covers the origins of analytical data processing, its progress through iterations, and the eventual acquisition of Vectorwise by Actian. The conversation continues through use cases and the adaptability of Vector to other distributed computing solutions such as Hadoop.

Introduced in 2010, the columnar, vectorized analytical database continues to dominate in its field. The Actian platform brings industrial-grade SQL to vector technology—Hadoop Distributed File System (HDFS) performance instrumentation, adaptable scale-out, and resource optimization via YARN.

While others claim to offer native SQL in Hadoop support, most have immature, low-quality SQL support and require you to move your data out of Hadoop. Faster than both Hive and Impala, only Actian Vector combines the most modern, scalable, high-performing database technology with the power of Hadoop, thereby enabling users to directly query data stored in the HDFS.

Read Peter Boncz’s very interesting outlook on big data analytics, solution capabilities, and the future of vector technology when associated with Hadoop. Or listen to the podcast. Either way, you will learn how vector processing is a key innovation to accelerating big data analytics.

The interview can be found at: https://www.techtarget.com/

Data Quality

Never Underestimate the Importance of Good Data Quality

Actian Corporation

September 2, 2014

One of the things I talk to organizations about regularly when they’re trying to get their heads around big data and analytics is the importance of data quality before they start. Here, I am reminded of the old database acronym GIGO (garbage in, garbage out), and yet I am astounded by how many companies still skip this most important step.

For example, the other day I was talking to a business professional who could not understand why response rates to campaigns and activities were so low, nor why they couldn’t really use analytics to gain a competitive advantage. A quick investigation of their data and their systems soon showed that a large portion of the data they were using was either out-of-date, badly formatted or just plain erroneous.

Today we are blessed, and perhaps at the same time cursed, by the sheer number of solutions available that promise to make it easy for marketers to get their message out to an audience, and yet many of them do not address the fundamental issue of ensuring good data quality. And while there are also umpteen services that will append records to or automatically populate fields in your database for you, professionals are then dependent on them for good quality, which is not always guaranteed. As I have stated before, data is such a commodity that it is bought, sold and peddled around the world like mad. It can soon become stale or end up in the wrong fields as it is copied and pasted from Excel sheet to Excel sheet. And yet, to turn it into an asset, you need to be able to get good insights and take the right action as a result.

I would also argue that very few businesses run manual data quality checks on their data. And yet they will spend thousands of dollars on maintaining servers and systems that are essentially still full of a large amount of poor data. It’s like owning a fast sports car, washing and polishing it every morning to make it look its best, yet never servicing it, leaving old oil in it and filling it with the cheapest, most underperforming fuel. What’s the point of that?

The reasons why pristine data counts for a lot are numerous. First, you’re not falling at the first hurdle. There’s no sense in crafting the best advertising campaign if the audience you’re sending it to just doesn’t exist or doesn’t see it. Moreover, for those in marketing, good data means potentially great leads for the sales team to follow up on.  Nothing annoys a good salesperson more than non-existent leads or those of poor quality. As a result, conversion rates will go up and you won’t need to keep pumping your database with replacement data.

Second, good data means less time spent hunting around for the right phone number, email or mailing address. How many times have you seen customer records with badly formatted phone numbers or erroneous email addresses? It is also important to recognize that systems set up to accept US mailing addresses or phone numbers must work beyond the borders of North America. Postal codes outside the US are not always numeric and telephone area codes are not always composed of three digits. Having a system that does not truncate field values to fit a US model is therefore imperative.
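
To make that concrete, here is a hedged Python sketch of the kind of check involved. The rules are deliberately loose and illustrative rather than a production validator, but note that nothing in them assumes numeric ZIP codes, three-digit area codes, or a fixed US-style field width.

```python
# Hedged, illustrative sketch of contact-field checks that avoid US-only
# assumptions (numeric ZIP codes, three-digit area codes, fixed field widths).
# These rules are deliberately loose and are not a production-grade validator.
import re

def looks_like_postal_code(code: str) -> bool:
    # Allow letters, digits, spaces and hyphens, e.g. "SW1A 1AA", "75008", "90210-1234".
    return bool(re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9 \-]{1,9}", code.strip()))

def looks_like_phone(number: str) -> bool:
    # Accept an optional "+" country code and 6-15 digits in total, stripping
    # separators (spaces, dots, dashes, parentheses) instead of truncating.
    digits = re.sub(r"[\s().\-]", "", number)
    return bool(re.fullmatch(r"\+?\d{6,15}", digits))

for value in ["SW1A 1AA", "10115", "90210-1234"]:
    print(value, "->", looks_like_postal_code(value))
for value in ["+44 20 7946 0958", "(212) 555-0100", "12345"]:
    print(value, "->", looks_like_phone(value))
```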

But the biggest reason why all companies should exercise good data quality is, in the eyes of many business leaders, the most important one: money. Good data means less money wasted on poor campaigns and less money spent on trying to fix the issue later on. Plus, you’ll spend less money on staff hunting for the right data. And what’s more, good data will lead to better conversion rates more quickly, so you can quickly find out what is working and what isn’t, and then choose to spend your marketing and operational dollars, pounds, euros and yen more wisely on what counts and works first time.

Sure, you might argue that all this is good common sense, but I tell you that large chunks of poor data still exist in many systems out there, in enterprises large and small. It’s time to stop watching bad data rates go up and start actively flushing the bad data out of your systems so you can get the results you need to act on.

There are a few tricks here. First, don’t be afraid to delete bad data. I know many companies don’t like deleting data at all, but what is the point of having your systems full of incorrect information? Second, design systems that capture only the most pertinent information, nothing more. It’s far better to have 15 fields that are all filled out correctly than 150 fields that no one has the time nor the will to complete and that therefore remain empty. And third, learn quickly from your mistakes and adapt behaviours accordingly. There is no point doing the same thing over and over again expecting different results – Einstein once said that was the definition of insanity. And businesses cannot afford to be labelled mad.

So, before you start on your big data journey, before you start joining data together and before you start wanting to analyze data to act on it, it’s time to make sure your data is healthy and kept in good shape. If you want to run your business like a fast sports car, make sure you tune it, service it and give it the best care possible. And that starts with good data quality.

Why not start today and talk to us at Actian – we’ll help you understand your systems and data and how you can get the most value from it in the shortest of timescales.

Data Integration

Emergence of the Chief Data Officer Caffeinates Data Integration

Actian Corporation

July 14, 2014

We often watch new positions sprout up around emerging technology. These days, we see new titles such as the chief cloud officer, and other titles that seem just as trendy. However, the strategic use of data within many enterprises has led those in charge to assign data management responsibility to one person: the chief data officer, or CDO.

I’m not a big fan of creating positions around trends in technology. Back in the day, we had the chief object officer, the chief PC officer, chief web officers, you name it. However, data is not a trend. It’s systemic to what a business is, and thus the focus on managing it better, and centrally, is a positive step.

Adding a CDO to the ranks of IT makes sense. The analyst firm, IDC, predicts the global big data technology and services market will reach $23.8 billion by 2016, while the cloud and cloud services market is expected to see $100 billion invested in 2014. We’ve all seen the explosion of data in enterprises, as the use of big data systems begins to take root, including the ability to finally leverage data for a true strategic business advantage.

The arrival of the CDO has a few advantages for larger enterprises. Appointing a CDO:

  • Sends a clear message to those in IT that data is strategic to corporate leadership, and that they are investing in the proper management and use of that data.
  • Provides a single entity to govern how data is gathered, secured, managed, and analyzed holistically.  The enterprise will no longer lock data up in silos, controlled by various departments in the company.
  • Provides a common approach to data integration. The CDO governs most of the data that needs to be governed, as well as how the data flows from place to place.

The role of the CDO will center on the strategic use of business data. Many enterprises will see this instantiated through projects to put the right technology in place, including emerging big data systems that manage both structured and unstructured data, key analytics systems, and data integration systems to break down enterprise silos.

The use of data analytics is most interesting, considering that data analytics is all about understanding data in the context of other data. When the first generation of data warehouse systems hit the streets many years ago, the focus was on taking operational data, placing it in another database model, and then slicing and dicing the data to cull out the required information.

With traditional approaches to analytics, the data was typically outdated – months or years old in many cases. Moreover, the approach was to analyze the data itself. We could analyze operational trends, such as increasing or decreasing sales, but we would not really understand the reasons for those trends.

Missing was the ability to manage data in the context of other data. An example would be the ability to analyze sales trends in the context of key economic indicators, or the ability to understand the correlation of production efficiency with the average hourly pay of plant workers. These are where the true data analytics answers exist. While they are complex and require true data science to find them, the arrival of the CDO means that the true answers will at least be on the corporate radar.

As these strategic analytics systems rise up within many enterprises, perhaps with the rise of the CDO, so does the focus on data integration. Data integration, like databases themselves, has been around for years and years. As we concentrate more on what the data means in the context of other data, there is a need to bring that data together.

In the past, data integration was considered more a tactical problem, something that was solved in ad-hoc ways using whatever technology seemed to work at the time. These days, considering the value of the strategic use of data, data integration has got to be a key best practice and enabling technology that allows the enterprise to effectively leverage the data.

In other words, where once there was much less energy around the use of data integration approaches and technology, these days, data integration is caffeinated. Perhaps that’s due in part to the arrival of people in the organization with both budget and power, who are now charged with managing the data, such as the CDO.

Of course, reorgs and the creation of new positions don’t solve problems. They just provide the potential to solve problems. With the arrival of the CDO comes a new set of priorities around the use of data.  Data integration has got to be at least number 1 or 2 on the priority list.

Data Integration

Retail Begins to Understand Seamless Integration

Actian Corporation

July 2, 2014

It’s one thing to read about the use of data integration in a technology journal, but another thing altogether to see it in line-of-business publications. That was the case when I read an article from Multi-Channel Merchant, which covered the business reasons for retailers to leverage data integration.

“Modern software tools allow brick and mortar businesses to transition online without introducing a new business that must be managed separately. Seamless data integration and order fulfillment bridge the gap between physical and virtual channels, allowing all orders to be processed from a singular platform that accesses inventory.”

While I’m sure this seems revolutionary to those in retail, it’s actually a concept that dates back many years and includes technology that supports the integration of retail channels. The problem is that those in retail typically did not hear about data integration, or understand its value, and they are just beginning to figure out how much of a game changer it is.

The article notes that “a system architecture that allows for live transactions with a single and centralized database is the primary driver for a successful approach to seamless omnichannel fulfillment.” This configuration allows retailers to leverage and manage inventory across all channels and locations. The result is a simpler, more productive operation, and the ability to leverage economies of scale across each channel.

Key benefits from data integration in this vertical, as covered in this article, include:

  • The ability for the retailer to leverage inventory across all locations.
  • The ability to simplify operations.
  • The ability to leverage economies of scale.
  • Faster order fulfillment.
  • Centralized inventory control.

“This integration provides customers from every channel access to the same inventory, minimizing sellouts and preventing lost sales resulting from backorders and other fulfillment delays. Retailers that eliminate inventory access issues from the sales funnel are better able to position each channel in a way that most heavily capitalizes on its inherent advantage; customers no longer must worry about which channel is most likely to have a product in stock when they all draw from the same inventory.”

The objective of data integration is to create a single logical entity from many entities that may be outside the direct control of the retailer (such as a channel), or to integrate entities within a retail company (such as inventory, sales, fulfillment, etc.). The objective in a channel scenario is to present a single set of items that are for sale and can be directly fulfilled, no matter where the items, or the data about the items, exist. An example would be to tie Costco’s online inventory to its in-store inventory.
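
Conceptually, the channel-integration case boils down to building one logical inventory view over several physical ones. Here is a toy Python sketch with made-up SKUs and quantities – not any particular retailer’s system:

```python
# Illustrative sketch only, with made-up SKUs and quantities: merge per-channel
# inventory feeds into one logical view so that every channel sells against the
# same combined stock.
from collections import Counter

store_inventory = {"sku-100": 4, "sku-200": 0, "sku-300": 7}
online_inventory = {"sku-100": 12, "sku-200": 3}

# One logical inventory built from the separate physical ones.
combined = Counter(store_inventory) + Counter(online_inventory)

for sku, qty in sorted(combined.items()):
    print(f"{sku}: {qty} available across all channels")
# A channel front-end now queries `combined` rather than its own silo, so a
# product out of stock in the store can still be sold from warehouse inventory.
```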

The best examples of well integrated channels are the many innovative retail Web sites out there that are only front-ends for hundreds, sometimes thousands of companies that sell goods. Amazon.com is the most obvious example. As the Web site user, you don’t have a clue, nor do you care, who is actually storing and shipping your product. That is, as long as it shows up on your doorstep ASAP.

In these days of more competition, retailers are becoming very creative about how they manage channels, including cost minimization through the use of distributed inventories and distributed fulfillment. Without a sound data integration strategy and technology, this approach would not be possible.

What I find interesting is that it took the retail vertical such a long time to figure this out. Back in the early days of data integration, it was always confounding to me that those in retail did not take the same or more interest in data integration technology as the other verticals, such as healthcare and finance. Indeed, the benefit is rather obvious, including the ability to increase sales and customer satisfaction.

Now that the data integration cat is out of the bag for those in retail, it’s up to them to figure out their own integration strategy for both their data, and their channels. The good news is, data integration tools are now better and more cost-effective. With the use of cloud-based platforms, even storage and management of the data is much less risky and expensive.

The path to well-integrated retail channels leads directly to more profit. That issue unto itself will drive most retailers to use data integration technology, or perhaps upgrade and rethink existing data integration approaches and mechanisms.

When data integration drives sales, then data integration becomes a priority. That’s certainly the case here.

Data Analytics

Maximize Customer Lifetime Value Using Historical Data

Actian Corporation

November 26, 2010

Acquiring new customers is expensive and it is generally more cost-effective to sell to existing ones. The customer-lifetime-value concept is straightforward, but many companies struggle with determining how to put it into practice.

Do you actually understand your customers’ buying behavior and motivations and what you can do to influence them? Is this information being channeled into your marketing and product development efforts to help you design and position your products and services that will best serve customers’ needs?

Turn Your Data into Actionable Insights

By measuring and maximizing current and forecasted customer value across products, segments and time periods, you will be better able to design new programs that accentuate your best customers and provide you with a distinct business advantage.

With Actian Vector, you can connect all your data, from account histories and demographics to mobile and social media interactions, and merge these disparate sources with speed and accuracy. You already have a wealth of data that can reveal insights into your customers’ motivations, all you need to do is put the puzzle together.

Use this information to uncover key purchase drivers and understand why someone purchases or rejects your products. Assign customer value scores by correlating which characteristics and behaviors lead to value at various points in time in the future. Optimize outbound marketing to give prominence to your high-value customers.

Customize inbound customer touchpoints by arming call centers with highly personalized customer data. This will all lead you to increase customer lifetime value, improving both customer loyalty and profitability.

Predicting Future Buying Behavior

Historical data won’t give you a crystal ball to peer into the future. Many factors can influence an individual customer’s buying behavior, and it is impossible to capture and analyze all of them. That doesn’t mean, however, that each transaction or customer interaction is unique and independent. Individual customers have preferences, buying behaviors and social influences, and they use various types of environmental cues to determine what they will (and won’t) purchase.

Because humans are creatures of habit, past actions (often visible in historical data) are strong indicators of how customers will behave in the future. Similarly, different customers often demonstrate common behaviors and buying patterns when influenced by similar forces.

By analyzing the historical data of both individual customers and groups of similar customers, you can develop more accurate customer profiles, conduct micro-segmentation, identify sources of influence and model the actions your company can take not only to understand, but also to change customers’ buying behavior.
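
As a minimal sketch of what “analyzing the historical data of individual customers” can look like in practice, here is a toy Python example that builds a crude recency/frequency/monetary profile from hypothetical transactions – the sort of raw material micro-segmentation starts from, not Actian Vector code.

```python
# Minimal sketch with hypothetical data: build a crude recency/frequency/monetary
# profile per customer from historical transactions, the kind of raw material
# that micro-segmentation and customer value scoring start from.
from datetime import date

transactions = [
    ("alice", date(2010, 11, 1), 120.0),
    ("alice", date(2010, 11, 20), 80.0),
    ("bob",   date(2010, 6, 15), 300.0),
]
today = date(2010, 11, 26)

profiles = {}
for customer, when, amount in transactions:
    p = profiles.setdefault(customer, {"last": when, "orders": 0, "spend": 0.0})
    p["last"] = max(p["last"], when)
    p["orders"] += 1
    p["spend"] += amount

for customer, p in profiles.items():
    recency_days = (today - p["last"]).days
    print(f"{customer}: last order {recency_days} days ago, "
          f"{p['orders']} orders, {p['spend']:.2f} total spend")
```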

Obtain Better Insights With More Diverse Data

One of the biggest challenges companies face when analyzing customer data is leveraging data from diverse data sources. Using only one or a few data sources may provide you with a few points of view, but it won’t reveal the holistic perspective of your customers that you need to be truly successful.

Integrating more (and more diverse) data into your analysis will help you eliminate blind spots, improve the accuracy of your findings and improve the results of your marketing efforts. Actian DataConnect is a platform that can help you gather your data sources, and the powerful analytics engine of Actian Vector can then distill them into actionable insights. To learn more, visit www.actian.com/vector.

About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale, streamlining complex data environments and accelerating the delivery of AI-ready data. The Actian data intelligence approach combines data discovery, metadata management, and federated governance to enable smarter data usage and enhance compliance. With intuitive self-service capabilities, business and technical users can find, understand, and trust data assets across cloud, hybrid, and on-premises environments. Actian delivers flexible data management solutions to 42 million users at Fortune 100 companies and other enterprises worldwide, while maintaining a 95% customer satisfaction score.