The More you Refine Stored Data, the More Valuable it Becomes

By Pradeep Bhanot
April 1, 2020

Having a lot of data available to you is a good thing, right? That depends. Data is a raw material, like a mineral mined from the ground. It contains the potential for creating value, but that potential is only realized through refinement. Your company produces a lot of data every day (every second, really). Merely creating or possessing this data doesn’t mean it is generating value for you. Harvesting value from your company data requires a transformation process that converts data into information, information into actionable insights, insights into decisions, and decisions into action.

Data management professionals, IT staff, and business analysts are the people responsible for guiding data transformation. They employ a series of refinement steps to convert the raw materials that your operations generate into the meaningful and actionable insights that decision-makers across the company use to direct your staff, processes, and resources. Here is an overview of the refinement steps your data goes through, and the value that is added along the way.

Collection

Data exists in your operations whether you collect it or not. The first step in data refinement is collection. This takes place within the operational systems, embedded sensors, and transactional workflows running throughout your company. Some data is collected in real time through sensors, telemetry, and monitoring, while other data is collected periodically (perhaps hourly or at the end of the day). Data collection is all about measurement. The data management adage goes, “if you don’t measure it, you can’t manage it.” Extending this a step further, if you don’t collect the data, you can’t use it for decision making.

Aggregation

There are many data sources across your organization, and no single source contains all the information that is needed for effective decision making.
Why is this? Because each data source provides only one point of view on your operations. Using a single data source is like walking around a sports arena at night with only a flashlight: you see a very narrow slice of your environment, not the big picture. Data aggregation brings the data from various sources together in one place, like switching on a bank of lightbulbs in that sports arena. Some data will overlap and can be filtered out, and there will still be some gaps and shadows, but aggregation gets you one step closer to seeing the bigger picture of your operations.

Reconciliation

Once you have your data aggregated in one place, the next step in the refinement is reconciling the different data sets with one another to address gaps, overlaps, and conflicting information. This is also sometimes called data harmonization. One way to imagine this is to think back to the days before digital cameras, when people took photos on film. To create a panoramic image, you took multiple pictures of adjacent scenes and then (after waiting for them to be developed) aligned the images by overlapping the frames into a panoramic view. Data reconciliation is similar, although considerably more complex. Some of the factors used in data reconciliation are the data source, the data quality, and when the data was captured (because you’re not viewing a still image; business data is a moving target). The result of data reconciliation is a unified data set that includes inputs from all your data sources.

Categorization

Categorization (often called cataloging) is the first step in understanding the content of your data. The purpose of categorization is to help you understand “what your data is.” Note that this is different from understanding “what your data means,” which is addressed in the next step. The best way to understand data categorization is to picture a library full of books. Individual books represent different pieces of data.
Librarians use a cataloging system (the Dewey Decimal System, the Library of Congress Classification, etc.) to sort and organize books according to their content. In the business world, companies have data metamodels that provide the cataloging structure. Categorization is all about aligning operational data (from whatever source it was collected) to these metamodels, so that like concepts (such as customer data) can be analyzed together. This is when data is transformed into information.

Analysis

Data analysis is all about understanding what your information means. Data and business professionals summarize, sort, filter, correlate, project, and perform trend analysis to refine categorized information into meaningful and actionable insights about your business. It’s interesting that the data showed a specific process step took 2.385 seconds. It’s informative to know that the process measurement was the time it took to authorize a credit card transaction. But is that number good or bad? Is it relevant? Does it indicate something is wrong? Does someone need to initiate action because of it? Data analysis is the refinement step that converts information into insights about your business.

Presentation

Possessing data, information, and insights does not by itself create value for your organization. Value comes from the decisions you make and the actions that result from interpreting the data. The last step in the data refinement process is taking the insights that you’ve generated, presenting them to decision-makers and system operators, and making them available to automation systems. Just as you aggregated data earlier in the process, this step involves disseminating, publishing, and visualizing the insights for consumption. The quality of the insights available for presentation is directly related to the effectiveness of your collection, aggregation, reconciliation, categorization, and analysis processes.
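
To make the refinement steps above more concrete, here is a minimal Python sketch that runs a handful of made-up records through aggregation, reconciliation, categorization, analysis, and presentation. Everything in it is an assumption for illustration: the source names (payments_log, gateway_feed), the record fields, the catalog, the "keep the most recently captured record" reconciliation rule, and the 2.0 second threshold are all hypothetical. It is not an Actian API or a recommended design, just one simple way the steps could fit together.

```python
from collections import defaultdict
from statistics import mean

# Collection: hypothetical records already captured by two separate sources.
PAYMENTS_LOG = [
    {"txn": "t1", "step": "cc_auth", "seconds": 2.385, "captured_at": 100},
    {"txn": "t2", "step": "cc_auth", "seconds": 0.900, "captured_at": 101},
]
GATEWAY_FEED = [
    {"txn": "t1", "step": "cc_auth", "seconds": 2.400, "captured_at": 105},
    {"txn": "t3", "step": "cc_auth", "seconds": 1.100, "captured_at": 106},
]

# Hypothetical catalog that maps raw step codes to business concepts.
CATALOG = {"cc_auth": "credit card authorization"}


def aggregate(*sources):
    """Aggregation: bring records from every source together, tagged by origin."""
    return [dict(rec, source=name) for name, recs in sources for rec in recs]


def reconcile(records):
    """Reconciliation: resolve overlaps by keeping the most recently
    captured record for each (transaction, step) pair (an assumed rule)."""
    latest = {}
    for rec in records:
        key = (rec["txn"], rec["step"])
        if key not in latest or rec["captured_at"] > latest[key]["captured_at"]:
            latest[key] = rec
    return list(latest.values())


def categorize(records):
    """Categorization: group records under the catalog concept they belong to."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[CATALOG.get(rec["step"], "uncategorized")].append(rec)
    return dict(grouped)


def analyze(grouped, threshold_seconds):
    """Analysis: turn information into an insight by comparing the average
    duration per category against an assumed acceptable threshold."""
    insights = {}
    for category, recs in grouped.items():
        avg = mean(r["seconds"] for r in recs)
        insights[category] = {
            "average_seconds": round(avg, 3),
            "needs_attention": avg > threshold_seconds,
        }
    return insights


def present(insights):
    """Presentation: format insights as plain-text lines for decision-makers."""
    return [
        f"{category}: average {info['average_seconds']}s, "
        f"needs attention: {info['needs_attention']}"
        for category, info in insights.items()
    ]


def refine(threshold_seconds=2.0):
    """Run every refinement step in order and return the resulting insights."""
    records = aggregate(("payments_log", PAYMENTS_LOG), ("gateway_feed", GATEWAY_FEED))
    return analyze(categorize(reconcile(records)), threshold_seconds)
```

Calling present(refine()) returns one summary line per category. The 2.385 second measurement on its own says little; in this sketch it only becomes an insight once it has been reconciled with the overlapping gateway record, grouped with other credit card authorizations, and compared against a threshold someone has agreed on.
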

Actian provides a set of data management capabilities to help your staff orchestrate the refinement process, enabling not only these necessary steps but also robust implementation of the activities within each step. The more you refine data, the more valuable it becomes. With the right tools, you will be able to develop better insights faster. This will lead to better decisions and greater value realization for your company.

Actian’s Avalanche Hybrid Cloud Data Warehouse includes connectors to hundreds of data sources and functions to refine and transform raw data into information. You can learn more about Actian’s Real-Time Connected Data Warehouse at https://www.actian.com/solutions/connected-data-warehouse/

About Pradeep Bhanot

Product Marketing professional, author, father and photographer. Born in Kenya. Lived in England through disco, punk and new romance eras. Moved to California just in time for grunge. Worked with Oracle databases at Oracle Corporation for 13 years. Database Administration for mainframe IBM DB2 and its predecessor SQL/DS at British Telecom and Watson Wyatt. Worked with IBM VSAM at CA Technologies and Serena Software. Microsoft SQL Server powered solutions from 1E and BDNA.