Data Analytics

Four Data Analytics Predictions for 2021

Actian Corporation

December 28, 2020


In a year dominated by the dark cloud of the COVID-19 pandemic and the tragedies that ensued, we seem to be ending on a hopeful note. We saw the incredible power of human ingenuity, as not one but several vaccines have been developed at breakneck speed, surpassing even the most optimistic of expectations. We also saw how technology could be used to effect amazing transformation on a global scale.

With this positive and promising reminder, here are my four predictions for analytics in 2021:

  1. Data Analytics Will Fundamentally Transform the Supply Chain, Bringing Greater Visibility to Lead Times, Inventory Levels, and Logistics. We saw a classic case of a broken supply-and-demand chain at the onset of the COVID-19 outbreak in March. Demand for specific products surged while supply plummeted due to unexpected factory shutdowns, causing consumer panic, disruption, and delays. Leveraging analytics on real-time data from existing supply chain processes, distribution networks, and transportation solutions can surface pain points and opportunities, which in turn lets organizations address supply chain vulnerabilities before issues arise. Harnessing data to understand delivery lead times, logistics scenarios, and inventory levels will drive greater responsiveness, efficiency, and effectiveness across a broad spectrum of industries.
  2. Demand for Interoperable Multi-Cloud Platform Solutions Will Dramatically Increase. As SaaS tools and applications further fragment data, not just across existing on-premises systems but across cloud-based operational workloads, the need for loosely coupled, cloud-based data ecosystems will emerge. Paradoxically, many organizations with a “cloud-first” policy are seeing their costs rise over time due to increased consumption, inflexible deployment models, and a lack of financial governance capabilities in cloud-based solutions. These experienced organizations will demand the ability to consume cloud services from many sources and to combine data across them, leading to unprecedented cost savings and new generations of solutions. For a modern data stack to work, it needs to be open to all origination sources, analysis tools, and visualization destinations.
  3. Technology Solutions That Can Deliver Real-Time Insights Will Be One of the Heroes of the Pandemic. The ability to gain real-time insights from federated but connected systems will enable organizations globally to respond to and gain control over the pandemic’s impact, whether that be for contact tracing, understanding infection rates, or vaccine distribution. But almost as important as saving lives and mitigating the spread of COVID-19 will be the need to rebuild the economy. The ability to rapidly assess changing market conditions will have to be fundamentally data-driven, following the same recipe of combining the right data from the right sources in real time.
  4. Container Technology That Has Played Such a Vital Role in Transforming the Data Center Will Also Move to the Edge, Bringing New Levels of Intelligence, Data Privacy, and the Next Generation of Services. Virtualization technologies and their ability to unlock the value of software on an increasingly intelligent converged infrastructure will move from the physical data center to the cloud, which in turn will lay the foundation for the new connected Edge. In 2021 expect to see hyper-converged infrastructure with container technology bring a new richness to software developed and deployed for mobile and IoT environments. We won’t see full monetization of 5G just yet, but these supporting technologies will give innovators and investors alike the confidence that the 5G wave is real and will be big.

At Actian, we have done our best to adapt to the unprecedented challenges of 2020. As we look ahead to the new year, we are excited to help our customers achieve new levels of innovation with our data management, integration, and analytics solutions.

Have a safe and successful 2021!


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Platform

Google GKE Containers: Technology for Cloud Data Warehouses

Actian Corporation

December 21, 2020


Actian on Google Kubernetes Engine

Google Kubernetes Engine (GKE) is now running containerized versions of Actian Data Platform (formerly known as Avalanche), designed to power an enterprise’s most demanding operational analytics workloads. Why is that important? Because now an enterprise can deploy one of the world’s most advanced data warehouse systems in about five minutes—a fraction of the time it takes to deploy in other cloud environments.

Let’s break this down a bit. The Google Kubernetes Engine—GKE—works with containers, which are, effectively, self-contained, pre-built component images. That’s important because in other, non-containerized cloud environments, data warehouses are typically deployed by running a series of scripts and/or REST API calls that build out each component from a base OS VM. In those scenarios, every component needs to be installed and configured—in sequence—so building out a complete cluster can easily take 25 minutes or more. That’s not much time if you expect to set it and forget it, but in the age of DevOps there’s less and less setting and forgetting. The needs of a DevOps team change constantly, and in such a dynamic environment the need to reconfigure and redeploy—at 25 minutes per pop—can quickly become a source of real frustration. It’s also worth noting that a 25-minute projected deployment time assumes everything runs without incident, which isn’t always the case. The sheer number of operations required to build these highly complex systems increases the chance that something will not go as planned at some point in the process. There are lots of dots to connect, and each connection is a point of vulnerability where something could go awry. The more you need to iterate, replicate, and expand deployments over time, the greater the likelihood that something goes wrong and you’ll spend far more than 25 minutes working out why.

Containers, in contrast, obviate the need to run these complicated setup procedures—because they have already been run and the dots connected when the containers were built. That’s right: it’s as though someone else ran through all the scripts and captured images of what a fully deployed Actian instance should look like—and then froze these images in a form that could be used and reused anywhere. Those pre-built images are the containers, and once built can be deployed quickly on Google Cloud via GKE.
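Conceptually, a pre-built container image is deployed to GKE through a declarative manifest rather than a sequence of setup scripts. The sketch below shows the shape of a minimal Kubernetes Deployment for a hypothetical warehouse-node image; the image name, labels, and port are illustrative placeholders, not Actian's actual artifacts:

```python
import json

def warehouse_deployment(image: str, replicas: int = 3) -> dict:
    """Build an illustrative Kubernetes Deployment manifest (apps/v1) for a
    containerized warehouse node. All names here are hypothetical examples."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "warehouse-node"},
        "spec": {
            "replicas": replicas,  # scale the cluster by changing one field
            "selector": {"matchLabels": {"app": "warehouse"}},
            "template": {
                "metadata": {"labels": {"app": "warehouse"}},
                "spec": {
                    "containers": [{
                        "name": "warehouse",
                        "image": image,  # the pre-built, frozen component image
                        "ports": [{"containerPort": 5432}],
                    }]
                },
            },
        },
    }

manifest = warehouse_deployment("gcr.io/example-project/warehouse:1.0", replicas=3)
print(json.dumps(manifest, indent=2))
```

Once such a manifest exists, deployment is a reconciliation job for GKE: pull the frozen image and start the replicas, with no per-component install sequence to run or debug.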

In fact, an organization doesn’t even need to handle the GKE deployment itself. All it needs to do is select Google Cloud as the target when deploying an Actian cloud data warehouse; Actian invokes GKE to deploy the containers for you, and within minutes you’re up and running with a world-class data warehouse.

Making the Most of the Google Cloud Infrastructure

That brings us to the second part of why it’s great to run Actian using GKE. Actian is designed to make optimal use of the compute resources at hand: the more CPU power and RAM you can configure in an Actian cluster, the more performance you’re going to see. That’s true of many systems, but in the cloud, distinctly different infrastructures are on offer. And while the question of which cloud vendor has the most performant infrastructure will vary from one investment cycle to the next, Google Cloud users can take advantage of more readily available offerings with advanced, high-performance CPU/memory configurations than are found on alternative platforms, which can be crucial in business scenarios where speed-to-insight is critical. The whole physical infrastructure—not just the CPUs, but also the storage and network infrastructure on which GKE itself runs—enables Actian to take advantage of CPUs with larger on-chip caches and faster RAM, which it has been designed to leverage. This makes it easier for Actian to access more of the available processing power than it could in other cloud offerings.

The containerized architecture that GKE manages is important here: containers are largely agnostic about the underlying machine hardware, which means a containerized deployment of Actian can easily take advantage of new hardware as it becomes available in Google Cloud. Conversely, an environment where Actian—or Snowflake, or any other cloud data warehouse—is constructed without the benefits of containerization will be more tightly tied to the architecture of the VM on which the cluster components run. Because an organization can easily subscribe to Google Cloud services configured to extract the highest achievable performance from the most current CPU and memory technologies, Google Cloud and GKE make it significantly easier to build a solution that lets Actian operate at peak performance.

Given the more optimal infrastructure GKE provides on Google Cloud, it’s not surprising that provisional benchmarks conducted by Actian show Actian on Google Cloud delivering a 20% average throughput improvement compared to alternative cloud platforms. For organizations looking for the data warehouse that delivers the highest performance and throughput from the cloud, Actian on GKE is a clear choice.

More Advantages Arising from Running Actian on Google Cloud

Does Actian gain other advantages from running on GKE? Yes, but we’ll flesh those out in part 2 of this blog. As a teaser, though, let me say this: Anthos and security. We’ll say more about each in future discussions of Google Cloud and Actian. For now, suffice it to say that there is an early adopter program for Actian on Google Cloud that will let you kick Actian’s tires yourself and see how it can meet your pressing operational analytics workload needs more effectively than ever.

Give it a shot and see if you are moved by the power of Actian on GKE.

Data Analytics

The Need for Speed in Data Analysis

Emma McGrattan

December 16, 2020


Why Performance Matters

My very first sales call to an analytics prospect was with a customer using an OLTP database for reporting. Some of their reports took hours to run, and one particularly challenging report typically ran for more than 13 hours. Our performance benchmarking had shown that for analytics workloads, Actian would be up to 100 times faster than an OLTP database, and I was eager to see if that would hold true in the real world. Indeed, we were able to take the 13-hour report and complete it in less than 20 seconds.
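The observed improvement actually far exceeded the benchmarked "up to 100 times" figure; the arithmetic is simple:

```python
# A 13-hour report completed in under 20 seconds: compute the speedup.
report_seconds = 13 * 3600  # 46,800 seconds
new_seconds = 20            # worst case of "less than 20 seconds"
speedup = report_seconds // new_seconds
print(speedup)  # 2340, well beyond the benchmarked 100x
```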

It felt like a punch to the gut when the customer informed us that speed did not matter to him because that particular report was run on Saturday nights and so long as it was completed by Monday at 9 am, it didn’t matter how long it took to run. That is when I first realized that for performance to really matter, it has to deliver business value.

For this prospect, that weekly report was parceled out to a team of representatives who spent their week working off aging information and wasting time visiting accounts where the status had changed. By taking advantage of the performance that Actian could deliver and switching to real-time access to their customer information, they were able to eliminate wasted customer visits and grow revenue by 12% in the first year after adopting Actian.

Fast forward to today: data sets are exponentially larger, and many industries rely on fast analytics performance to improve business outcomes, remain competitive, and add value for their customers. Here are a few examples of how Actian’s continued performance gains help clients improve their data analytics capabilities.

Performance Matters in Healthcare

Equian uses Actian for healthcare claims processing because it enables them to process claims much faster than any other technology they have evaluated. Equian is paid a fee for each claim they process, and the more claims they process, the more money they make. The fees they receive for processing claims are dependent on the age of the claim, so this is a prime example of an instance where Actian’s performance delivers directly to the bottom line. But it isn’t always so obvious.

Performance Matters for Insurance Providers

The UK’s largest motoring organization, The AA, uses Actian technology to quickly generate accurate car insurance quotes. They provide quotes for comparative car insurance websites, and the faster they can return a quote, the more prominent a position they get on the page of competing quotes. Generating the quote quickly is obviously important, but generating an accurate quote, enriched by myriad other data sources, is just as important to the insurance underwriters: they need to be sure the quote accurately represents the insurable risk. For example, if the car you’re requesting the quote for has been in a number of accidents, that needs to be factored in, as do the driver’s record and the demographics of the neighborhood where the car will be parked. Only when all of this information has been factored in can the quote be provided to the customer, and The AA typically wants to complete the entire quoting process in a second or two.
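A quote pipeline like the one described, enriching a base price with accident history, driver record, and location risk, might be sketched as follows. Every factor, rate, and base premium here is a made-up illustration, not The AA's actual underwriting model:

```python
BASE_PREMIUM = 500.0  # hypothetical starting premium

def enrich_quote(accidents: int, driver_points: int, area_risk: float) -> float:
    """Combine risk factors into a single premium (toy model, invented rates)."""
    accident_factor = 1.0 + 0.15 * accidents    # prior accidents raise risk
    record_factor = 1.0 + 0.05 * driver_points  # points on the licence
    quote = BASE_PREMIUM * accident_factor * record_factor * area_risk
    return round(quote, 2)

# Two prior accidents, three licence points, slightly above-average area risk:
print(enrich_quote(accidents=2, driver_points=3, area_risk=1.1))
```

The underwriting point stands regardless of the factor values: the quote is only valid once every enrichment source has been consulted, which is why the whole lookup-and-multiply chain has to finish within the one-to-two-second budget.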

Performance Matters in Financial Services

In the financial services world, time is money; shaving a millisecond off the time it takes to analyze data can mean millions made or lost. Refinitiv, now part of the London Stock Exchange, built their Eikon analytics platform on Actian technology and set an SLA of 20ms for completing analytics queries. Refinitiv’s platform provides its customers with the data and analysis needed to make trading decisions faster than their competitors’ customers can, and in rapidly changing markets that performance advantage carries tremendous value.
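Enforcing a latency SLA like that 20ms budget is conceptually a matter of timing each query against a deadline. A minimal sketch, where the "query" is a stand-in aggregation rather than anything from Eikon's actual API:

```python
import time

SLA_SECONDS = 0.020  # 20 ms budget per analytics query

def run_with_sla(query_fn):
    """Run a query, returning its result, elapsed time, and SLA verdict."""
    start = time.perf_counter()
    result = query_fn()
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= SLA_SECONDS

# Stand-in "query": average a small in-memory price series.
prices = [100.0 + i * 0.01 for i in range(1000)]
result, elapsed, met_sla = run_with_sla(lambda: sum(prices) / len(prices))
print(f"avg={result:.3f} elapsed={elapsed * 1000:.3f}ms sla_met={met_sla}")
```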

Performance You Can Try for Yourself

The recently released GigaOm Cloud Data Warehouse benchmark report clearly demonstrates the performance superiority of the Actian Data Platform (formerly Avalanche) over the competition on a typical decision-support workload: a mix of complex queries representative of those we encounter at every customer, no matter what industry they are in.

[Chart: Actian outperforms Snowflake in the GigaOm benchmark]

Scalable to Meet Your Performance Demands

While many of our competitors charge a premium for the type of performance that Actian Data Platform delivers, we pride ourselves on the value our platform provides. Not only do we deliver the best value per query, but we also keep your costs low by enabling you to scale the compute environment to meet your ever-changing business needs. Additionally, we detect when the system has been inactive for a specified period and shut it down to stop the meter from running. Gone are the days of purchasing systems to match your peak workload and hoping your gut feeling about anticipated growth was accurate. You can grow and shrink your Actian Data Platform environment to meet the needs of the business.
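The idle-shutdown behavior described above can be modeled as a simple last-activity timer. This is an illustrative sketch of the idea, not the platform's actual mechanism:

```python
class ComputeCluster:
    """Toy model of a warehouse cluster that stops the meter when idle."""

    def __init__(self, idle_limit_seconds: float):
        self.idle_limit = idle_limit_seconds
        self.last_activity = 0.0
        self.running = True

    def on_query(self, now: float):
        """Record activity and wake the cluster on demand."""
        self.last_activity = now
        self.running = True

    def tick(self, now: float):
        """Shut down compute (stop billing) once idle past the limit."""
        if self.running and now - self.last_activity > self.idle_limit:
            self.running = False

cluster = ComputeCluster(idle_limit_seconds=600)  # 10-minute idle limit
cluster.on_query(now=0)
cluster.tick(now=300)   # 5 minutes idle: still running
print(cluster.running)
cluster.tick(now=700)   # 11+ minutes idle: shut down, meter stopped
print(cluster.running)
```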

[Chart: Actian more cost-effective than Snowflake in the GigaOm benchmark]

I’m very proud of the technology that my team has built and of the results that the researchers at GigaOm were able to achieve when benchmarking Actian Data Platform relative to our competitors. I’d love to hear your stories of how performance translated into business value for you.


About Emma McGrattan

Emma McGrattan is CTO at Actian, leading global R&D in high-performance analytics, data management, and integration. With over two decades at Actian, Emma holds multiple patents in data technologies and has been instrumental in driving innovation for mission-critical applications. She is a recognized authority, frequently speaking at industry conferences like Strata Data, and she's published technical papers on modern analytics. In her Actian blog posts, Emma tackles performance optimization, hybrid cloud architectures, and advanced analytics strategies. Explore her top articles to unlock data-driven success.
Data Analytics

Innovation and the Power of Data

Actian Corporation

December 15, 2020


What is Innovation?

Innovation is doing something new that does not currently exist, or something new that improves efficiency, effectiveness, or cost, and/or creates a competitive advantage. A new method, process, product, or service can be an innovation, as can the integration of different markets through a single new product that serves them all. An example is the smartphone, which leveraged and integrated other markets, such as photography and electronic payments, into one mobile device. These innovations created a new mobile device market, rendering the traditional cellular phone market almost obsolete.

There are many other examples of innovation among start-up companies. Some have leveraged cloud computing to deliver products or services without traditional on-premises IT infrastructure. Others, like Uber, Getaround, and Turo, run on business models in which they do not own the major assets, such as vehicles, required to operate.

The opportunities for innovation are everywhere and in everything. The caution is that consumers may not need every innovation: someone can have the world’s most creative idea, yet it may never be adopted. Entry into various markets can also be obstructed by many constraints and challenges.

Consumers and businesses need specific product and service benefits to accomplish their outcomes, and may also want products and services that are not strictly necessary. Needs and wants are fulfilled through market offerings in which those constraints are managed.

Maturity and Innovation

It is hard for an organization to mature and innovate if it uses most of its resources, including data analytics, for firefighting. The organization has to have some stability in the markets it serves. Its products and services also have to meet the basic needs and wants of its customers; otherwise, the company may end up merely keeping up with its competitors instead of getting ahead of them.

To innovate, organizations must make innovation a strategic intent and budget for innovation-related activities, including the necessary data acquisition and analytics. The challenge is where to invest: in new products, services, processes, or other areas? The answer cannot be found by simply guessing.

Organizations should always perform strengths, weaknesses, opportunities, and threats (SWOT) analysis at the strategic, tactical, and operational levels. It is not enough to do this analysis only at the strategic level, and it is a mistake to do it at any level without external market data to back up the findings.

Organizations may find that they need to evolve existing products, solutions, or practices; or completely throw out and transform how they currently do things or deliver products to their customers. Survival of the fittest can be defined by the ability to innovate rapidly.

Data Trending to the Next Innovation

An enterprise data-driven strategy can help with next-generation innovation in your organization. Collected data can help with accurate decision-making for innovation initiatives. Trends can be easily identified, and actionable decisions can be made.

Here are some of the steps:

  • Collect and pay attention to organizational metrics in all areas for continuous improvement.
  • Define market spaces and adjacent market spaces – The adjacent spaces give insight into possible market integrations and innovation possibilities.
  • Categorize the data based on market and services, then further based on each capability in the service or product.
  • Measure, collect, and integrate the data with the intent to innovate, not just for organizational health.
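The categorization step above amounts to grouping metrics by market, then service, then capability. A minimal sketch with made-up records:

```python
from collections import defaultdict

# Hypothetical metric records: (market, service, capability, value).
metrics = [
    ("retail", "payments", "checkout", 0.97),
    ("retail", "payments", "refunds", 0.91),
    ("mobility", "ride-share", "dispatch", 0.88),
]

# Group market -> service -> capability so trends in one capability can be
# compared across markets and adjacent markets.
catalog = defaultdict(lambda: defaultdict(dict))
for market, service, capability, value in metrics:
    catalog[market][service][capability] = value

print(sorted(catalog))
print(catalog["retail"]["payments"])
```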

Here are some questions to ask that need data support:

  • What is the difference between the benefit a consumer gains from using a product or service and its price?
  • Can the markets that the organization serves be integrated and simplified for the consumer or the organization?

And here are decisions that data can support:

  • Anticipate demand for a new product or service.
  • Set measurable success criteria.
  • Look at data trends across the organization.
  • Note the constraints on performance, efficiency, and effectiveness in each area.
  • Create service-oriented and customer-oriented data maps integrated with other supporting data.
  • Tie trending data into organizational value knowledge for strategic, tactical, and operational innovations, including budgeting decisions.

Empowering Data for Innovation

The most reliable way to make innovation decisions is to base them on enterprise-wide integrated data. Actian DataConnect makes it easy to connect operational data from any source and transform it for effective analysis. DataConnect lets data-driven enterprises design, deploy, and manage data flows across on-premises, cloud, and hybrid environments. By bringing together internal operational data, customer requirements, and external market data, innovation investments can be funded and assessed more effectively.


Data Integration

System Thinking to Run, Grow, and Transform the Business

Actian Corporation

December 11, 2020


Systems Thinking and the Organization

There are many ways to describe systems thinking. We can say systems thinking is a way of viewing a system in terms of its structure, patterns, cycles, and data exchanges. It can also be thought of as a way of integrating a person into the organizational flows of the system, or into the way the organization works to make decisions. Then there is System 1 and System 2 thinking: System 1 is based on one’s experiences, while System 2 is based on analytics or data. Studies have categorized System 1 as fast thinking and System 2 as slow thinking.

Organizations usually have an intended strategy that is budgeted to run, grow, or transform the business. The Run the Business strategy is usually aimed at fixing issues, addressing customer requests, and keeping the lights on. The Grow the Business strategy is usually aimed at creating competitive capabilities. The Transform the Business strategy is aimed at innovation.

The strategy is transformed into tactics and tactics into operations. Another model to consider is that strategy transforms into missions, missions to goals, and goals to objectives. Each organizational structure and person in the organization supports the objectives with projects and daily activities.

Organizational structure influences the behavior of the organization. There are different types of structures, such as hierarchical, which can include functional, divisional, and horizontal structures. There are matrix organizational structures and other types. Each structure supports behavior, data exchanges, and communication of the people in the particular roles in the structure.

Within these structures are many complexities and challenges. There is a value chain of interactions linking the top-level strategy to the bottom-level objectives. The objectives that meet these goals have critical success factors, and each critical success factor is measured using a System 1 approach, a System 2 approach, or a combination of the two.

For some objectives, System 1 (expert opinion) can be good enough; for others, System 2 (a metric) suffices. The combination of both is always best.

Enablement of Decision and Data Exchanges

Data is exchanged between people and technology to accomplish work. The person or automated system transforms data for consumption. Data should flow so that no silos arise that could affect the health of the system, and work efforts across the organization need to stay focused on the organization’s strategy. Data management tools should support the elimination of data silos, for example by integrating data across cloud and on-premises solutions.

Decisions are enabled by the transformation of data into information, information into knowledge, and knowledge into decisions. A decision can be made without data, as with System 1 thinking, or with data alone, as with System 2 thinking. Enterprise data systems help connect the data value chains between organizational structures and people for decision support.

Many organizations have been concerned with what is called the “graying” of the IT community: the loss of key decision-making capabilities that live in the minds of employees with years of experience in a particular area or with the company. Some of these people are hired back as consultants because of the knowledge they hold. A data-driven enterprise system helps capture both people’s knowledge and discovered knowledge for the benefit of the organization’s systems thinking, so that important decision-making data does not get lost.

Run, Grow and Transform the Business

Day-to-day analytics captured for running the business can help experts in key customer-facing functions or areas make faster and more accurate decisions. Enterprise data analytics can be used to discover business constraints and change the business trajectory for continuous growth. Business transformation requires knowledge from a system 1 and system 2 perspective.

Enterprise data that supports an overall service knowledge management system, enabling agile, quick, empowered, trusted, and high-performing decisions, can only be delivered with technology. Each function, and many people within the same functional structure, use different tools and processes to do their jobs within their service value chains. The organization’s data is the organization’s data, and it should be leveraged appropriately across the enterprise. Doing this effectively requires collecting analytical data from as many sources as possible, then transforming it into appropriate information and knowledge for decision support across the organization. A solution like Actian DataConnect enables quick and easy design, deployment, and management across on-premises, cloud, and hybrid environments.

An organization that functions as one team with many unique, specialized capabilities and responsibilities should be equipped with analytical System 2 data to support its System 1 experience and expertise, enabling high-performing organizational decision support. Customer insights, organizational performance, and many other valuable decisions can then be made to run, grow, or transform the business effectively and efficiently with a systems thinking approach. Actian DataConnect enables rapid onboarding, delivers rapid time to value, and lets you connect to virtually any data source, format, or location, using any protocol. Learn more here.

Data Intelligence

Marquez: The Metadata Discovery Solution at WeWork

Actian Corporation

December 10, 2020


Created in 2010, WeWork is a global office and workspace leasing company. Its objective is to provide space where teams of any size, from startups and SMEs to major corporations, can collaborate. What WeWork provides can be broken down into three categories:

  • Space: To provide companies with optimal space, WeWork must supply the appropriate infrastructure, from booking rooms for interviews or one-on-ones to entire buildings for huge corporations. They must also ensure spaces are equipped with appropriate facilities, such as kitchens for lunch and coffee breaks, bathrooms, etc.
  • Community: Via WeWork’s internal application, members can connect with one another, whether locally within their own WeWork space or globally. For example, a company that needs feedback on a project from specific roles (such as a developer or UX designer) can directly request feedback and suggestions via the application from any member, regardless of location.
  • Services: WeWork also provides its members with full IT support, as well as other services such as payroll and utility services.

In 2020, WeWork represents:

  • More than 600,000 memberships.
  • Locations in 127 cities in 33 different countries.
  • 850 offices worldwide.
  • $1.82 billion in generated revenue.

Clearly, WeWork works with all sorts of data from its staff and customers, whether individuals or companies. The firm therefore needed a platform where its data experts could view, collect, aggregate, and visualize the metadata of their data ecosystem. This need was met by the creation of Marquez.

This article focuses on WeWork’s implementation of Marquez, drawing mainly on freely accessible documentation from various websites, to illustrate the importance of an enterprise-wide metadata platform in truly becoming data-driven.

Why Manage and Utilize Metadata?

In his talk “A Metadata Service for Data Abstraction, Data Lineage & Event-Based Triggers” at the Data Council back in 2018, Willy Lulciuc, Software Engineer on the Marquez project at WeWork, explained that metadata is crucial for three reasons:

  • Ensuring Data Quality: When data has no context, it is hard for data citizens to trust their data assets: are there fields missing? Is the documentation up to date? Who is the data owner and are they still the owner? These questions are answered through the use of metadata.
  • Understanding Data Lineage: Knowing your data’s origins and transformations is key to truly understanding what stages your data went through over time.
  • Democratization of Datasets: According to Willy Lulciuc, democratizing data in the enterprise is critical! Having a central portal or UI available for users to be able to search for and explore their datasets is one of the most important ways companies can truly create a self-service data culture.

To sum up: managing and utilizing metadata creates a healthy data ecosystem. Willy explains that it builds a sustainable data culture where individuals no longer need to ask for help to find and work with the data they need. In his slides, he goes through three characteristics that make up a healthy data ecosystem:

  1. Being a self-service ecosystem, where data and business users can discover the data and metadata they need and explore the enterprise’s data assets even when they don’t know exactly what they are searching for. Providing data with context gives all users and data citizens the ability to work effectively on their data use cases.
  2. Being self-sufficient, by giving data users the freedom to experiment with their datasets and the flexibility to work on every aspect of them, whether input or output datasets, for example.
  3. And finally, instead of relying on certain individuals or groups, a healthy data ecosystem holds all employees accountable for their own data. Each user has the responsibility to know their data and its costs (is this data producing enough value?), and to keep their data’s documentation up to date in order to build trust around their datasets.

Room Booking Pipeline Before

As mentioned above, utilizing metadata is crucial for data users to be able to find the data they need. In his presentation, Willy shared a real situation to prove metadata is essential: WeWork’s data pipeline for booking a room.

For a “WeWorker”, the steps are as follows:

  1. Find a location (the example was a building complex in San Francisco).
  2. Choose the appropriate room size (usually split by the number of attendees – in this case, a room that could seat 1–4 people).
  3. Choose the date for when the booking will take place.
  4. Decide on the time slot the room is booked for as well as the duration of the meeting.
  5. Confirm the booking.

Now that we have an example of how their booking pipeline works, Willy demonstrates how a typical data team would operate when pulling data on WeWork’s bookings. In this case, the example exercise was to find the building that held the most room bookings and extract that data to send over to management. The steps he stated were the following:

  • Read the room bookings from a data source (usually unknown).
  • Sum up all of the room bookings and return the top locations.
  • Once the top location is calculated, the next step is to write it into some output data source.
  • Run the job once an hour.
  • Process the data through .csv files and store it somewhere.
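The steps above can be sketched as a small batch job. This is a minimal illustration, not WeWork’s actual code; the CSV layout and column names are assumptions:

```python
import csv
from collections import Counter

def top_booked_location(bookings_path):
    """Read room bookings from a CSV source and return the
    location with the most bookings, e.g. to report to management."""
    counts = Counter()
    with open(bookings_path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["location"]] += 1
    location, total = counts.most_common(1)[0]
    return location, total
```

In production, this would be scheduled to run once an hour and write its result to some output data source, which is exactly where the questions below about unknown inputs, owners, and update frequency start to bite.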

However, Willy stated that even though these steps seem good enough, problems usually occur. He goes over three questions that come up during the job process:

  1. Where can I find the job input’s dataset?
  2. Does the dataset have an owner? Who is it?
  3. How often is the dataset updated?

Most of these questions are difficult to answer, and jobs end up failing. Without being sure of and trusting this information, it is hard to present numbers to management. These sorts of problems are what led WeWork to develop Marquez.

What is Marquez?

Willy defines the platform as an “open-sourced solution for the aggregation, collection, and visualization of metadata of [WeWork’s] data ecosystem”. Indeed, Marquez is a modular system and was designed as a highly scalable, highly extensible platform-agnostic solution for metadata management. It consists of the following components:

  • Metadata Repository: Stores all job and dataset metadata, including a complete history of job runs and job-level statistics (e.g., total runs, average runtimes, successes/failures).
  • Metadata API: RESTful API enabling a diverse set of clients to begin collecting metadata around dataset production and consumption.
  • Metadata UI: Used for dataset discovery, connecting multiple datasets and exploring their dependency graph.

Marquez’s Design

Marquez provides language-specific clients that implement the Metadata API, enabling a diverse set of data processing applications to build up a metadata collection. The initial release provided support for both Java and Python.

The Metadata API extracts information around the production and consumption of datasets. It’s a stateless layer responsible for specifying both metadata persistence and aggregation. The API allows clients to collect and/or obtain dataset information to/from the Metadata Repository.

Metadata needs to be collected, organized, and stored in a way to allow for rich exploratory queries via the Metadata UI. The Metadata Repository serves as a catalog of dataset information encapsulated and cleanly abstracted away by the Metadata API.
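As a sketch of that flow, a client might list the datasets in a namespace over the REST API roughly like this. The endpoint path follows Marquez’s v1 API, but the port and response handling here are illustrative assumptions:

```python
import json
from urllib.request import urlopen

def list_datasets(namespace, base_url="http://localhost:5000", opener=urlopen):
    """Query the Marquez Metadata API for all datasets in a namespace."""
    url = f"{base_url}/api/v1/namespaces/{namespace}/datasets"
    with opener(url) as resp:
        return json.load(resp)["datasets"]
```

The `opener` parameter exists only so the function is easy to stub out in tests; real callers would rely on the default.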

According to Willy, what makes a very strong data ecosystem is the ability to search for information and datasets. Datasets in Marquez are indexed and ranked through a search engine based on a keyword or phrase, as well as on a dataset’s documentation: the more context a dataset has, the higher it is likely to appear in search results. Examples of a dataset’s documentation include its description, owner, schema, tags, etc.
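A toy version of that context-weighted ranking might look like this. The scoring is invented for illustration; Marquez’s real ranking lives inside its search engine:

```python
def context_score(dataset):
    """Score a dataset by how much documentation it carries:
    richer context ranks higher in search results."""
    fields = ("description", "owner", "schema", "tags")
    return sum(1 for f in fields if dataset.get(f))

def search(datasets, keyword):
    """Return datasets matching the keyword, richest context first."""
    kw = keyword.lower()
    hits = [d for d in datasets
            if kw in d.get("name", "").lower()
            or kw in (d.get("description") or "").lower()]
    return sorted(hits, key=context_score, reverse=True)
```

The effect is that a well-documented dataset outranks an undocumented one with the same name match, which nudges owners toward keeping documentation current.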

You can see more detail of Marquez’s data model in the presentation itself here: https://www.youtube.com/watch?v=dRaRKob-lRQ&ab_channel=DataCouncil

The Future of Data Management at WeWork

Two years into the project, Marquez has proven to be a big help for the giant leasing firm. Their long-term roadmap is to focus on the solution’s UI, adding more visualizations and graphical representations to provide simpler and more engaging ways for users to interact with their data.

They also maintain online communities via their GitHub page, as well as groups on LinkedIn, where those interested in Marquez can ask questions, get advice, or report issues with the current Marquez version.

Sources

A Metadata Service for Data Abstraction, Data Lineage & Event-Based Triggers, WeWork. Youtube: https://www.youtube.com/watch?v=dRaRKob-lRQ&ab_channel=DataCouncil

29 Stunning WeWork Statistics – The New Era Of Coworking, TechJury.com: https://techjury.net/blog/wework-statistics/

Marquez: Collect, aggregate, and visualize a data ecosystem’s metadata, https://marquezproject.github.io/marquez/


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Analytics

Turning Data Measurements, Metrics, and Performance Indicators into Results

Actian Corporation

December 1, 2020

Data Measurements

Data is everywhere, both inside and outside our organizations. Some data comes in the form of measurements, which can be descriptive, predictive, or diagnostic. In each case, we measure to make decisions based on the quantity and quality of something. Measurements can give us actionable operational, tactical, and strategic metrics for influencing human and artificial decisions. Organizational experts at every level, beyond just applying their own industry expertise, also rely on data to enhance their decisions.

This leads to challenges with what data to collect and how to use it effectively for specific outcomes. Collecting data is the easy part; integrating and orchestrating data collaboration across a value chain is the hard part. Every functional unit uses different tools, automation, and manual interfaces for performing their job across a chain of interconnected activities for producing a service or product for the organization. In many cases, data analytics are siloed within a function or require manual people-oriented exchanges between functions in the organization. Data needs to be integrated, without being constrained by organizational boundaries for people, processes, and technologies to effectively harmonize.

If data, information, and knowledge interchanges are not done with strategic intent, we risk ineffective organizational collaboration and poor use of our assets. This can cause challenges with decision-making at every level. Data and information should not move in only one direction across an organization’s value chain of activities; there should also be a feedback mechanism, built on data exchanges, to help organizations be more agile and precise in how they use and interpret information for strategic, tactical, and operational intents.

Data Transformation

Data transforms into information, information into knowledge, and knowledge into decisions. This is the DIKW model. Information systems need integrations at various levels, across various tools and technologies to enable informed, precise decisions across the organization.

When transforming data, consideration needs to be given to how to transform measurements into metrics and metrics into key performance indicators. Key performance indicators (KPIs) take data metrics and help an organization focus on what matters the most. The KPIs should be related to critical success factors (CSFs) for each organizational objective or project. Each organizational objective should relate to the strategic intent and investment strategy of the organization. As these data elements are connected across the organization, visibility from strategy and tactics down to operations can be achieved.
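As a minimal sketch of that measurement-to-metric-to-KPI chain (the metric, target, and thresholds here are hypothetical, purely for illustration):

```python
def metric_from_measurements(measurements):
    """Aggregate raw measurements into a metric, here a simple average
    (e.g. average order-fulfillment time in hours)."""
    return sum(measurements) / len(measurements)

def kpi_status(metric_value, target, tolerance=0.1):
    """Turn a metric into a KPI status against a target tied to a
    critical success factor (CSF). Lower is better in this sketch."""
    if metric_value <= target:
        return "on-track"
    if metric_value <= target * (1 + tolerance):
        return "at-risk"
    return "off-track"
```

Connecting KPIs like this upward to objectives and strategy is what gives the organization the end-to-end visibility described above.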

Fully integrated and automated heterogeneous systems expedite data exchanges, workflows, and decisions for people, including artificial intelligence-enabled technologies. This helps all business processes perform better and improves forward and rearward visibility for agility.

Data and Business Strategy, Tactics and Operations

Some business strategy concerns are usually related to decisions that affect the return on investment (ROI), value of investment (VOI), and total cost of ownership (TCO) of the organization’s capabilities and resources for delivering and supporting the portfolio of products and services to the market. These financial concerns affect the performance of the entire organization, including influencing the budget for innovation, providing customer-requested enhancements, customer fixes, and competitive features. Each of these areas has strategic intent and receives a portion of the budget for execution.

Most organizations decide business strategy investments for a year and then review those decisions at year-end to shape the next year’s decisions. This is usually the case even when the organization has a multi-year vision and mission established, especially for managing budget investments.

Agile tactical and operational feedback across service value chains lets organizations modify or shift budget spend quickly, based on data feedback integrated into their business systems, quickly affecting both the top line and the bottom line. Daily monitoring of trends, environmental issues, the success of tactics, and the output of operations across the organization is enabled by organization-wide integrated data, information, and knowledge.

Improving Business Outcomes

In today’s highly competitive environment, decisions must be made quickly to improve the long-term viability of the business. Information superiority is a competitive necessity. Actian DataConnect simplifies data integration across the organization, offering the strategic ability to answer the following questions and more using data-rich metrics instead of relying on expertise alone:

  • Where are we now?
  • Where do we want to be?
  • How do we get there?
  • Was the change effective?
  • How do we measure our progress?
  • Are resources and capabilities being used effectively and efficiently?
  • Are there any constraints?
  • Are the current strategy and tactics effective?
  • Are operations working effectively, efficiently, and economically?

Technology, data integration, people collaboration, and communication go hand in hand. To improve results and overall business outcomes, organizations must work as one, sharing data and information seamlessly to support strategic intent, programs, projects, and overall successful operational decisions. Enterprise data integration improves the business overall, including customer expectations and experiences.

Actian DataConnect provides the technology platform you need to achieve your Enterprise Data Integration objectives. Through a highly scalable hybrid deployment model, robust integration design capabilities, and automated deployment capabilities – DataConnect can help you deliver more effectively and faster than other solutions.  To learn more, visit www.actian.com/data-integration/dataconnect/.

Data Leader

Our 2020 Cloud Data Migration Survey Results Are In

Actian Corporation

November 12, 2020

Best Practices in the Cloud

Why is This Study Different?

OK, so that’s like saying the sky is blue on a clear sunny day, right? Actually, it’s cloudier than that – pun intended! If you talk to cloud vendors, everything is moving to the cloud and has been for over a decade, so the glass is half full. If you talk with legacy platform vendors with a vested interest in remaining on-premises, the cloud will never fully eclipse on-premises, so the glass will always be half empty. Truth be told, they are probably both right: the glass can hold 16 ounces, and it currently contains 8. The problem with this more factual statement bridging the two perspectives is that it oversimplifies the very complex set of journeys most organizations take as they move to the cloud.

Most surveys addressing migration tend to be done by cloud vendors, SIs with cloud practices, or cloud-centric analysts. Their questions tend to be crafted to elicit responses on whether the respondent is moving to the cloud, but do not address how much, what is being moved, or how difficult the move has been. Our experience with customers tells us that there are many different experiences of moving to the cloud – even within a single organization – based on an array of factors. Further, we couldn’t find surveys specific to data migration that are freely available and do not prescribe a specific product or methodology for migration. Our customers were constantly asking us for more than just a reference to a single customer we’ve migrated to the cloud.

Detailed Feedback on Data Migration From Your Peers

To this end, we decided to sponsor annual surveys that use an external third-party firm to conduct phone interviews with an extensive set of questions, tabulate the results, and provide key observations. The battery of questions focuses on data migration, not on applications or underlying infrastructure, which is where the bulk of such surveys seem to focus. The firm recently completed this year’s survey of hundreds of respondents from two equal and distinct groups: IT Data Managers (ITDMs), including Enterprise Architects, Data Engineers, and Database Administrators; and key data decision-makers, including business analysts, line-of-business users, and data scientists. The survey report goes beyond tabulating results for each question and includes excerpts from interviews that highlight points we believe readers will find to be gems of wisdom if they too are on a cloud journey.

There Are Ten Key Topics and Areas Covered

  1. What were the key drivers for you to move your data to the cloud?
  2. Where does data in your organization reside – on-premises, a single cloud, or multiple clouds – and what percentage of that data is in the cloud?
  3. For data you haven’t moved to the cloud, why does it need to remain on-premises?
  4. How much data do most organizations have in their enterprise data warehouse?
  5. How easy or hard was it to migrate to the cloud? Were you prepared to move? Did it go as you expected?
  6. What have you learned? What would you pay closer attention to, resource, or do differently in your next cloud data migration?
  7. What outcomes have you seen, and what impact has this had on your job and services to your stakeholders?
  8. What types of operations and analysis are you using your data for?
  9. What challenges are involved in using your data?
  10. How well are ITDMs and data decision-makers working together?

Top Five Observations

  1. Investments in cloud have a positive impact on the business. Surveyed data decision-makers find that having their data in the cloud provides real-time data access and enables them to get better insights faster.
  2. Even as investments in cloud increase and prove their value, the need for on-premises hasn’t gone away, and often a single cloud environment isn’t possible. Net-net, hybrid landscapes are unavoidable and are in fact necessary for larger organizations.
  3. Given the need to maintain multi-cloud and on-premises environments for both applications and data, the journey to the cloud is proving to be more complex due to several factors highlighted in the report.
  4. ITDMs surveyed have learned important lessons around preparation, resourcing, and training that are described in more detail in the survey.
  5. From a data decision-maker perspective, more understanding and support is needed from ITDMs in order to work together more effectively. This is particularly true when it comes to having secure access to the data they need in a timely manner.
Data Intelligence

Machine Learning Data Catalogs: Good but not Good Enough

Actian Corporation

November 10, 2020

machine-learning-data-catalog

How can you Benefit From a Machine Learning Data Catalog?

You can use Machine Learning Data Catalogs (MLDCs) to interpret data, accelerate the use of data in your organization, and link data to business results.

We have provided real-world examples of the smart features of a data catalog in our previous articles.

In this document, they cite the Actian Data Intelligence Platform Data Catalog as one of the key Machine Learning Data Catalog vendors on the market! However, as data professionals, you are aware that the “intelligent” aspect of a data catalog is a good solution, but not enough for you to achieve your data democratization mission.

Machine Learning Data Catalog vs. Smart Data Catalogs: What’s the Difference?

The term “smart data catalog” has become a buzzword over the past few months. However, when referring to something as “smart,” most people automatically, and understandably, think of a data catalog with only Machine Learning capabilities.

We do not believe that a smart data catalog should be reduced to having only ML features. There are different ways to be “smart”. We like to refer to machine learning as one aspect, among others, of a Smart Data Catalog.

The 5 pillars of a smart data catalog can be found in its:

  • Design: The way users explore the catalog and consume information.
  • User Experience: How it adapts to different user profiles.
  • Inventory: An intelligent and automatic way to inventory data assets.
  • Search Engine: Meets different expectations and gives intelligent suggestions.
  • Metadata Management: Marks up and links data together using ML features.

This conviction is detailed in our article “A Smart Data Catalog, a Must-Have for Data Leaders,” which was also presented at Data Innovation 2020 by Guillaume Bodet.

Actian Life

Actian Acknowledged as One of the Top Workplaces in Austin

Actian Corporation

November 9, 2020

Austin, TX Actian Office

When I moved to Austin 10 years ago from the California Bay Area, little did I know that I would one day be working at a company acknowledged as one of the Top Workplaces in Austin. This accolade was based on nominations and votes from the company’s employees.

Actian Corporation came about as a consolidation of several companies worldwide from 2010 to 2013. In 2016, the company heralded a new leadership team that was very employee-friendly and took various steps that benefited all of us in Austin.

One of the first steps was to treat all employees in the US equally, irrespective of which constituent company they came from. The second was establishing a modern office and workplace that gave everyone a reason to get up and come to work feeling happy. Friendly people made the workplace even more exciting, and the office was a welcome location not just for work, but also for enjoying the cafeteria and socializing. There were areas in the office that greatly helped us collaborate.

When COVID-19 struck earlier this year, we all started to work from home. This was a great challenge for the company in general and the Austin office in particular. At the company level, various teams were formed to ensure that we had a uniform corporate direction globally while following local laws and making decisions locally. All of us missed our workplace. Locally in Austin, we organized regular virtual team get-togethers where we would share work-from-home (WFH) challenges and the support needed. Many employees said that they “felt heard,” and all “appreciated the help that the IT team put in place,” procuring additional monitors to make the WFH experience better. Virtual coffee meets and virtual happy hours on Fridays helped keep the community connected and social. The CEO started weekly all-hands meetings to ensure that everyone knew what was going on. Leadership was on full display, and all employees felt cared for.

As the early steps from 2016 to 2019 made everyone experience a great place to work, the events of 2020 and the support from the company have made this a Top Workplace. Check out our Press Release.

Data Intelligence

What is a Knowledge Graph and How Does it Enhance Data Catalogs?

Actian Corporation

November 4, 2020

Visual of a knowledge graph for data catalogs

We have been interacting with knowledge graphs for quite some time, whether through personalized shopping recommendations on websites such as Amazon and Zalando, or through our favorite search engine, Google.

However, this concept is still often a challenge for most data and analytics managers, who struggle to aggregate and link their business assets in order to take advantage of them the way these web giants do.

To support this claim, Gartner stated in their article “How to Build Knowledge Graphs That Enable AI-Driven Enterprise Applications” that:

“Data and analytics leaders are encountering increased hype around knowledge graphs, but struggle to find meaningful use cases that can secure business buy-in.”

In this article, we will define the concept of a knowledge graph by illustrating it with the example of Google and highlighting how it can empower a data catalog.

What is a Knowledge Graph Exactly?

According to GitHub, a knowledge graph is a type of ontology that depicts knowledge in terms of entities and their relationships in a dynamic and data-driven way, in contrast to static ontologies, which are very hard to maintain.

Here are other definitions of a knowledge graph by various experts: 

  • A “means of storing and using data, which allows people and machines to better tap into the connections in their datasets.” (Datanami)
  • A “database which stores information in a graphical format – and, importantly, can be used to generate a graphical representation of the relationships between any of its data points.” (Forbes)
  • “Encyclopedias of the Semantic World.” (Forbes)

Through machine learning algorithms, a knowledge graph provides structure for all your data and enables the creation of multilateral relations across your data sources. This structure grows more fluid as new data is introduced, allowing more relations to be created and more context to be added, which helps your data teams make informed decisions based on connections you may never have found otherwise.

The idea of a knowledge graph is to build a network of objects, and more importantly, create semantic or functional relationships between the different assets. 

Within a data catalog, a knowledge graph is therefore what represents different concepts and what links objects together through semantic or static links.
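The “network of objects with typed links” idea can be sketched as a tiny triple store. The asset names and relation types below are invented purely for illustration:

```python
class KnowledgeGraph:
    """A tiny knowledge graph: nodes are assets, edges carry a relation type."""
    def __init__(self):
        self.edges = []  # list of (source, relation, target) triples

    def relate(self, source, relation, target):
        """Record a typed link between two assets."""
        self.edges.append((source, relation, target))

    def neighbors(self, node, relation=None):
        """Return assets linked from `node`, optionally filtered by relation type."""
        return [t for (s, r, t) in self.edges
                if s == node and (relation is None or r == relation)]
```

Traversing such typed links is what lets a catalog answer questions like “which dashboards does this table feed?” rather than just “which objects mention this keyword?”.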

Google Example:

Google’s algorithm uses this system to gather and provide end users with information relevant to their queries.

Google’s knowledge graph contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects.

Their knowledge graph enhances Google Search in three main ways:

  • Find the right thing: Search not only based on keywords but on their meanings.
  • Get the best summary: Collect the most relevant information from various sources based on the intent.
  • Go deeper and broader: Discover more than you expected thanks to relevant suggestions.

How do Knowledge Graphs Empower Data Catalog Usages?

Within a data catalog, knowledge graphs can benefit your enterprise’s data strategy through:

Rich and In-Depth Search Results

Today, many search engines use multiple knowledge graphs in order to go beyond basic keyword-based searching. Knowledge graphs allow these search engines to understand concepts, entities and the relationships between them. Benefits include:

  • The ability to provide deeper and more relevant results, including facts and relationships, rather than just documents.
  • The ability to form searches as questions or sentences — rather than a list of words.
  • The ability to understand complex searches that refer to knowledge found in multiple items using the relationships defined in the graph.

Optimized Data Discovery

Enterprise data moves from one location to another at the speed of light and is stored in various data sources and storage applications. Employees and partners access this data from anywhere, at any time, so identifying, locating, and classifying your data in order to protect it and gain insights from it should be a priority.

The benefits of knowledge graphs for data discovery include:

  • A better understanding of enterprise data, where it is, who can access it and where, and how it will be transmitted.
  • Automatic data classification based on context.
  • Risk management and regulatory compliance.
  • Complete data visibility.
  • Identification, classification, and tracking of sensitive data.
  • The ability to apply protective controls to data in real time based on predefined policies and contextual factors.
  • The ability to adequately assess the full data picture.

On one hand, this helps implement the appropriate security measures to prevent the loss of sensitive data and avoid devastating financial and reputational consequences for the enterprise. On the other, it enables teams to dig deeper into the data’s context to identify the specific items that reveal answers to their questions.

Smart Recommendations

As mentioned in the introduction, recommendation services are now a familiar component of many online stores, personal assistants and digital platforms.

The recommendations need to take a content-based approach. Within a data catalog, machine learning capabilities combined with a knowledge graph can detect certain types of data, apply tags, or apply statistical rules to data in order to produce effective and smart asset suggestions.

This capability is also known as data pattern recognition: identifying similar assets by relying on statistical algorithms and ML capabilities derived from other pattern recognition systems.

This data pattern recognition system helps data stewards maintain their metadata management:

  • Identify duplicates and copy metadata
  • Detect logical data types (emails, city, addresses, and so on)
  • Suggest attribute values (recognize documentation patterns to apply to a similar object or a new one)
  • Suggest links – semantic or lineage links
  • Detect potential errors to help improve the catalog’s quality and relevance
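A naive version of the logical-type detection mentioned above could be rule-based. Real catalogs combine this with statistical and ML approaches, and the patterns below are deliberately simplified:

```python
import re

# Illustrative patterns only; a production catalog would use far more
# robust detection than a pair of regular expressions.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "zip_code": re.compile(r"^\d{5}(-\d{4})?$"),
}

def detect_logical_type(values):
    """Guess the logical type of a column by matching all sampled values."""
    for name, pattern in PATTERNS.items():
        if all(pattern.match(v) for v in values):
            return name
    return "unknown"
```

A detected type can then drive the suggestions above: tagging the column, proposing documentation from a similar object, or flagging values that break the pattern as potential errors.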

The idea is to use techniques derived from content-based recommendations found in general-purpose catalogs. When a user has found something, the catalog suggests alternatives based both on their profile and on pattern recognition.

Some Data Catalog Use Cases Empowered by Knowledge Graphs

  • Gathering assets that have been used or related to causes of failure in digital projects.
  • Finding assets with particular interests aligned with new products for the marketing department.
  • Generating complete 360° views of people and companies in the sales department.
  • Matching enterprise needs to people and projects for HR.
  • Finding regulations relating to specific contracts and investments assets in the finance department.

Conclusion

With the never-ending increase of data in enterprises, organizing your information without a strategy means not being able to stay competitive and relevant in the digital age. Ensuring that your data catalog has an enterprise knowledge graph is essential for avoiding the dreaded “black box” effect.

Through a knowledge graph in combination with AI and machine learning algorithms, your data will have more context and will enable you to not only discover deeper and more subtle patterns but also to make smarter decisions.

For more insights on knowledge graphs, here is a great article by BARC analyst Timm Grosser: “Linked Data for Analytics?”

Start Your Data Catalog Journey

The Actian Data Intelligence Platform is a 100% cloud-based solution, available anywhere in the world in just a few clicks. By choosing the Actian Data Intelligence Platform Data Catalog, you control the costs associated with implementing and maintaining a data catalog while simplifying access for your teams.

Its automatic feeding mechanisms, as well as its suggestion and correction algorithms, reduce the overall costs of a catalog and guarantee your data teams quality information in record time.


About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Actian data intelligence solutions help streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data division of HCLSoftware, at actian.com.
Data Platform

Why Actian Data Platform is 8X Faster Than Snowflake

Emma McGrattan

October 28, 2020


GigaOm published a comprehensive evaluation of leading cloud data warehouse services based on performance and cost. Offerings analyzed included Snowflake, Amazon Redshift, Microsoft Azure Synapse, Google BigQuery, and our very own Actian Data Platform.

There are many intriguing results included in the report, but an indisputable conclusion was reached by GigaOm: “In a representative set of corporate-complex queries from TPC-H standard, Actian Data Platform consistently outperformed the competition.”

To put “outperformed” in more concrete terms, Actian Data Platform was 8.5X faster than Snowflake in a test of 5 concurrent users. In terms of price performance, the advantage over Snowflake was 6.4X.

[GigaOm price-performance chart]

Actian Data Platform Was Built for Performance Out-of-the-Box

So what’s the secret behind Actian Data Platform’s superior performance? There is a simple, fundamental explanation. Actian Data Platform was built from the ground up to deliver unrivaled performance on commodity infrastructure. Its original design goal was to make the most of every CPU clock cycle, every byte of memory, and every I/O operation. And ensuring high performance continues to be a priority for us. Our efficient design is the reason why, even with the limitless resources of the cloud, you won’t see your costs ballooning as you increase concurrent users or data volume.

How specifically does Actian Data Platform deliver best-in-class analytics performance without the need for tuning? It is a combination of the following eight factors. Vendors such as Snowflake may offer a few of these capabilities, Vector processing among them, but the unique combination creates a powerful compounding effect:

  1. Optimizing the Use of Microprocessor Cores to run multiple data operations in parallel during a single CPU cycle. This is known as Vector processing. Traditional scalar architectures typically consume many more CPU cycles to compute the same calculations, which impacts overall throughput.
  2. Taking Full Advantage of Multi-Core CPUs – Actian Data Platform can perform Vector processing across all available cores, which maximizes concurrency, parallelism, and system resource utilization.
  3. Processing Data Using the CPU’s On-Chip Cache is faster and closer to where operations are performed and therefore optimizes performance. Our competitors tend to use DRAM for query execution cache, which is far slower.
  4. Using Advanced Compression – Typically With a 5:1 Compression Ratio – Actian Data Platform’s compression algorithms are designed for maximum efficiency, particularly in decompression, yet still deliver about a 5:1 compression ratio. Compression is handled automatically, so no tuning is required.
  5. Optimizing I/O – Actian Data Platform is a pure columnar implementation. The data lives its life in columnar format, which results in I/O efficiency.
  6. Using Patented Technology to Maintain Indexes Automatically so that an indexing strategy is not necessary.
  7. MPP Architecture parallelizes query execution within and across nodes to power through business workloads regardless of size and complexity.
  8. Real-Time Updates that enable operational insights with zero performance penalty, powered by Actian Data Platform’s patented positional delta trees.
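The performance gap between scalar and vectorized execution described in factors 1 and 2 is easy to demonstrate. The sketch below is an illustration of the general principle, not Actian's implementation: it aggregates the same two columns once with a row-at-a-time Python loop and once with a vectorized columnar operation that dispatches to optimized, SIMD-capable native loops.

```python
import time
import numpy as np

# Two columns of a toy fact table (columnar layout: one array per column).
n = 1_000_000
price = np.random.rand(n)
qty = np.random.randint(1, 10, n)

# Scalar, row-at-a-time aggregation: one iteration per row.
t0 = time.perf_counter()
total_scalar = 0.0
for i in range(n):
    total_scalar += price[i] * qty[i]
scalar_time = time.perf_counter() - t0

# Vectorized columnar aggregation: one call processes whole columns at once.
t0 = time.perf_counter()
total_vector = float(np.dot(price, qty))
vector_time = time.perf_counter() - t0

print(f"scalar {scalar_time:.3f}s vs vectorized {vector_time:.4f}s")
```

On typical hardware the vectorized version is orders of magnitude faster for the same result, which is the compounding effect the list above describes: columnar storage makes whole-column operations possible, and vector processing makes them cheap.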

All this adds up to blazing-fast analytics performance that enables faster iterations on data models, quicker root cause analysis, and, ultimately, a data-driven organization.

When Cost is More Important Than Performance

If you’re satisfied with the current performance of your data warehouse, Actian Data Platform can deliver that same level of performance while enabling you to dial back your spend on compute resources—which can immediately translate into considerable cost savings. Conversely, if you can benefit from increased performance, Actian Data Platform can give you much more at a much lower cost. In other words, you have two levers to play with – price and performance – and Actian Data Platform enables you to achieve the lowest cost of ownership for the level of performance your business demands.


About Emma McGrattan

Emma McGrattan is CTO at Actian, leading global R&D in high-performance analytics, data management, and integration. With over two decades at Actian, Emma holds multiple patents in data technologies and has been instrumental in driving innovation for mission-critical applications. She is a recognized authority, frequently speaking at industry conferences like Strata Data, and she's published technical papers on modern analytics. In her Actian blog posts, Emma tackles performance optimization, hybrid cloud architectures, and advanced analytics strategies. Explore her top articles to unlock data-driven success.