Blog | Data Management | 3 min read

Bloor Spotlight Highlights How Actian’s Ingres NeXt Avoids Pitfalls


Digital transformation requires the use of the latest technologies. However, as you probably already know, modernizing a mission-critical database and the applications that interact with it can be risky and expensive, often turning into a long, disruptive journey. But I have good news! According to a recent Bloor Spotlight report, Actian’s Ingres NeXt strategy for modernizing Ingres and OpenROAD applications either avoids or proactively addresses these potential pain points.

Bloor Senior Analyst Daniel Howard comments:

“Ingres NeXt is worth paying attention to because it acknowledges both the massive need and desire for digital transformation and modernization as well as the difficulties and shortcomings of conventional approaches to them, then takes steps to provide the former while mitigating the latter.”

Let’s look at the top four obstacles that stand in the way of modernization:

It’s Risky

Fewer than half of modernization projects succeed. Complex dependencies among databases, applications, operating systems, hardware, data sources, and other structures increase the likelihood that something will go wrong. In addition, organizations are likely to make poor decisions at some point, since there are few modernization best practices to guide the way.

It’s Expensive

Modernization typically requires Capital Expenditure (CapEx) justification. Although modernization can potentially save money and increase revenue in the long run, it can be difficult to prove that these gains will significantly outweigh the costs of maintaining your legacy systems over time. It can also be challenging to get a modernization initiative approved as part of an innovation budget, since innovation budgets are often quite small: according to Deloitte’s analysis, the average IT department spends more than half of its technology budget on maintaining business operations and only 19% on building innovative new capabilities.

It’s a Long Journey

Modernization can involve replacing thousands of hours’ worth of custom-developed business logic. Code may be stable, but it is perceived as brittle if it cannot be changed without great pain. Missing documentation, third-party applications, and libraries that are often no longer available can add time and complexity to a modernization project. Plus, many developers are simply unaware of conversion tools for updating “green screen” ABF applications and creating web and mobile versions.

It’s Disruptive

Mission-critical databases and applications require near-100% availability, so modernization demands careful planning and execution. Plus, technical staff and business users will need to be retrained and upskilled to make the most of new technologies.

How Exactly Does Ingres NeXt Avoid or Address These Pain Points?

The report discusses how automated migration utilities, asset reuse, and a high degree of flexibility and customization—among other things—result in a solution that can streamline your organization’s path to a modern data infrastructure.


Blog | Data Intelligence | 4 min read

Data Sampling: Create Subsets for a More Fluid Data Analysis


Your data culture is growing! But as the amount of data at your disposal explodes, you may find it difficult to handle these colossal volumes of information. At that point, you will have to work from a sample that is as representative as possible. This is where Data Sampling comes in.

As the range of your data expands and your data assets grow more massive, you may one day face a volume of data so large that your queries can no longer complete. The reason: insufficient memory and processing power. It is a paradox, given that all your efforts until now have gone into collecting voluminous data well.

But don’t be discouraged: this is where Data Sampling earns its keep. Data Sampling is a statistical analysis technique used to select, manipulate, and analyze a representative subset of data points, allowing you to identify patterns and trends in the larger data set.
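As a first, minimal sketch of the idea, here is how you might draw a representative subset in plain Python (the population here is simulated purely for illustration):

```python
import random

# Simulated population: one million transaction amounts.
population = [random.gauss(100, 20) for _ in range(1_000_000)]

# Draw a representative subset instead of processing the full volume.
sample = random.sample(population, k=10_000)

# The sample mean approximates the population mean at a fraction of the cost.
print(sum(sample) / len(sample))
```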

Data Sampling: How it Works

Data Sampling enables data scientists, predictive modelers, and other data analysts to work with a small, manageable subset of a larger statistical population.

The goal: to build and run analytical models faster while producing accurate results. The principle: refocus analyses on a smaller sample to be more agile, fast, and efficient in processing queries.

The subtlety of data sampling lies in the representativeness of the sample: it is essential to apply the method best suited to reducing the volume of data under analysis without degrading the relevance of the results.

Sampling lets you derive information from the statistics of a subset of the population without having to examine every individual. Because it works on subsets rather than the entire volume of available data, Data Sampling saves you valuable time, and this time saving translates into cost savings and, therefore, a faster ROI.

Finally, thanks to Data Sampling, your data projects become more agile, and you can consider analyzing your data more frequently.

The Different Methods of Data Sampling

The first step in the sampling process is to clearly define the target population. There are two main types of sampling: probability sampling and non-probability sampling.

Probability sampling is based on the principle that every element of the data population has a known, nonzero chance of being selected, which yields a high degree of representativeness. On the other hand, data scientists can opt for non-probability sampling, in which some data points have a better chance of being included in the sample than others. Within these two main families, there are several types of sampling.

Simple random sampling is among the most common probability techniques: each individual is chosen at random, and each member of the population has an equal chance of being selected.
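A minimal sketch of simple random sampling with pandas (the data set is simulated for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)
df = pd.DataFrame({
    "customer_id": np.arange(100_000),
    "spend": rng.gamma(shape=2.0, scale=50.0, size=100_000),
})

# Simple random sampling: every row has an equal chance of being selected.
simple_sample = df.sample(n=1_000, random_state=42)
```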

With systematic sampling, on the other hand, the first individual is selected at random and the rest are selected at a fixed sampling interval: the sample is built by stepping through the larger data population at that interval.
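A sketch of systematic sampling, reusing the simulated df from the previous example:

```python
# Systematic sampling: random start, then a fixed interval through the data.
interval = len(df) // 1_000           # fixed sampling interval
start = rng.integers(0, interval)     # first individual chosen at random
systematic_sample = df.iloc[start::interval]
```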

Stratified sampling consists of dividing the elements of the data population into subgroups (called strata) whose members share similarities or common factors. The major advantage of this method is its precision with respect to the object of study.
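A sketch of stratified sampling, assuming a hypothetical region column as the stratum:

```python
# Stratified sampling: draw from each stratum so every subgroup is represented.
df["region"] = rng.choice(["north", "south", "east", "west"], size=len(df))
stratified_sample = df.groupby("region").sample(frac=0.01, random_state=42)
```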

Finally, cluster sampling divides a large data set into groups or sections according to a determining factor, such as a geographic indicator, and then samples whole groups at a time.
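And a sketch of cluster sampling over the same hypothetical region column, keeping every row of the chosen clusters:

```python
# Cluster sampling: randomly select whole groups, then keep them in full.
chosen_clusters = rng.choice(df["region"].unique(), size=2, replace=False)
cluster_sample = df[df["region"].isin(chosen_clusters)]
```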

In all cases, whether you choose probabilistic or non-probabilistic methods, keep in mind that to achieve its full potential, data sampling must be based on sufficiently large samples! The larger the sample size, the more accurate your inference about the population will be: the standard error of a sample estimate shrinks in proportion to 1/√n, as the short sketch below demonstrates.
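A quick demonstration of that effect, again on simulated data: the estimate’s error shrinks as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
population = rng.normal(loc=100.0, scale=20.0, size=1_000_000)

# Larger samples give estimates closer to the true population mean.
for n in (100, 1_000, 10_000, 100_000):
    sample = rng.choice(population, size=n, replace=False)
    print(f"n={n:>7}: error = {abs(sample.mean() - population.mean()):.3f}")
```

So, ready to get started?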


Blog | Databases | 3 min read

Vector 6.2 Delivers Even Faster and More Secure Analytics


Actian Vector is a vectorized columnar analytics database designed to deliver extreme performance with a high level of security. But who wouldn’t want analytics that are even faster and more secure? For those who do, the new Vector 6.2 release, available November 10, 2021, is a winner, continuing to push the limits of what is possible in operational data warehousing. Here’s a summary of just a few of Vector’s most important new features.

Vector 6.2 is Faster

Vector has long been the industry’s fastest analytics database. It’s designed for speed and efficiency using column-based storage and vector processing. With the incorporation of query result caching and queue-based workload management in the 6.2 release, Vector just got another performance boost.

Query Result Caching

If you’ve previously run a query and the data hasn’t changed since the last run, Vector doesn’t need to run the full query again. Instead, it leverages query result caching to retrieve the previous result instantaneously from the cache. This substantially reduces the time and system resources required to obtain the insights you seek.
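Conceptually, result caching keys a stored result on the query and on a version of the underlying data, so a hit is only possible when nothing has changed. A toy Python sketch of the idea (this illustrates the general technique, not Vector’s internal implementation):

```python
import hashlib

class ResultCache:
    """Toy query-result cache: reuse a prior result while the data is unchanged."""

    def __init__(self):
        self._cache = {}  # (query hash, data version) -> cached result

    def run(self, query: str, data_version: int, execute):
        key = (hashlib.sha256(query.encode()).hexdigest(), data_version)
        if key not in self._cache:        # miss: run the full query once
            self._cache[key] = execute(query)
        return self._cache[key]           # hit: return the stored result
```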

Queue-Based Workload Management

Queue-based workload management dynamically adjusts workload queues based on available resources and resource quotas (a conceptual sketch follows the list below). Key benefits include:

  • Prevents system overload and workload starvation by limiting resource usage per the database administrator’s configuration.
  • Allows system administrators to quickly run high-priority reports even on a fully loaded system.
  • Enables the user to have very small queries answered independently of system load.
  • Manages usage by prioritizing resources appropriately across people, groups, and applications, even enabling specific profiles to run at different priorities during specified time slots.
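A toy sketch of the queue-based idea, using a generic priority queue with a concurrency quota (not Vector’s actual workload manager):

```python
import heapq

class WorkloadQueue:
    """Toy workload queue: high-priority queries run first, and a
    concurrency quota keeps the system from being overloaded."""

    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self.running = 0
        self._pending = []   # min-heap of (priority, seq, query)
        self._seq = 0

    def submit(self, query: str, priority: int = 10):
        # Lower number = higher priority (e.g. 0 for an urgent admin report).
        heapq.heappush(self._pending, (priority, self._seq, query))
        self._seq += 1

    def admit_next(self):
        # Admit the highest-priority pending query only if the quota allows.
        if self._pending and self.running < self.max_concurrent:
            self.running += 1
            return heapq.heappop(self._pending)[2]
        return None

    def finish(self):
        self.running -= 1
```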

Vector 6.2 is More Secure

Re-Keying Encryption

Previous releases of Vector have enabled organizations to maintain tight security for sensitive data, with support for encryption at rest and in transit as well as dynamic data masking of fields containing personally identifiable information and other sensitive data. Vector 6.2 extends security further, providing the ability to re-key the database with new encryption keys, as recommended by NIST guidelines. This feature is valuable because it limits the amount of time a bad actor can use a stolen key to access your data.
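One common way to make re-keying cheap is envelope encryption: the data is encrypted under a data key, which is itself wrapped by a master key, so rotating the master key only requires re-wrapping the small data key. A toy sketch of that general pattern using Python’s cryptography package (an illustration of the practice NIST recommends, not Vector’s internal mechanism):

```python
from cryptography.fernet import Fernet

# Envelope encryption: a data key encrypts the data; a master key wraps it.
data_key = Fernet.generate_key()
old_master = Fernet(Fernet.generate_key())
wrapped_key = old_master.encrypt(data_key)

ciphertext = Fernet(data_key).encrypt(b"sensitive row data")

# Re-keying: unwrap with the old master key, re-wrap with a new one, and
# discard the old wrapped copy. A stolen old master key stops being useful
# once rotation completes and the old material is destroyed.
new_master = Fernet(Fernet.generate_key())
wrapped_key = new_master.encrypt(old_master.decrypt(wrapped_key))

# Reads still work: unwrap the data key, then decrypt the data.
assert Fernet(new_master.decrypt(wrapped_key)).decrypt(ciphertext) == b"sensitive row data"
```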

Secure User-Defined Functions

This new release enhances security by executing Python, JavaScript, and NumPy UDFs within a container that is sandboxed from the rest of the database. The NumPy UDF support is new and enables vectorized execution over numeric data.
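To illustrate what vectorized execution of a numeric UDF means in practice, here is a generic NumPy example (not Vector’s actual UDF registration API):

```python
import numpy as np

def discount_udf(prices: np.ndarray, rate: float) -> np.ndarray:
    """A NumPy-style UDF: operates on whole column vectors at once
    rather than looping row by row."""
    return np.where(prices > 100.0, prices * (1.0 - rate), prices)

prices = np.array([50.0, 120.0, 300.0, 99.0])
print(discount_udf(prices, rate=0.10))  # -> [ 50. 108. 270.  99.]
```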

Learn More About Vector

Visit our website to learn more about the Vector analytics database’s extensive performance optimizations, features, and use cases.