Where You Do Analytics Processing Matters
Actian Vector was renamed to Actian Analytics Engine in 2026.
Actian's Vector for Hadoop solution delivers higher analytic query performance without a corresponding increase in cost. If you are looking for high-performance analytics processing to drive operational decision-making, where you do that processing matters. By minimizing data movement and processing data locally, you can dramatically reduce latency. By using a system such as Actian Analytics Engine to perform that local processing, you can reach even higher levels of performance.
On the Box, in the Datacenter or Across the Country
When people hear the statement “where you do your processing matters,” the first thing that comes to mind is network latency. It’s easy to understand how transmitting data over the internet, across the country, or even across town can slow down your processing. The same holds true within your data center. Co-locating storage and compute (on the same rack or even the same device) decreases processing latency.
Many companies are leveraging cloud services and distributed systems to increase performance for end-user OLTP operations. When it comes time to perform analytics, the distance issue comes into play again. Where should you be doing your analytics processing? For most companies, the cloud is the right place to host your data warehouse and perform analytics compute because it enables you to locate your analytics closer to your data stores and, at the same time, leverage cloud-scale compute resources.
Assuming you’ve addressed these “big distance” issues, is it possible to optimize further? Yes, it is. If your goal is big data processing or real-time analytics that drive operations and decision-making, you need to take your analytics performance to the next level and look at how the databases and software you use can be optimized to take maximum advantage of the available resource capacity.
Disk is Slow. Memory is Better. Chip Cache is the Fastest
Let’s take a look at what happens within an analytics system (the hardware and software you use). These systems are typically composed of three hardware components that have a direct influence on performance: disks, memory, and chip cache. When you perform compute operations (which are really just a bunch of mathematical formulas), you are manipulating data that is stored in one of these three places. Chips have some internal cache memory, which offers the fastest performance but the smallest capacity. RAM has more capacity (though it is still limited) and is fairly fast because data is held temporarily rather than written to a physical medium, but it is much slower than chip cache. Disk storage is the slowest because data is written to a physical medium (a disk) and read back from it whenever it needs to be accessed. With cloud storage, the disk capacity available is nearly unlimited.
Data warehouse and analytics systems use each of these storage types, along with the compute capacity of the CPUs, in different ways. This is what gives Actian Analytics Engine a performance advantage over other solutions. Analytics Engine optimizes its use of each layer of the system infrastructure, eliminating wasted capacity to maximize performance and minimize costs. Here are a couple of examples:
Maximize Utilization of CPU Cores
Modern CPUs have multiple cores, which means they can execute several operations at the same time. Unfortunately, most programs (including data warehouse systems) are not designed to take advantage of this parallel processing capability, and as a result only a small portion of the available capacity ends up being used. Actian Data Platform and Actian Analytics Engine are designed to efficiently run a large number of concurrent queries requested by a large number of users. Queries are split into small pieces that can be executed in parallel. This matters because it maximizes the use of the CPU capacity you have. CPU cycles are a time-based capacity. Think of them as the hours in the day you have for your work tasks. The challenge is to use your available capacity as efficiently as possible and avoid idle time, because once that time has passed, you can never get it back.
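To make the idea of splitting a query into small pieces concrete, here is a minimal Python sketch of an average computed chunk by chunk and then merged. It is an illustration only, not how Actian implements query execution: the function names, the chunk size, and the sample data are all hypothetical. It also uses a thread pool purely to show the split-and-merge structure; in standard Python, CPU-bound work needs processes or native code to run on several cores at once.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Split a column of values into consecutive fixed-size chunks."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_aggregate(chunk):
    """Compute a partial (sum, count) for one chunk; each call is independent."""
    return sum(chunk), len(chunk)


def parallel_average(values, chunk_size=1000, workers=4):
    """Average a column by aggregating chunks concurrently, then merging."""
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_aggregate, chunks))
    # Merge step: combine the per-chunk partial results into one answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None


sales = list(range(1, 10001))  # made-up column of 10,000 values
print(parallel_average(sales))  # prints 5000.5
```

Because each chunk is aggregated independently, no worker has to wait for another until the final merge, which is what keeps the available cores busy.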
Reducing the Amount of Data That is Written to and Read from Disks
Actian solutions are designed for highly efficient use of disks, reducing the I/O operations that can slow down analytics processing. Actian Data Platform is a pure columnar database. Traditional databases are row-based: records are stored in rows, and you have to read the entire row to perform a query and do analytics. Actian treats data as a series of columns, and this is what optimizes it for analytics processing. Because a column of data is all the same data type, analytics operations can be optimized. Going under the hood, you’ll find that each column is stored as files on disk, divided into blocks of data. MinMax indexes on data blocks enable faster filtering of data by helping the platform more efficiently identify which data the user is trying to analyze and which can be ignored.
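The block-level MinMax idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Actian's storage format: the `Block` class, the function names, the block size, and the sample values are all hypothetical. Each block of a column keeps its smallest and largest value, so a range query can skip any block whose range cannot contain a match.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One block of a single column, with its min and max kept as a tiny index."""
    values: list
    min_value: int
    max_value: int


def build_blocks(column, block_size):
    """Split a column into blocks and record each block's min and max."""
    blocks = []
    for i in range(0, len(column), block_size):
        chunk = column[i:i + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks, low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block in blocks:
        # Skip the block if its value range cannot overlap the query range.
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


order_amounts = [5, 8, 12, 15, 40, 42, 47, 49, 90, 95, 97, 99]  # made-up column
blocks = build_blocks(order_amounts, block_size=4)
matches, blocks_read = scan_range(blocks, 41, 48)
print(matches, blocks_read, len(blocks))  # prints [42, 47] 1 3
```

In this example only one of the three blocks is read. The benefit depends on how the data is laid out: when values are clustered, as they are here, most blocks can be ruled out from their min and max alone.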
When you are doing operational analytics and trying to drive real-time decision-making with data, you need the best performance you can get. By performing more operations in chip cache and memory and by managing the data stored on disk more efficiently, Actian can optimize the performance and utilization of database hardware while at the same time minimizing the amount of data written to disk. Both of these are important because they translate directly to lower operating costs. What it comes down to is “use the resources you have more efficiently” to achieve peak performance and minimize costs.