Vector 6 Enables Machine Learning, Semi-Structured Data Support, and Enhanced Workload Management Capabilities.
Key New Features
User-Defined Functions
User-defined functions (UDFs) let the user extend the database to perform operations that are not available through the built-in, system-defined functions provided by Vector. VectorH 6 provides the capability to create scalar UDFs, which return a single value per invocation. This means that you can run JavaScript or Python code alongside your SQL statement in a single query.
Vector supports three programming languages for UDFs: SQL, JavaScript, and Python 3.6. An additional use case for UDFs in Vector is for the deployment of machine learning (ML) models that run alongside the database. By deploying machine learning models alongside the Vector database, data movement is reduced, thus allowing for faster scoring of data.
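To make the ML-deployment use case concrete, here is a minimal sketch of what the body of a scalar scoring UDF might look like in Python. The function name, inputs, and model coefficients are all hypothetical illustrations, not Vector's actual UDF registration syntax; the point is that the scoring logic runs next to the data, one value per row.

```python
import math

def churn_score(tenure_months, monthly_spend):
    """Hypothetical scalar UDF body: score one row with a pre-trained
    logistic-regression model. Coefficients are hard-coded for
    illustration; a real deployment would load a trained model."""
    # Linear combination of the inputs with illustrative weights.
    z = -1.5 + 0.02 * tenure_months - 0.01 * monthly_spend
    # The logistic function maps the raw score into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))
```

Because the function returns exactly one value per input row, it can be invoked inline in a SQL projection just like a built-in scalar function.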
Workload Management
Workload management in database management systems (DBMSs) is the practice of monitoring and managing the work (i.e., database transactions) executing within a database system. Controlling the work in a database enables efficient use of system resources and helps maintain performance objectives.
JSON Support
JSON (JavaScript Object Notation) is a popular semi-structured data format that is used for exchanging data in web and mobile applications. JSON is a flexible, easy-to-read, and lightweight format for sharing data between applications.
JSON functions in Vector enable you to combine NoSQL and relational concepts in the same database. Now you can combine classic relational columns with columns that contain documents formatted as JSON text in the same table, and parse and import JSON documents into relational structures.
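The core of "parse and import JSON documents into relational structures" is extracting named fields from each document and emitting a flat row. This small Python sketch shows the idea; the function name and field list are invented for illustration and do not reflect Vector's specific JSON function names.

```python
import json

def json_to_row(doc_text, keys):
    """Sketch of JSON-to-relational import: pull named fields out of
    a JSON document and emit one flat tuple, which can then sit
    alongside classic relational columns in the same table."""
    doc = json.loads(doc_text)
    # Missing fields become NULL-like None values, as in SQL.
    return tuple(doc.get(k) for k in keys)

row = json_to_row('{"id": 7, "name": "Ada", "tags": ["a", "b"]}',
                  ["id", "name"])
```

Fields not promoted to columns (such as `tags` above) can remain in the original JSON text column for later querying.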
Platform Support
Actian VectorH 6 continues to support RHEL, CentOS, SUSE Linux, Ubuntu, Cloudera, and Hortonworks. VectorH 6 is available today from esd.actian.com.
Learn More
Features

Vectorized Query Execution
Exploits Single Instruction, Multiple Data (SIMD) support in x86 CPUs
Processes hundreds or thousands of elements without the overhead traditional databases have
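The benefit of vector-at-a-time execution can be illustrated conceptually in plain Python (this is not actual SIMD code; real SIMD applies one CPU instruction to multiple data elements in hardware). The sketch contrasts tuple-at-a-time processing, which pays interpretation overhead per element, with block-wise processing that amortizes that overhead over a whole vector of values.

```python
def scalar_sum(values):
    """Tuple-at-a-time: per-element overhead on every iteration."""
    total = 0
    for v in values:
        total += v
    return total

def vectorized_sum(values, block=1024):
    """Vector-at-a-time: amortize overhead by handling a whole block
    of values per primitive operation (here, the built-in sum)."""
    total = 0
    for i in range(0, len(values), block):
        total += sum(values[i:i + block])
    return total
```

Both produce the same result; the second shape is what lets a columnar engine hand contiguous blocks of values to SIMD-capable CPU instructions.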

Maximizing CPU cache for execution
Uses private per-core CPU caches as execution memory – up to 100x faster than RAM
Delivers significantly greater throughput without limitations of in-memory approaches

Other CPU Optimizations
Supports hardware-accelerated string-based operations, benefiting selections on strings using wild card matching, aggregations on string-based values, and joins or sorts using string keys

Column-Based Storage
Reduces I/O to relevant columns
Opportunity for better data compression
Built-in storage indexes maximize efficiency

Data Compression
Multiple encodings to maximize compression: Run Length Encoding (RLE), Patched Frame of Reference (PFOR), Delta encoding on top of PFOR, Dictionary encoding, and LZ4 for string values
4-6x compression ratios common for real-world data
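Run Length Encoding, the first scheme listed above, is simple enough to sketch directly: instead of storing each value, store (value, run length) pairs. This toy Python version shows why RLE works so well on sorted or low-cardinality columns; it is a conceptual illustration, not Vector's implementation.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((v, 1))              # start a new run
    return runs

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the full column."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out
```

A column of 5 "US" rows followed by 3 "DE" rows compresses to just two pairs, which is where the 4-6x real-world ratios come from on repetitive data.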

Positional Delta Trees (PDTs)
Full ACID compliance with multi-version read consistency
Changes always written persistently to a transaction log before a commit completes to ensure full recoverability
High-performance in-memory Positional Delta Trees (PDTs) handle incremental small inserts, updates and deletes without impacting query performance
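The essence of the PDT approach can be sketched as merging an immutable base column with in-memory deltas at scan time, so small updates never force a rewrite of compressed on-disk blocks. The sketch below is a heavy simplification (real PDTs are tree-structured and also handle inserts with positional shifting); it only illustrates the read-path merge for updates and deletes.

```python
def scan_with_deltas(base, deltas):
    """Conceptual PDT-style scan: merge the immutable base column
    with in-memory positional deltas while reading.
    `deltas` maps position -> new value, or None for a delete."""
    out = []
    for pos, value in enumerate(base):
        if pos in deltas:
            if deltas[pos] is None:
                continue              # row deleted, skip it
            value = deltas[pos]       # row updated in memory
        out.append(value)
    return out
```

Because the base data is untouched, queries keep their full scan speed while the deltas stay small and are periodically merged into storage.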

Easy data migration
Move a database to a cloud or remote datacenter in one step using the integrated “clonedb” function (two steps if you include installing Vector on the remote server)

Storage Indexes
Automatic min-max indices enable block skipping on reads
Eliminates need for explicit data partitioning strategy
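Min-max block skipping is easy to illustrate: keep one (min, max) summary per storage block, and skip any block whose range cannot overlap the query predicate. This Python sketch shows the mechanism under that assumption; it is not Vector's internal index format.

```python
def build_minmax(blocks):
    """One (min, max) summary per storage block."""
    return [(min(b), max(b)) for b in blocks]

def scan_range(blocks, index, lo, hi):
    """Read only blocks whose [min, max] can overlap [lo, hi]."""
    hits = []
    for block, (bmin, bmax) in zip(blocks, index):
        if bmax < lo or bmin > hi:
            continue  # whole block skipped without reading it
        hits.extend(v for v in block if lo <= v <= hi)
    return hits
```

If data arrives in roughly sorted order (e.g., by timestamp), most blocks are skipped outright, which is why no explicit partitioning scheme is needed.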

Parallel Execution
Flexible adaptive parallel execution algorithms to maximize concurrency while enabling load prioritization

Flexible Deployment
Available for both on-premises and cloud deployment, including both AWS Marketplace and MS Azure

Security
Role-based security
Authentication through LDAP or Active Directory

Manageability
YARN for automated Hadoop cluster resource management
Web-based management console for monitoring analytic/query processing

Spark Powered Direct Query Access
Directly access Hadoop data files stored in Parquet, ORC, or other standard formats
Realize performance benefit without converting to Vector file format first

Native Spark DataFrame Support
Direct connection to Spark functionality via DataFrames
VectorH can accelerate query performance for Spark SQL and Spark R applications

Scale-out Hadoop Performance
Linear scalability from small to large Hadoop clusters
Supported on popular Hadoop distributions from Hortonworks, Cloudera, MapR and Apache

Zero-Penalty Real-Time Data Updates
Enables full create/read/update/delete capabilities on Hadoop
Tracks changes in memory and avoids any performance penalty for updates

Extensive SQL Support
Standard ANSI SQL enabling the use of existing SQL without rewrite
Advanced analytics, including cubing, grouping, and window functions

Mature Query Optimizer
Mature and proven cost-based query planner
Optimal use of all available resources, including node, memory, cache, and CPU

MPP Architecture
Leverages Hadoop to handle thousands of users, nodes, and petabytes of data
Exploits redundancy in HDFS to provide system-wide data protection

Compression
Compress the data by at least a factor of 10 to reduce the amount of Hadoop storage
Store the data in a columnar format for faster access
Vector:
The High-Performance Analytic Database Engine for Hadoop
Vector is the industry’s fastest analytic database. Vector’s ability to handle continuous updates without a performance penalty makes it an Operational Data Warehouse (ODW) capable of incorporating the latest business information into your analytic decision-making. Vector achieves extreme performance with full ACID compliance on Hadoop.
With Vector, you get more from your Hadoop investment.
Three Ways Vector Accelerates Hadoop Analytics

Scale-out Hadoop SQL performance
Dramatically improve performance by up to 100X with Actian Vector technology, which extends the world’s fastest database to accelerate SQL performance in Hadoop clusters. Workloads of standard benchmark queries that typically take over two hours with traditional SQL solutions on Hadoop finish in less than a minute running on Vector.

Zero-penalty, real-time data updates
Organizations no longer have to give up consistency for performance. Unlike traditional Hadoop analytics solutions, Vector for Hadoop can process real-time data updates without any associated performance penalty, ensuring that an organization’s analytic insight is always current, using the freshest data available.

Native Spark powered direct query access
Through its innovative native Spark support, Vector delivers optimized access to Hadoop data file formats including Parquet and ORC, can perform operations like SQL joins across different table types, and can serve as a faster query execution engine for Spark SQL and Spark R applications.

Watch this “Deep Dive” video with Actian’s SVP of Engineering, Emma McGrattan, to learn how Vector’s innovative architecture delivers superior performance at scale.
“For the past 20 years, I’ve been searching for the killer database that would fulfill most of our intense data processing needs and with the discovery of Actian Vector, that search is now over – this database is in a class of its own. Right out of the box, Actian Vector lets us effortlessly plow through millions and millions of rows of data with infinite width and depth and without the need for new expensive hardware, complicated schemas, explicit indexing, pre-aggregation, or specifically hand-crafted DBA-tuned SQL. The Actian leaders and technologists have performed a miracle here.”
— Warren Master, CTO, The Rohatyn Group
True Hybrid Platform That Lets You
Migrate According To Your Terms
Actian Avalanche is a hybrid cloud data warehouse that can be deployed on-premises as well as on AWS, Microsoft Azure, and Google Cloud. Avalanche helps you to incrementally migrate or offload from your existing enterprise data warehouse until it can be retired in a managed fashion—according to your timeframe and your terms. Choose the path that is best for you – cloud, on-premises, or a combination of both, with a seamlessly architected hybrid solution.
Gain 10-20X
performance improvement at enterprise scale
Avalanche is designed for enterprise-class workloads with high levels of concurrency and query complexity. It delivers up to a 20X performance advantage over legacy data warehouse solutions – at a fraction of the cost. It is uniquely able to perform analytical queries even as the data warehouse is being updated – without adding any latency. Say goodbye to nightly batch loads.


Reduce OpEx by
50% and get rid of CapEx altogether
Since Avalanche runs on commodity hardware, its cost profile is significantly lower than that of legacy data warehouses. Avalanche typically reduces operating expenses (e.g., annual appliance maintenance) by 50% and removes capital expense altogether. You will also take advantage of cloud economics – turn off compute resources when you choose to and pay only for what you use.
Minimize migration risk with a highly automated, non-disruptive move to Avalanche
Migrations of terabytes of data, thousands of tables and views, specialized code and data types, and other proprietary impediments to change do not happen overnight.
The Avalanche migration service ensures that over 90% of any custom code written for legacy data warehouses is migrated automatically. We migrate the rest manually by leveraging our trusted partners — making this a risk-free and worry-free exercise for you. Avalanche also delivers full interoperability with your existing back-end applications, enterprise data repositories, and new data sources.
Migrating from Teradata
Learn the secrets of migration success from the former Head of Big Data group at Teradata.
Migrating from Netezza
Learn migration best practices from a former Netezza executive.
Moving from Netezza to Actian Avalanche
Global bank realized large improvements in performance and TCO by transitioning to Actian. The bank estimates a total savings of $20 million over five years.

Actian simply outperformed the competition in nearly every category, including quickest time to value and fastest response times to market changes.
A Global Bank Risk Group achieved fast performance and high scalability when they moved from Oracle Exadata to Actian Avalanche
| | The Goals | The Results |
|---|---|---|
| LOADING | 2 billion risk data points in 6 hours (~100k/sec) | 1 hour 40 min (333k/sec) |
| FILTERED AGGREGATION | 30 seconds | 6 sec on 5-node cluster; 2 sec on 10-node cluster |
| FULL DAY AGGREGATION | Hierarchy dimension on 1 million data points in < 15 sec | Sub-second response time |
| LARGE DATA VOLUMES | Store 80 days (160 billion rows) of data | Store 100 days (200 billion rows) with linear scaling |
| HORIZONTAL SCALABILITY | Up to 10 billion rows per day | Textbook scalability as nodes added to cluster |
| DRILL UP/DRILL DOWN | < 2 sec | < 1 sec |
Persistent Myths of Cloud Data Warehouse
by Early Adopter Research

ON-DEMAND WEBCAST
Rethinking Data Warehouse Modernization
featuring James Curtis, Senior Analyst, 451 Research
Hear from Jim Curtis, 451 Group’s resident expert on Data Modernization, along with Raghu Chakravarthi, Actian’s SVP of R&D who formerly ran Teradata’s Big Data Group and Paul Wolmering, VP of Solution Engineering who previously led field engineering teams at Netezza, to get the inside scoop on migration best practices.
Try DataConnect for Free
Download the new DataConnect Evaluation Edition to try out the new features (Offer valid only for next 30 days)