Actian VectorH 5.0 Significantly Outperforms Impala, Hive and HAWQ in Recent Benchmark

  • With the release of VectorH 5.0, Actian extends its lead in providing customers the fastest open and enterprise-ready SQL in Hadoop solution available today
  • Tight integration with Apache Spark opens up access to new data sources and allows developers to build high-performance Streaming, ETL, and Machine Learning applications against VectorH
  • Actian VectorH 5.0 provides the enterprise-grade features customers need to move Hadoop analytics into production

PALO ALTO, Calif. – June 28, 2016 – Actian Corporation, a leader in enterprise-grade data analytics infrastructure, announced today the latest version of the Actian Vector in Hadoop (VectorH) database, generally available at the end of July. VectorH is based on the same query engine that powers Actian Vector, which recently doubled the TPC-H benchmark record for non-clustered systems at the 3000GB scale factor (see tpc.org/3323).

The ability to easily ingest information from different data sources and rapidly develop queries to make better business decisions is becoming increasingly important, particularly for companies looking to respond to changes in real time or explore Machine Learning. When paired with Actian VectorH, the industry’s fastest Enterprise SQL database that sits natively in Hadoop, Spark users have a powerful new way to derive true business value from their data.

“VectorH fits naturally in the Hadoop architecture and delivers end-to-end scalable performance,” said Mark Milani, senior vice president, Product Engineering at Actian. “Tighter integration with Spark makes it easier for our customers to leverage data in different formats and from different sources, and take advantage of the performance of a robust, secure database engine in VectorH. We are excited to bring this offering to our customers.”

Spark integration is another example of Actian’s continuing commitment to incorporating open interfaces and frameworks directly into the VectorH solution. In today’s Hadoop marketplace, innovation is coming from many different sources and projects. Actian VectorH 5.0 integrates with the latest Hadoop distributions from MapR, Cloudera and Hortonworks, and can be deployed both on-premises and in the cloud. Actian provides customers with the flexibility and support needed when integrating with other big data technologies to deliver faster and richer insights to make better business decisions.

VectorH Beats Out Competitors by Orders of Magnitude

On June 29th, the Vector architects will present a paper at SIGMOD, the premier conference for database professionals and academics, that demonstrates the superior performance and capabilities of VectorH 5.0 when compared with some of the most popular SQL in Hadoop solutions: Apache Hive, Cloudera Impala, Apache Spark SQL and Pivotal HAWQ.

The tests were based on the TPC-H query set, running on a 10-node cluster at the 1000GB scale, and show Actian VectorH outperformed the competition by orders of magnitude. The research attributes the Actian VectorH performance differential, which ranged from just under 10X to almost 1000X, to a combination of end-to-end vector processing, mature query optimization techniques, intelligent I/O and lightweight compression algorithms. The research also demonstrated fast and efficient trickle update capabilities from Actian VectorH and identified shortcomings in Hive’s recent attempt at providing support for updates.

The tested query workload was designed by an industry body to be representative of a medium-complexity, ad hoc decision support workload. The results show that VectorH can run in seconds queries that take competing SQL in Hadoop solutions up to 20 minutes, even after those solutions have been optimized to perform to the best of their ability.

Supporting Resources:

  • Actian blog posts with performance details.
  • SIGMOD paper published in the Proceedings of the 2016 International Conference on Management of Data, pages 1105-1117. SIGMOD accepts papers for presentation after independent peer review on innovative commercial data management systems, solutions, and architectures.

About Actian:

Actian is a leading data management, integration and analytics infrastructure company. It delivers the world’s fastest big data analytics platform on commodity hardware, in the cloud or both. With more than 10,000 customers across a broad range of industries, it helps leading brands like General Electric, Lufthansa, Intuit, Arbor Health, and Siemens solve their toughest data challenges to transform how they run and analyze their businesses. The company is headquartered in Silicon Valley and has offices worldwide. Stay connected with Actian Corporation at www.actian.com, Facebook, Twitter and LinkedIn.

# # #

Actian, Actian Analytics Platform, Actian VectorH and Actian Analytics Database – Vector are trademarks of Actian Corporation and its subsidiaries. All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.