  • August 2, 2016

Hadoop Short Circuit Reads and Database Performance

If you've been working with Hadoop then you've likely come across the concept of Short Circuit Reads (SCRs) and how they can aid performance. These days they are mostly enabled by default (although not in "vanilla" Apache or close derivatives like Amazon EMR). Actian VectorH brings high performance SQL, ACID compliance, and enterprise security…

  • June 27, 2016

Efficient ETL in an Analytical Database?

Recently I worked on a POC that required some non-standard thinking. The challenge was that the customer's use case needed not only high-performance SQL analytics but also a healthy amount of ETL (Extract, Transform, and Load). More specifically, the requirement was for ELT (or even ETLT if we want to be absolutely precise).…

  • May 3, 2016

Amazon EMR as an Easy-to-set-up Hadoop Platform

Recently I helped a customer perform an evaluation of Actian Vector in Hadoop (VectorH) to see if it could maintain “a few seconds” performance as the data sizes grew from one to tens of billions of rows (which it did, but that’s not the subject of this entry). The customer in question was only…

  • May 2, 2016

Taking Advantage of Ordered Data

Actian Vector and Vector in Hadoop (VectorH) have a lightweight, but very potent, feature that can give a significant performance boost when querying data that possesses some kind of ordering. This entry takes a look at this feature and describes how to use it. The feature utilises what are referred to as MinMax structures.…
