It was surprisingly easy to analyze 165 million flight records on my laptop. It took me just an afternoon, following the Actian Evaluation guide that you can download from here.
Scientists have worked with Intel over the years to bring down the cost of high-performance computing. The key vector processing feature they needed was the ability to analyze large arrays of data in a single CPU instruction. Actian has accelerated standard SQL database requests to take advantage of vectorization. Actian Vector translates standard SQL into relational algebra, so your queries can often respond in a hundredth of the time they would take with a standard relational database. Since joining Actian, I have seen demonstrations and heard customers rave about Actian Vector, so I jumped on the idea of trying it for myself so I could create a how-to video. The evaluation guide stepped me through the database install and the sample data load, and provided queries to run against the 165-million-row data set of historic airline flight records.
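To make the idea of vector processing a little more concrete, here is a minimal sketch in Python. It is not Actian Vector code and it does not issue SIMD instructions; it only contrasts visiting records one at a time with working through a single contiguous column in fixed-size batches, which is the data layout that lets a vectorized engine hand whole arrays to the CPU. The sample rows, the column name arr_delay and both function names are made up for illustration.

```python
from array import array

# Hypothetical sample: arrival delays in minutes for a handful of flights.
rows = [
    {"carrier": "AA", "arr_delay": 12},
    {"carrier": "DL", "arr_delay": -3},
    {"carrier": "AA", "arr_delay": 45},
    {"carrier": "UA", "arr_delay": 0},
    {"carrier": "DL", "arr_delay": 20},
]


def total_delay_row_at_a_time(records):
    """Visit each record in turn, touching every field of every row."""
    total = 0
    for record in records:
        if record["arr_delay"] > 0:
            total += record["arr_delay"]
    return total


def total_delay_column_batches(delays, batch_size=1024):
    """Walk one contiguous column in fixed-size batches (vectors).

    A real vectorized engine would process each batch with SIMD
    instructions; plain Python only imitates the access pattern.
    """
    total = 0
    for start in range(0, len(delays), batch_size):
        batch = delays[start:start + batch_size]
        total += sum(d for d in batch if d > 0)
    return total


# Store only the column the query needs, packed as C integers.
delay_column = array("i", (record["arr_delay"] for record in rows))

print(total_delay_row_at_a_time(rows))
print(total_delay_column_batches(delay_column, batch_size=2))
```

Both functions compute the same total of positive delays; the difference is that the second one only ever reads the one column the query asks about.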
My laptop has a multi-core 64-bit Intel processor and the 106 GB of available disk space needed to try Vector for myself. It took me just an afternoon to run through the whole process: downloading the software and the raw flight data, installing, creating the database, loading the data and running the six supplied queries. Unzipping the more than 300 CSV files of raw data was the longest step. The supplied load scripts create a fact table and a single dimension table. I didn’t create any indexes or perform any tuning. I simply created the tables and generated statistics to inform the query optimizer about the data.
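If you want a feel for the shape of that workflow before downloading anything, the sketch below walks through the same steps in miniature: create a fact table and one dimension table, load CSV data, generate statistics, then run a query with no indexes defined. It uses SQLite from the Python standard library purely as a stand-in, so none of the statements are Actian Vector syntax, and the table names, column names and sample rows are my own assumptions rather than the contents of the supplied load scripts.

```python
import csv
import io
import sqlite3

# Hypothetical miniature of the flight data (the real set is 300+ CSV files).
CARRIERS_CSV = "code,name\nAA,American Airlines\nDL,Delta Air Lines\n"
FLIGHTS_CSV = (
    "flight_date,carrier,origin,dest,arr_delay\n"
    "2008-01-03,AA,JFK,LAX,12\n"
    "2008-01-03,DL,ATL,ORD,-3\n"
    "2008-01-04,AA,LAX,JFK,45\n"
    "2008-01-04,DL,ORD,ATL,20\n"
)


def load_csv(conn, table, text):
    """Insert every data row of a CSV string into an existing table."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # skip the header, but use it to count columns
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", reader)


conn = sqlite3.connect(":memory:")

# One dimension table and one fact table, with no indexes and no tuning.
conn.execute("CREATE TABLE carriers (code TEXT, name TEXT)")
conn.execute(
    "CREATE TABLE flights (flight_date TEXT, carrier TEXT, "
    "origin TEXT, dest TEXT, arr_delay INTEGER)"
)

load_csv(conn, "carriers", CARRIERS_CSV)
load_csv(conn, "flights", FLIGHTS_CSV)

# SQLite's ANALYZE stands in for the statistics-generation step.
conn.execute("ANALYZE")

avg_delays = conn.execute(
    "SELECT c.name, AVG(f.arr_delay) "
    "FROM flights f JOIN carriers c ON c.code = f.carrier "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()

for name, delay in avg_delays:
    print(name, delay)
```

The query joins the fact table to the dimension table and aggregates per carrier, which is the general pattern of an analytic query over this kind of schema.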
I have installed many databases over the years, including the relational databases Oracle, DB2 and SQL/DS. Never has getting to this kind of performance been so easy. I recorded the whole process and edited it down to a seven-minute video so you can see every step for yourself by clicking