My interest in GPGPUs is driven by their incredible performance in crunching very large amounts of data for machine learning. In this particular case, Mark Litwintschik posted on his blog on Monday his first effort to benchmark his dataset of 1.1 billion NYC taxi rides using OpenCL and LLVM running on 8 Nvidia Tesla K80s in a single box provided by MapD.

An excerpt from his blog post shows his work; it's a fascinating read.

MapD have been kind enough to grant me access to a machine that I'll use to benchmark their GPU-based database software with the 1.1 billion taxi trips dataset. The machine I'll be using would have been one of the world's 20 fastest computers 10 years ago. It has 8 x Nvidia Tesla K80s, each with 2 GPUs per card. The K80 has 2.91 teraflops of double-precision performance and 8.73 teraflops of single-precision performance, giving me 23.28 and 69.84 teraflops of performance respectively.

The machine is running CentOS 7.2.1511 on a 32-core Intel Xeon E5-2667 v3 clocked at 3.2 GHz with 792 GB of RAM.

I'll be using a RAID array to store both the raw CSV files and the internal columnar files MapD uses to represent the dataset. The RAID array is composed of 4 x Samsung EVO 2 TB SSDs in a RAID 10 configuration with an LSI Logic / Symbios Logic MegaRAID SAS2108 RAID bus controller, making for a total of 4 TB of usable storage. This configuration should see sequential read speeds of up to 500 MB/s.
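
The totals in the excerpt follow from simple arithmetic, and it is easy to check them. Here is a minimal sketch in Python, assuming the per-card figures quoted above and the usual RAID 10 rule that half of the raw capacity is usable because every drive is mirrored. The function and constant names are my own, chosen for illustration; they are not part of MapD or of Mark's benchmark.

```python
# Figures taken from the excerpt; all names here are illustrative only.
CARDS = 8                   # Tesla K80 cards in the machine
DP_TFLOPS_PER_CARD = 2.91   # double-precision teraflops per K80 card
SP_TFLOPS_PER_CARD = 8.73   # single-precision teraflops per K80 card


def aggregate_tflops(per_card: float, cards: int = CARDS) -> float:
    """Peak throughput across all cards, rounded to two decimals."""
    return round(per_card * cards, 2)


def raid10_usable_tb(drives: int, drive_tb: float) -> float:
    """Usable capacity of a RAID 10 array (assumes equal-sized drives).

    RAID 10 stripes across mirrored pairs, so half of the raw
    capacity holds mirror copies and the other half is usable.
    """
    if drives < 4 or drives % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drives * drive_tb / 2


if __name__ == "__main__":
    print("double precision:", aggregate_tflops(DP_TFLOPS_PER_CARD), "TFLOPS")
    print("single precision:", aggregate_tflops(SP_TFLOPS_PER_CARD), "TFLOPS")
    print("usable storage:", raid10_usable_tb(4, 2.0), "TB")
```

These are theoretical peak figures; the sketch only reproduces the sums and says nothing about what the database achieves in practice.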

In some cases, Mark saw queries run 55x faster. Enjoy the read.