12-Core ARM Cluster Benchmarked Against Intel Atom, Ivy Bridge, AMD Fusion
Last week I shared my plans to build a low-cost, 12-core, 30-watt ARMv7 cluster running Ubuntu Linux. The ARM cluster, built around PandaBoard ES development boards, is now online and producing results, and quite surprising ones for a low-power Cortex-A9 compute cluster. The results include performance-per-Watt comparisons against Intel Atom and Ivy Bridge processors along with AMD's Fusion APU.
As discussed in last week's preview, six PandaBoard ES development boards were used to form this cluster. A single PandaBoard ES already delivers quite decent ARMv7 performance when running Ubuntu 12.04, thanks to improved support for the board's ARM SoC, Ubuntu's switch to hard-float (hardfp) packages by default, and other Linux optimizations coming out of upstream and the Linaro camp. The PandaBoard ES uses the OMAP4460 SoC from Texas Instruments, which provides a 1.2GHz dual-core ARM Cortex-A9 processor. This is an upgrade over the original PandaBoard, which carried an OMAP4430 with a 1.0GHz Cortex-A9 MPCore. (The OMAP4460 also has PowerVR graphics, but that is not important for these cluster purposes.) The PandaBoard ES has 1GB of system memory, 10/100 Ethernet, two USB 2.0 ports, HDMI output, and an SD/SDHC slot for storage.
This twelve-core Phoronix ARM cluster, dubbed "Effimaß", is uniquely constructed out of a dish drying rack. As for the reasoning behind this: "One of the unusual things I'm trying for this build is to assemble it all within a wooden dish drying rack. This isn't the first time that ARM development boards have been used in a cluster, with Ubuntu/Linaro and others using PandaBoard clusters for their build farm, etc. The other approach to efficiently managing all of the boards in minimal space has been stacking them with spacers between the PCBs, etc... The issue I see with that, though, is that it makes the boards not swappable at all without dismantling the entire stack, is time-consuming to set up, and requires special parts... These racks can be found for a few dollars on the Internet, can be used almost 'out of the box', would allow for multiple different development boards / PCB sizes / mounting hole differences, make it very easy to swap out boards, could be fabricated from scratch quite easily, and allow for fairly high-density clusters in a compact space. The shape should also make managing cables and placing AC power supplies (underneath) fairly easy. The size of this dish drying rack is a bit large for the current six-board cluster, but this concept may end up working quite well for others."
Each PandaBoard ES had a 16GB Class 10 SDHC card for storage, and the head node used NFS to share a home directory with the other nodes. MPICH2 was used for the MPI cluster configuration atop Ubuntu 12.04. Ubuntu 12.10 offers some remarkable ARM performance gains on the OMAP4 hardware due to the newer Linux kernel (version 3.4 at present, compared to Linux 3.2 on Ubuntu 12.04) and the major compiler upgrade (GCC 4.7 vs. GCC 4.6). However, due to some early configuration problems with the post-alpha-one snapshot, the installations were reverted to Ubuntu 12.04 LTS. Ubuntu 12.10 will be loaded on this compute cluster in the coming weeks and should result in double-digit gains.
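For readers wanting to reproduce a similar setup, here is a minimal sketch of generating a host file for a six-board, two-cores-per-board cluster. It assumes MPICH2's Hydra process manager, which reads `hostname:process_count` lines from a file passed with `mpiexec -f`. The host names `panda1` through `panda6` and the helper name `hydra_hostfile` are hypothetical and not taken from the actual cluster configuration.

```python
def hydra_hostfile(hostnames, cores_per_node=2):
    """Return host-file text with one 'host:procs' line per node.

    cores_per_node defaults to 2 because each PandaBoard ES has a
    dual-core Cortex-A9. The host names are supplied by the caller.
    """
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    return "".join(f"{host}:{cores_per_node}\n" for host in hostnames)


# Hypothetical host names for six boards (an assumption, not the real setup).
nodes = [f"panda{i}" for i in range(1, 7)]
hostfile_text = hydra_hostfile(nodes)
total_ranks = 2 * len(nodes)  # 6 boards x 2 cores = 12 MPI ranks

print(hostfile_text, end="")
print(f"# total ranks: {total_ranks}")
```

With the output saved on the NFS-shared home directory as, say, `hosts`, a job could then be launched from the head node with something like `mpiexec -f hosts -n 12 ./program`, where `./program` stands in for any MPI binary.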
The PandaBoards can be powered off USB, but I ended up using an AC adapter for each board in order to better monitor the overall power draw in different configurations. The monitoring is done with a WattsUp USB-based AC power meter, which interfaces with the Phoronix Test Suite for automated power monitoring while benchmarking.
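To make the performance-per-Watt metric concrete, here is a rough sketch of how such a figure can be derived from a benchmark score and the power samples logged during the run. This is not the Phoronix Test Suite's own code; the function name is hypothetical, the numbers are made-up placeholders rather than measurements from this cluster, and it assumes a higher-is-better score.

```python
from statistics import mean


def performance_per_watt(score, watt_samples):
    """Divide a higher-is-better benchmark score by the mean power draw.

    watt_samples is a sequence of AC power readings in Watts taken
    while the benchmark was running.
    """
    if not watt_samples:
        raise ValueError("need at least one power sample")
    average_watts = mean(watt_samples)
    if average_watts <= 0:
        raise ValueError("average power draw must be positive")
    return score / average_watts


# Placeholder values for illustration only, not real results.
example_score = 120.0
example_samples = [29.0, 30.0, 31.0]
print(performance_per_watt(example_score, example_samples))
```

For lower-is-better results such as run times, the score would need to be inverted first (or energy per run reported instead), so the two kinds of result should not be mixed in one comparison.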