Multi-Core Scaling Performance Of AMD's Bulldozer
There has been a lot of discussion in the past two weeks concerning AMD's new FX-Series processors and the Bulldozer architecture. In particular, the Bulldozer architecture consists of "modules", each of which has two x86 engines that share much of the rest of the processing pipeline with their sibling engine; as such, the AMD FX-8150 eight-core CPU has only four modules. This article looks at how well Bulldozer's multi-core performance scales as these modules are toggled. The multi-core scaling performance is compared to that of AMD's Shanghai and Intel's Gulftown and Sandy Bridge processors.
Each Bulldozer module consists of two x86 out-of-order processing engines, two 128-bit FMAC units, and two integer cores, but shares the fetch/decode stage, the floating-point scheduler, the L2 cache, and other parts of the module. Some have loosely compared this to Intel's Hyper-Threading technology. Below are slides provided by AMD that detail the AMD Bulldozer module.
In the Linux benchmarks of the AMD FX-8150 that were published this past Monday on Phoronix, the multi-core performance of the eight-core Bulldozer was shown to be comparable to that of Intel's Sandy Bridge (Core i5 2500K) and Gulftown (Core i7 970, Core i7 990X) CPUs in some of the workloads. Today's results are a new set of numbers from running the very multi-threaded-friendly Linux benchmarks while controlling the number of modules/cores that are enabled.
The UEFI on the ASUS Crosshair V Formula motherboard, which was part of the Bulldozer kit sent over by AMD, allows enabling/disabling the individual cores of the Bulldozer CPU. The multi-threaded benchmarks were run in one, two, four, six, and eight core/thread configurations. When testing with four or fewer cores, it was ensured that each enabled core was on its own module and not sharing one with a sibling. Likewise, with the comparative Intel results, each physical core was allotted first before enabling Hyper-Threading. For the motherboards/CPUs that don't support toggling individual cores, the core count was limited in the Linux kernel by using the "maxcpus=" kernel option, which limits the number of cores that are exposed to the operating system.
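The allocation rule described above (one core per module first, sibling cores only once every module is in use) can be sketched roughly as follows. This is only an illustration, not the procedure actually used for the testing, where the cores were toggled by hand from the UEFI. The function name `cores_to_enable` is hypothetical, and the numbering scheme, in which the two cores of a module have adjacent IDs, is an assumption.

```python
def cores_to_enable(n, modules=4, cores_per_module=2):
    """Pick n core IDs, filling one core per module before any sibling.

    Assumption (not from the article): core IDs are laid out as
    module * cores_per_module + sibling, so IDs 0 and 1 share a module.
    """
    total = modules * cores_per_module
    if not 1 <= n <= total:
        raise ValueError(f"n must be between 1 and {total}, got {n}")

    # Visit sibling slot 0 of every module first, then sibling slot 1, etc.
    order = [
        module * cores_per_module + sibling
        for sibling in range(cores_per_module)
        for module in range(modules)
    ]
    return sorted(order[:n])


if __name__ == "__main__":
    # The five configurations tested on the FX-8150 (four modules, two cores each).
    for count in (1, 2, 4, 6, 8):
        print(count, cores_to_enable(count))
```

With the default four-module layout, the four-core configuration selects one core from each module, and only the six- and eight-core configurations place two active cores on the same module.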