Intel Publishes A Bunch Of Code Samples For Helping To Optimize For Their Latest CPUs

Written by Michael Larabel in Intel on 10 June 2021 at 01:00 AM EDT.
Intel has long maintained a lengthy "Optimization Reference Manual" showing developers how to optimize code for its latest CPU microarchitectures. The latest manual update is now accompanied by a large set of actual code samples that make it easier to learn Intel's optimization techniques and take full advantage of its newest processors.

Beyond the open-source/Linux enablement work and other key areas directly tied to the bring-up of new hardware, Intel's engineers already do a lot in the name of performance for open-source projects, often contributing optimizations and other features directly to popular open-source projects so they can take advantage of the latest processor features. We've covered such Intel contributions countless times on Phoronix over the years.

From those code contributions coming directly from Intel, as well as the open-source code Intel maintains in its own projects such as those within oneAPI, independent developers can already glean a lot about optimization techniques and how to best leverage the latest and greatest processors. There is also the Intel 64 / IA-32 Architectures Optimization Reference Manual, and as a nice helper that manual is now accompanied by working (buildable) code samples, a much easier first step on the learning curve for Intel code optimizations.

Intel's latest Optimization Reference Manual is available from Intel, while the new and exciting element is the intel/optimization-manual repository on GitHub.

This new GitHub repository provides working code samples that go along with the optimization manual. All of the samples can be built with the CMake build system on Linux using any semi-recent compiler, on roughly any Intel Haswell or newer CPU.
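For readers wondering whether their own machine clears that "Haswell or newer" bar, a rough check is to look for the AVX2 and FMA feature flags, which Haswell was the first Intel core to ship together. The sketch below is not part of Intel's repository; the function names are hypothetical, and it assumes the flag spellings Linux reports in /proc/cpuinfo on x86 systems. The repository's real requirements may differ per sample.

```python
from pathlib import Path

# Flag names as they appear on the "flags" line of /proc/cpuinfo on x86 Linux.
# Assumption: AVX + AVX2 + FMA is used here as a stand-in for "Haswell or newer".
BASELINE_FLAGS = {"avx", "avx2", "fma"}
OPTIONAL_FLAGS = {"avx512f"}  # AVX-512 Foundation, needed only for AVX-512 samples


def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text):
    """Summarize which of the assumed baseline and optional flags are present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {
        "baseline_ok": BASELINE_FLAGS <= flags,
        "missing_baseline": sorted(BASELINE_FLAGS - flags),
        "avx512": OPTIONAL_FLAGS <= flags,
    }


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On systems without /proc/cpuinfo, every flag is reported as missing.
    text = path.read_text() if path.exists() else ""
    print(report(text))
```

Running the script on a Linux box prints a small dictionary; a non-empty `missing_baseline` list suggests the AVX2/FMA samples would not run on that CPU.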

These new code samples principally cover AVX / AVX2 / FMA optimizations, INT8 deep learning inference, and AVX-512 best practices when targeting the newest Intel CPUs like Xeon Scalable Ice Lake, Tiger Lake, and Rocket Lake. It's great to see these continued open contributions by Intel engineers on top of all their other open-source code contributions and engagements.