Cloudflare Praises Golang PGO For Significant CPU Savings
Golang 1.20 was released over a year ago with support for Profile Guided Optimizations (PGO), and that support has since been improved with Go 1.21 for 2~7% faster Go binaries, thanks to an optimization approach also found in other compilers. The engineers at Cloudflare have put out a blog post this week praising Go's PGO support and the CPU savings they are seeing as a result.
Compiler PGO support is great, assuming you have a sufficient collection of samples to serve as the profile that is fed back into the compiler. With that profile the compiler can make more informed optimization decisions, though collecting it is an extra step compared to optimizations that only need a compiler flag. In the case of Golang's PGO, some codebases can see as much as a 14% improvement.
Cloudflare runs some Go-based services that rely on thousands of CPU cores worldwide, and its engineers recently explored the impact of Golang's PGO on that infrastructure. Their result:
"This indicates that following the release, we’re using ~97 cores fewer than before the release, a ~3.5% reduction. This seems to be inline with the upstream documentation that gives numbers between 2% and 14%.
The second number we can look at is the usage at the same time of day on different days of the week. The average usage for the 7 days prior to the release was 3067.83 cores, whereas the 7 days after the release were 2996.78, a savings of 71 CPUs. Not quite as good as our 97 CPU savings, but still pretty substantial!
This seems to prove the benefits of PGO – without changing the code at all, we managed to save ourselves several servers worth of CPU time."
Considering today's server costs as well as the TCO with energy and cooling costs, saving several servers' worth of CPU time is significant.
This is an efficiency win with minimal investment needed. Moving forward they are also going to be exploring more profiling, further optimizations via BOLT or LTO, and other tuning. More details over on the Cloudflare blog.