AVX/AVX-512 Tuning Doesn't Pay Off For LibreOffice's Calc Spreadsheets
Last year, AVX and AVX-512 support was added to LibreOffice for tasks such as calculating the sum of a column of numbers in the Calc spreadsheet program. While the initial numbers promised better performance than the SSE2 path, the maintenance burden turned out not to be worthwhile, and the AVX/AVX-512 tuning is now being removed from LibreOffice.
Yesterday Calc saw the removal of its AVX and AVX-512 code. Luboš Luňák commented, "It's been a source of numerous problems since the beginning. Poor separation of C++ code causing the compiler to emit some generic code as CPU-specific, compiler optimizations moving CPU-specific code out of #ifdef to unguarded static initialization, etc."
He also cites the performance gain as not being worthwhile, though his reference system is a relatively slow Ryzen 5 2500U. "on my Ryzen2500U for one full column (1m cells) sumArray() takes about 1.6ms with AVX, 1.9ms with SSE2 and 4.6ms with generic code. So SSE2 code is perhaps worth it, especially given that SSE2 is our baseline requirement on x86_64 everywhere and x86 on Windows, but AVX+ is nowhere near worth the trouble."
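To make the comparison above concrete, here is a minimal sketch of what a generic loop and an SSE2 path for summing an array of doubles can look like. It is not LibreOffice's actual sumArray() code: the names sumGeneric, sumArraySketch and SKETCH_HAVE_SSE2 are hypothetical, and the real implementation may accumulate differently. The SSE2 path is chosen at compile time only, which is safe when SSE2 is part of the baseline, as it is on x86_64.

```cpp
#include <cassert>
#include <cstddef>

// SSE2 is part of the x86_64 baseline, so a compile-time check is enough here.
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Plain scalar loop: the "generic code" path.
double sumGeneric(const double* data, std::size_t count)
{
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}

// Hypothetical stand-in for a sumArray()-style routine.
double sumArraySketch(const double* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
#if SKETCH_HAVE_SSE2
    // Add two doubles per iteration in one 128-bit register.
    __m128d acc = _mm_setzero_pd();
    std::size_t i = 0;
    for (; i + 2 <= count; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(data + i));

    // Combine the two lanes, then handle a possible odd trailing element.
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double total = lanes[0] + lanes[1];
    for (; i < count; ++i)
        total += data[i];
    return total;
#else
    return sumGeneric(data, count);
#endif
}
```

Because the vectorized path adds values in a different order than the scalar loop, the two results can differ in the last bits for inputs that are not exactly representable. Any real comparison of timings would also need to account for that.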
He doesn't rule out CPU-specific code for LibreOffice, but says it should be working, maintained, and worth the extra effort involved. A new code comment in the merged change goes on to note, "IMPORTANT: Having CPU-specific routines turned out to be a maintenance problem, because of various problems such as compilers moving CPU-specific code out of #ifdef code into static initialization or our code using C++ features that caused the compiler to emit code that used CPU-specific instructions (even cpuid.hxx isn't safe, see the comment there). The only safe usage is using CPU-specific code that's always available, such as SSE2-specific code for x86_64. Do not use for anything else unless you really know what you are doing (and you check git history to learn from past problems)."
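For readers unfamiliar with why optional instruction sets are harder to handle than a baseline like SSE2, the following sketch shows one common way to dispatch at run time with GCC or Clang on x86. It is an illustration only and not the approach LibreOffice used or removed. The function names and the SKETCH_X86_DISPATCH macro are hypothetical; `__attribute__((target("avx")))` and `__builtin_cpu_supports` are GCC/Clang extensions. The AVX instructions are confined to one plain function, and the CPU check happens inside an ordinary call rather than in a static initializer.

```cpp
#include <cassert>
#include <cstddef>

// Run-time dispatch as sketched here relies on GCC/Clang extensions on x86.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_X86_DISPATCH 1
#else
#define SKETCH_X86_DISPATCH 0
#endif

// Baseline path: safe on every CPU the program supports.
double sumScalarFallback(const double* data, std::size_t count)
{
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}

#if SKETCH_X86_DISPATCH
// Only this function may contain AVX instructions. It uses raw pointers and
// intrinsics only, so no shared inline or template code is compiled as AVX.
__attribute__((target("avx")))
static double sumAvxOnly(const double* data, std::size_t count)
{
    __m256d acc = _mm256_setzero_pd();
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4)
        acc = _mm256_add_pd(acc, _mm256_loadu_pd(data + i));

    double lanes[4];
    _mm256_storeu_pd(lanes, acc);
    double total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < count; ++i)
        total += data[i];
    return total;
}
#endif

// The CPU check runs at call time, inside a normal function body.
double sumWithDispatch(const double* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
#if SKETCH_X86_DISPATCH
    if (__builtin_cpu_supports("avx"))
        return sumAvxOnly(data, count);
#endif
    return sumScalarFallback(data, count);
}
```

Even in this small form, the AVX path needs compiler-specific attributes, a run-time check, and discipline about what code is allowed inside the CPU-specific function. That is the kind of ongoing care the quoted comment warns about.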
This AVX/AVX-512 support was quietly added for LibreOffice 7.3 and, when the patches were written, was billed as "great speed improvements to statistical functions" but alas it didn't turn out as planned.