Another Linux 4.20 Performance Regression Has Now Been Addressed (THP)
The bumpy Linux 4.19~4.20 road continues, but at least one more performance regression has now been crossed off the list.
Google's David Rientjes has landed a patch in mainline Linux 4.20 Git as of yesterday that restores node-local hugepage allocations. Transparent Huge-Pages (THP) were designed to improve performance and make huge pages easier to use, but changes to THP made during the 4.20 merge window had introduced a performance regression.
In terms of the 4.20 performance regression, "On Haswell, [one of the problematic commits] was shown to have a 13.9% access regression after this commit for binaries that remap their text segment to be backed by transparent hugepages... If remote memory is also low or fragmented, not setting __GFP_THISNODE was also measured on Haswell to have a 40% regression in allocation latency."
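For readers unfamiliar with how a program ends up "backed by transparent hugepages," the following is a minimal Python sketch (Python 3.8+, Linux) of the userspace side: it reads the system-wide THP mode from sysfs and asks the kernel to use huge pages for an anonymous mapping via `madvise(MADV_HUGEPAGE)`. It does not reproduce the regression, does not remap a text segment, and says nothing about NUMA node placement, which is what `__GFP_THISNODE` governs inside the kernel. The helper names and the 2 MiB huge page size are assumptions for illustration only.

```python
import mmap

# Assumption: 2 MiB huge pages, the common THP size on x86-64.
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def parse_thp_mode(text):
    """Return the active mode from a line such as 'always [madvise] never'.

    The kernel marks the selected mode with square brackets.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the system-wide THP mode, or None if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_thp_mode(f.read())
    except OSError:
        return None


def map_with_thp_hint(length=4 * HUGE_PAGE_SIZE):
    """Create a private anonymous mapping and hint that it should use THP.

    Returns (mapping, hinted). 'hinted' is False when the platform or
    kernel does not accept MADV_HUGEPAGE; the mapping is still usable.
    """
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE)
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    hinted = False
    if advice is not None:
        try:
            buf.madvise(advice)  # a hint only; the kernel may still use 4 KiB pages
            hinted = True
        except OSError:
            pass
    return buf, hinted
```

Calling `read_thp_mode()` shows whether the system is in `always`, `madvise`, or `never` mode, and `map_with_thp_hint()` returns a mapping the kernel may back with huge pages once it is touched. Whether those pages come from the local NUMA node or a remote one is decided by the kernel allocation policy that this patch adjusts.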
More details on the Transparent Huge-Pages changes are available via this kernel commit.