Habana Labs Driver Drops Default Memory Scrubbing For Better Performance, Other Changes
Intel-owned Habana Labs has submitted their set of driver updates to char/misc ahead of the upcoming Linux 5.14 merge window.
The Habana Labs driver continues to support the Gaudi accelerator for AI training workloads and the Goya accelerator optimized for AI inference. This open-source kernel driver also continues to prepare for future Habana Labs hardware.
With the changes staged ahead of Linux 5.14, it's been a rather busy cycle with a number of improvements to this AI accelerator kernel driver:
- Memory scrubbing is now disabled by default in the name of performance. Until now the Habana Labs driver scrubbed the memory after every memory unmap, but that is no longer the case as it sacrificed too much performance. Those wanting this security feature enabled can use the habanalabs.memory_scrub=1 module option.
- While memory scrubbing is now disabled by default, this kernel adds the ability to reset the device after a user/client has finished using it, to ensure everything is cleared.
- Async device probing is now enabled for a faster load process for servers with multiple accelerators.
- The communication protocol with the firmware has been changed to improve backwards compatibility while also being more stable.
- The cause of hard resets is now reported back to the firmware after the event.
- Other firmware error reporting improvements and handling improvements in general.
- New debug interfaces and other improvements.
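For those who do want scrubbing back, the habanalabs.memory_scrub=1 option mentioned above can be made persistent and then checked at runtime. Below is a minimal Python sketch of that idea, not anything taken from the driver itself. The helper names are hypothetical, and it assumes the parameter is exposed read-only under /sys/module/habanalabs/parameters/, which depends on how the driver registers it.

```python
from pathlib import Path
from typing import Optional


def modprobe_conf_line(module: str, **params) -> str:
    """Build an 'options' line suitable for a file under /etc/modprobe.d/."""
    opts = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {opts}"


def read_module_param(module: str, param: str,
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return a loaded module's parameter value, or None if it cannot be read.

    Assumption: the parameter is visible in sysfs. A module that is not
    loaded, or a parameter registered without sysfs permissions, yields None.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Line to place in, for example, /etc/modprobe.d/habanalabs.conf
    print(modprobe_conf_line("habanalabs", memory_scrub=1))

    value = read_module_param("habanalabs", "memory_scrub")
    if value is None:
        print("memory_scrub: not available (module not loaded or not exposed)")
    else:
        print(f"memory_scrub: {value}")
```

The sysfs root is a parameter only so the reader function can be exercised against a scratch directory on machines without the hardware.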
The full list of Habana Labs driver changes for Linux 5.14 can be found via this pull request.