The Most Common, Annoying Issue When Benchmarking Ubuntu On Many Systems

Written by Michael Larabel in Hardware on 26 March 2015 at 04:02 PM EDT.
When constantly benchmarking dozens of systems daily in a fully-automated manner, there's one issue, particularly on Ubuntu, that has proved over the past few months to be the most annoying...


Since launching the original LinuxBenchmarking.com test farm, and going back earlier to some of the original Phoromatic trackers, one issue has stood out as the most annoying. While there have been RAM issues, disk failures, corrupted file-systems, and other hardware/software issues coming up over time, the most annoying issue is quite simple and comes down to the GRUB boot-loader.


On Ubuntu, if there's a power failure or the system is otherwise left in a bad state on its last boot, GRUB2 will hang indefinitely at the boot menu, waiting for user input to select the kernel to boot and any relevant kernel command-line options. This is annoying since most of the test systems are headless: when I see an unresponsive system I have to go attach an HDMI display and USB keyboard, only to find out there was no serious system error, just the system sitting at GRUB for whatever reason. The Phoromatic Server of the Phoronix Test Suite will indicate (via the web UI and an automated email message) when a system is "down" so I can go investigate, and a clear majority of the time it's due to Ubuntu being silly with GRUB.

However, there is a workaround, though it's far from perfect. The GRUB_RECORDFAIL_TIMEOUT value can be changed to override the default behavior: setting GRUB_RECORDFAIL_TIMEOUT=0 in /etc/default/grub and then running update-grub works around the issue, and the system will continue to boot without hanging after a failed boot.
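For anyone wanting to script that change across many machines, here is a minimal Python sketch of the edit to /etc/default/grub. The helper names set_grub_default and apply_to_file are made up for this illustration and are not part of the Phoronix Test Suite or GRUB. The sketch assumes the file uses plain KEY=value lines and that update-grub is on the PATH.

```python
import re
import subprocess
from pathlib import Path


def set_grub_default(config_text: str, key: str, value: str) -> str:
    """Return config_text with KEY=value set.

    An existing uncommented assignment is replaced in place; otherwise
    the assignment is appended as a new last line.
    (Hypothetical helper written for this article.)
    """
    assignment = f"{key}={value}"
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # A lambda avoids backslash interpretation in the replacement string.
        return pattern.sub(lambda _match: assignment, config_text)
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return f"{config_text}{separator}{assignment}\n"


def apply_to_file(path: str = "/etc/default/grub", run_update: bool = True) -> None:
    """Rewrite the GRUB defaults file and regenerate the boot config (needs root)."""
    config = Path(path)
    config.write_text(
        set_grub_default(config.read_text(), "GRUB_RECORDFAIL_TIMEOUT", "0")
    )
    if run_update:
        # update-grub is the wrapper Ubuntu ships for regenerating grub.cfg.
        subprocess.run(["update-grub"], check=True)
```

Calling apply_to_file() as root would perform both steps from the paragraph above. Passing a different path and run_update=False lets the edit be tried on a copy of the file first.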

Ubuntu seems to be the only major distribution, though, that deals with GRUB_RECORDFAIL_TIMEOUT and causes these issues after a failed boot. While I can manually change this GRUB value on all of the Ubuntu systems in the farm, it would be much nicer if there was a cleaner way to apply this setting automatically from an application perspective. For those running the Phoronix Test Suite, it would be nice if the system could automatically recover after crashing while running the application, which is a real possibility considering the system is being stressed. As far as I know, the only way to do this would be to update the GRUB configuration twice (once before, and once after to restore the original default) when running the Phoronix Test Suite as root. It would be much more convenient if PTS or similar could simply write to, say, /boot/autorecover (or /boot/nointeraction or such) and the GRUB boot script would check for the presence of such a file when booting and then proceed on. It would also be cleaner and more universal (cleanup just being the removal of that file), since it could be applied in an automated manner without potentially screwing up the user's GRUB configuration.
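The "update the GRUB configuration twice" approach could look roughly like the Python sketch below. It is only an illustration under stated assumptions: temporary_grub_setting and the .pts-backup suffix are invented names, not anything the Phoronix Test Suite provides, and the code assumes it runs as root with update-grub available. Since /etc/default/grub is read as a shell fragment, a line appended at the end overrides any earlier assignment of the same variable.

```python
import contextlib
import shutil
import subprocess
from pathlib import Path


@contextlib.contextmanager
def temporary_grub_setting(line, config_path="/etc/default/grub",
                           regenerate=("update-grub",)):
    """Append `line` to the GRUB defaults file for the duration of a block.

    Hypothetical helper: the name and the ".pts-backup" suffix are
    assumptions made for this sketch.
    """
    config = Path(config_path)
    backup = config.with_name(config.name + ".pts-backup")
    shutil.copy2(config, backup)  # keep the user's original file
    try:
        with config.open("a") as handle:
            handle.write(f"\n{line}\n")
        if regenerate:
            subprocess.run(list(regenerate), check=True)
        yield
    finally:
        # Only reached if the system survives the workload. After a hard
        # crash the override stays active, so the next boot does not wait
        # at the menu, and the backup file is left behind for a later restore.
        shutil.move(str(backup), str(config))
        if regenerate:
            subprocess.run(list(regenerate), check=True)
```

A benchmark run would then be wrapped as `with temporary_grub_setting("GRUB_RECORDFAIL_TIMEOUT=0"): ...`. Regenerating the boot configuration twice per run is exactly the overhead a simple flag file in /boot would avoid.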

But for now it looks like the only approach is the GRUB_RECORDFAIL_TIMEOUT change described above. If any other Phoronix readers have tips for avoiding interactions with the boot-loader, feel free to comment on this article.
About The Author
Michael Larabel

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.
