
Some Linux Users Are Reporting Software RAID Issues With ASRock Motherboards


  • #11

    Let me correct you. A simple traditional BIOS does not understand and does not care about partition tables. So it won't notice anything wrong.
    This was absolutely untrue in the PC world. The PC BIOS absolutely did care about partition tables. It required a specific partition layout and tagging on the hard drive to even boot. Specifically, the boot partition had to be tagged as "active", had to be the first partition on the drive, and often had to lie within the first 1024 cylinders of the drive. There's a reason boot/root partitions were usually separate, at the beginning of the drive, and small on PC Unix-like installations: BIOSes often couldn't boot anything beyond a certain hard-coded limit. Many of the Linux ecosystem's boot quirks are a direct result of its roots in an x86 boot environment and the ways PC BIOS programmers assumed everyone used MS DOS and coded directly for its limitations (the 640k RAM limit, 4 primary partitions, extended partitions, CPU ID limits, boot sector limitations, etc.).

    Some manufacturers went even further and imposed anti-viral preventative measures inside their BIOS later in the game (by the early 2000s); I usually saw McAfee branding for this on certain OEM vendors. If the firmware detected what the programmers considered a damaged MS DOS partition table (MBR), it would either try to repair it or throw an error. The most common infection vector for MS DOS/Windows viruses at the time was to infect the boot sectors, then hook the drive access interrupt to keep itself hidden and spread further to other drives. While this was arguably good for the 99%+ of their customers using DOS/Windows, for the rest it could cause problems. (Edit here to add once I thought about it: many OEMs also had an option to block all access to the MBR sector of the drive, specifically to block viral boot sector infections.)

    What I'm getting at here is that OEMs assuming, and purposely supporting and testing for, only Microsoft products is historically the norm in the PC world, and that's unlikely to change. This is just more of the same, regardless of the merits of the BIOS v. UEFI debate.

    I ran across a bug when I bought my current motherboard: it would drop to the UEFI setup screen when there was more than one boot entry in the FAT32 partition. I reported the bug to the OEM and their response was a version of "Oh, we don't support anything but Windows, sorry," and they closed the ticket. They didn't accept the bug till I pointed out that a UEFI bug like this would affect multiple Windows entries as well. Remarkably short-sighted of them. But lesson learned: if you find a bug on a motherboard while using a non-Microsoft operating system, do NOT tell the OEM you're using BSD/Linux; they will immediately dismiss you. Instead, try to replicate it using Windows. That's generally the only way you'll get their attention, assuming they care to begin with.


    I never really understood why you would want to use a raw device in real hardware (It's useful if you are doing this inside a file mounted as a loop device or something).
    Some databases require a raw device because they impose their own structure on the data stored rather than using the operating system's filesystem(s). Average Tux/Beastie User wouldn't need or want to do this, normally.
    Last edited by stormcrow; 27 November 2018, 03:24 PM.



    • #12
      Hey Michael, I am the original reporter.

      Happy to answer any questions.

      Many are already answered here by the way, including how the mainboard's behaviour relates to the UEFI specification: https://news.ycombinator.com/item?id=18541493

      It also answers one question from the article: The UEFI of this mainboard is reasonably new, the latest update (which I have installed) is from 2018.

      I see from the pictures that you have a couple more ASRock models around. If you are interested, we could try and find out if these models have the same behaviour (unfortunately I have only this one ASRock mainboard). I think it should be sufficient to first GPT-format 2 disks, and then put a RAID 1 over the entire device. I'm also idling on #phoronix on freenode for questions.
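
      For reference, here's a command sketch of that reproduction (a hypothetical, untested sequence: /dev/sdX, /dev/sdY, and /dev/md0 are placeholders for two blank test disks, and the whole thing is destructive, so don't point it at anything holding data):

      ```shell
      # 1. Give each test disk a fresh, empty GPT.
      sgdisk -o /dev/sdX
      sgdisk -o /dev/sdY

      # 2. Build a RAID 1 across the whole devices, not partitions. The
      #    mdadm v1.2 superblock (4 KiB in) lands on the primary GPT's
      #    partition entries, while the backup GPT at the end of the
      #    disk survives -- which is exactly what the firmware then sees.
      mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX /dev/sdY

      # 3. Write some data, reboot through the UEFI setup screen, then
      #    check whether the superblock was overwritten:
      mdadm --examine /dev/sdX
      ```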

      (Being a happy Phoronix reader for > 10 years, I had originally planned to submit this interesting find to Phoronix after collecting some more info, but it looks like it submitted itself.)



      • #13
        Originally posted by stormcrow View Post
        Some databases require a raw device because they impose their own structure on the data stored rather than using the operating system's filesystem(s). Average Tux/Beastie User wouldn't need or want to do this, normally.
        Raw device only means non-formatted, not necessarily non-partitioned. I don't know of anything that requires a raw physical device, only block devices.



        • #14
          Originally posted by Serafean View Post
          Raw device only means non-formatted, not necessarily non-partitioned. I don't know of anything that requires a raw physical device, only block devices.
          Oracle Database with Oracle ASM disks can use raw physical device access; partitioning is optional.


          There are others out there, but Oracle Database would be the most common database you would expect to find using an unpartitioned disk directly.


          It's also possible for InnoDB in MySQL to use a raw disk; again, partitioning is optional.

          I don't know of any that absolutely mandate a raw physical device with no partitions, but MySQL and Oracle DB here both allow operating on drives without partitions. Since this firmware damages software RAID, it would most likely damage a database's raw hard drives configured the same way as well.
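
          For what it's worth, this is roughly what the InnoDB raw-device setup looked like in older MySQL versions (a historical sketch: the device path and size are placeholders, and raw partition support for the system tablespace has been deprecated in recent MySQL releases):

          ```ini
          [mysqld]
          innodb_data_home_dir=
          # First start: the "newraw" suffix lets InnoDB initialize the raw device.
          innodb_data_file_path=/dev/sdc:10Gnewraw
          # After initialization, change "newraw" to "raw" and restart:
          # innodb_data_file_path=/dev/sdc:10Graw
          ```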



          • #15
            Originally posted by stormcrow View Post
             This was absolutely untrue in the PC world. The PC BIOS absolutely did care about partition tables. It required a specific partition layout and tagging on the hard drive to even boot. Specifically, the boot partition had to be tagged as "active", had to be the first partition on the drive, and often had to lie within the first 1024 cylinders of the drive. There's a reason boot/root partitions were usually separate, at the beginning of the drive, and small on PC Unix-like installations: BIOSes often couldn't boot anything beyond a certain hard-coded limit. Many of the Linux ecosystem's boot quirks are a direct result of its roots in an x86 boot environment and the ways PC BIOS programmers assumed everyone used MS DOS and coded directly for its limitations (the 640k RAM limit, 4 primary partitions, extended partitions, CPU ID limits, boot sector limitations, etc.).
            Sorry, but that's wrong.

             The PC BIOS only cared about the boot sector (the first sector on the boot device), where the boot loader was stored. The boot loader would then search the partition table for a bootable partition and load an OS from there. Thus, the PC BIOS only needed a list of devices on which to search for a boot sector; everything after launching the boot loader from the boot sector (including loading and interpreting the partition table) was the task of the boot loader.

             In order to consult the partition table and load an OS from the active partition, the boot loader had to use the PC BIOS, which had some helper routines embedded. However, when large drives appeared on the market, the helper routines in the PC BIOS could not handle them properly, and the active partition had to be on the first part of the drive; the rest of the OS could be anywhere, because the OS did not use the helper routines in the PC BIOS, but its own code.

            The PC BIOS did not care about partitions.
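
             That boot-loader-side scan is easy to sketch: build a fake 512-byte MBR in a scratch file and look for the 0x80 "active" flag across the four partition entries, the way a classic MBR boot loader did (the file name and layout here are purely illustrative):

             ```shell
             # Fake 512-byte MBR in a scratch file: entry 1 gets the 0x80
             # "active" flag (offset 446), plus the 0x55AA boot signature.
             img=$(mktemp)
             dd if=/dev/zero of="$img" bs=512 count=1 status=none
             printf '\200' | dd of="$img" bs=1 seek=446 conv=notrunc status=none
             printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc status=none

             # Scan the four 16-byte partition entries like a classic MBR
             # boot loader: the flag byte of entry i sits at 446 + 16*i.
             active=-1
             for i in 0 1 2 3; do
                 flag=$(od -An -tx1 -j $((446 + 16 * i)) -N1 "$img" | tr -d ' ')
                 [ "$flag" = "80" ] && active=$i
             done
             echo "active partition entry: $active"
             rm -f "$img"
             ```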



            • #16
              Originally posted by eduperez View Post
              In order to consult the partition table and load an OS from the active partition, the boot loader had to use the PC BIOS, which had some helper routines embedded. However, when large drives appeared on the market, the helper routines in the PC BIOS could not handle them properly, and the active partition had to be on the first part of the drive; the rest of the OS could be anywhere, because the OS did not use the helper routines in the PC BIOS, but its own code.

              The PC BIOS did not care about partitions.
              Exactly...
              For a long time I booted my Linux systems from a partition within the 128MB zone the BIOS accepted, and once Linux started, it didn't matter.
              The only thing the BIOS cared about was heads, tracks, and cylinders, which have been fake since forever, except for the early MFM/RLL drives in the starting era of the PC. (They had already been fake for a while in the SMD systems used in enterprises; that's Storage Module Device, for the young'uns.)



              • #17
                Originally posted by starshipeleven View Post
                Is using a partition instead of a raw device worse in any way (apart from losing a handful of MB due to the partition table and stuff)?

                I never really understood why you would want to use a raw device in real hardware (It's useful if you are doing this inside a file mounted as a loop device or something).
                Yeah, there is something against partitioning:
                1) which partitioning system?
                PC partitions have a long history of being head/cylinder/sector centric. And any repair on that is just a hack.
                There have been numerous other partitioning schemes long before the pc partition came into play.
                And now we have GPT and PC partitions on a PC, and which one are you going to take, because they are not quite compatible...
                There are still numerous other partitioning schemes for PC if you step outside the world of windows.
                And it really doesn't matter; raw device access should just work. Especially if you are using LVM or something like that, you don't care about partitions.
                That put aside, I usually use fdisk or something like it. Only in fixed systems does it make sense not to partition, but if you have a bunch of disks lying on a table, a partition table might help you. But again: which type... The original PC scheme will just cost you 1k of your blocks and a lot of MB due to CHS, unless you can do LBA. But with UEFI, GPT seems to be a requirement, which also needs an EFI partition?


                Due to my sleepy head I forgot the most important feature of using partitioning for md-raid:
                A good RAID setup uses disks from multiple vendors, and all disks have different rounded sizes.
                A good RAID setup therefore has to limit the RAID partition to slightly below the smallest encountered disk size, and you use partitions for that.
                When you do that, you can always replace your disks without a hassle.
                So yes, for md-raid you do partition, to artificially limit the size of the disk to a predetermined size.
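
                As a quick sketch of that sizing rule (the byte counts are made-up examples of mixed-vendor "2 TB" drives): take the smallest disk, subtract a safety margin, and round down to a MiB boundary; every member then gets a partition of exactly that size:

                ```shell
                # Hypothetical sizes (in bytes) of three mixed-vendor "2 TB" disks.
                sizes="2000398934016 2000365289472 2000409770496"

                # Smallest disk, minus a 100 MiB safety margin, rounded down to 1 MiB.
                min=$(printf '%s\n' $sizes | sort -n | head -n 1)
                margin=$((100 * 1024 * 1024))
                mib=$((1024 * 1024))
                part_bytes=$(( (min - margin) / mib * mib ))
                echo "partition each member to $part_bytes bytes"
                ```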
                Last edited by Ardje; 28 November 2018, 08:20 AM.



                • #18
                  Originally posted by Ardje View Post
                  1) which partitioning system?
                  PC partitions have a long history of being head/cylinder/sector centric. And any repair on that is just a hack.
                  A partition table is just a table that says that address X to address Y on the drive is occupied by filesystem Z.

                  Afaik there is nothing head/cylinder/sector-centric in it: you can pick specific addresses for the start and end of the partitions if you decide to assume you are working with rotational drives and a particular sector size, but nothing stops you from deciding everything arbitrarily. You may get lower performance if you don't align your filesystem's sectors with the hardware's sectors, though.

                  Btw, "sector size" is technically also a thing in NAND-based storage, as there you have erase blocks that are commonly exposed as "sectors" in the FTL.

                  There have been numerous other partitioning schemes long before the pc partition came into play.
                  As I said it's not a particularly complex thing. It's a table.

                  And now we have GPT and PC partitions on a PC, and which one are you going to take, because they are not quite compatible...
                  The main reason I chose GPT on drives that don't specifically require it (smaller than 2TB) is that GPT has a backup copy of the partition table at the end of the drive; this feature has saved my ass more than once and is also convenient for other manipulation.
                  MBR partition tables don't have that.
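
                  That backup copy is easy to see even without gdisk. A sketch on a scratch image (size and path are arbitrary): the backup GPT header sits in the very last LBA of the disk and starts with the 8-byte signature "EFI PART":

                  ```shell
                  # Fake a 16 MiB "disk" with a GPT signature in its last sector.
                  img=$(mktemp)
                  truncate -s $((16 * 1024 * 1024)) "$img"
                  last=$(( 16 * 1024 * 1024 / 512 - 1 ))   # LBA of the backup GPT header
                  printf 'EFI PART' | dd of="$img" bs=512 seek=$last conv=notrunc status=none

                  # What gdisk (or this firmware) finds: read the last LBA and
                  # check for the 8-byte "EFI PART" signature.
                  sig=$(dd if="$img" bs=512 skip=$last count=1 status=none | head -c 8)
                  echo "signature in last LBA: $sig"
                  rm -f "$img"
                  ```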

                  There are still numerous other partitioning schemes for PC if you step outside the world of windows.
                  Some examples? Afaik Linux does not have its own partitioning scheme, nor really needs one for the PC, while for embedded, in 99% of the cases you just use a fixed partition scheme set in the kernel or on the kernel command line (passed from the bootloader or whatnot).

                  But again: which type... The original PC scheme will just cost you 1k of your blocks and a lot of MB due to CHS, unless you can do LBA. But with UEFI, GPT seems to be a requirement, which also needs an EFI partition?
                  Who goddamn cares about a few MB on a drive that is multiple GB, if not TB, in size.

                  Also, UEFI requires GPT + an EFI partition only if you want to boot Windows (which is a Windows boot limitation, not UEFI per se).

                  UEFI can boot a Linux system fine even from an MBR partition (and also Windows installers, if you prepared the USB drive with tools like Rufus), as long as there is an EFI partition with the right flags where it can find the boot loaders/manager/kernel/whatever.





                  • #19
                    Looking at the ASRock forum post, I'm intrigued that the mdadm structure appears as protective MBR + damaged GPT even to gdisk. Is this normal? Heck, gdisk itself strongly recommends fixing this - it's right there in the forum post.
                    This is exactly the reason why I'd never use unpartitioned disks in soft-RAID or LVM.

                    Firmware fixing the GPT without the user's consent is sketchy, but omitting a partitioning scheme is just reckless.



                    • #20
                      Originally posted by myxal View Post
                      Looking at the ASRock forum post, I'm intrigued that the mdadm structure appears as protective MBR + damaged GPT even to gdisk. Is this normal? Heck, gdisk itself strongly recommends fixing this - it's right there in the forum post.
                      That's probably because he did not specifically wipe GPT from the drive.

                      As I said above, GPT has a backup partition table at the end of the drive, which in this case was not deleted before making the RAID and apparently was not overwritten by RAID data either.

                      The following gdisk error tells you this: the backup partition table is there and its CRC is valid; it was not wiped properly.
                      Code:
                      Warning! Main partition table CRC mismatch! Loaded backup partition table
                      This GPT feature can cause grief if the drive is re-purposed without proper wiping. The issue is described on the gdisk creator's webpage: https://www.rodsbooks.com/gdisk/wipegpt.html

                      Afaik the command wipefs -a /dev/sdX will also thoroughly nuke everything: MBR, GPT, RAID, and any filesystem magic numbers or superblocks that could trigger issues.

                      So will a full-disk wipe where you write zeros to the whole drive.
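
                      As a demo of why both ends of the drive matter, here's a sketch on a scratch image rather than a real disk (sizes and paths are arbitrary): plant an MBR signature and both GPT headers, then zero 1 MiB at each end, the same regions wipefs and the rodsbooks procedure go after:

                      ```shell
                      # Scratch 64 MiB "disk" with an MBR signature plus primary and
                      # backup GPT headers, like a disk that once carried GPT.
                      img=$(mktemp)
                      truncate -s $((64 * 1024 * 1024)) "$img"
                      last=$(( 64 * 1024 * 1024 / 512 - 1 ))
                      printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc status=none    # MBR 0x55AA
                      printf 'EFI PART' | dd of="$img" bs=512 seek=1 conv=notrunc status=none    # primary GPT (LBA 1)
                      printf 'EFI PART' | dd of="$img" bs=512 seek=$last conv=notrunc status=none # backup GPT (last LBA)

                      # Zero 1 MiB at each end. On a real disk this kills the MBR, both
                      # GPT copies, and an mdadm v1.2 superblock (which sits 4 KiB in).
                      dd if=/dev/zero of="$img" bs=1M count=1 conv=notrunc status=none
                      dd if=/dev/zero of="$img" bs=1M count=1 seek=63 conv=notrunc status=none

                      # Nothing should be left at the tail.
                      left=$(dd if="$img" bs=512 skip=$last count=1 status=none | head -c 8 | tr -d '\000')
                      echo "leftover backup signature: '$left'"
                      rm -f "$img"
                      ```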

                      nh2_ can you try wiping the backup GPT table from the drives as described above and try again, to see whether this still triggers this UEFI's "helper" functionality?

