Btrfs RAID vs. Linux Software RAID Benchmarks On Linux 4.7

  • Btrfs RAID vs. Linux Software RAID Benchmarks On Linux 4.7

    Phoronix: Btrfs RAID vs. Linux Software RAID Benchmarks On Linux 4.7

    Earlier this month I carried out some 4-disk Btrfs RAID benchmarks using four SATA 3.0 SSDs. Those tests used Btrfs's built-in RAID capabilities; today's tests compare those numbers against Linux Software RAID set up via mdadm.

    http://www.phoronix.com/vr.php?view=23435

  • waxhead
    replied
    Originally posted by starshipeleven
    Sorry but you cannot just throw this here without explaining. I'm curious. Please explain.

    I happen to be following the btrfs mailing list and I'm not seeing him claiming this, can you point out the mail?
    I know btrfs complains loudly on dmesg about I/O failures and spams logs about lost disks (which is imho ok) if a disk has been removed, but it does not make anything worse.

    Again, I cannot see this (any systemd-related issue) in the mailing list in like the last month. I get a guy with what seems like a GRUB config issue.

    I can boot fine if I remove a drive of a btrfs RAID1 array with rEFInd (for obvious reasons it's on another drive, a USB flash drive serving as UEFI partition and recovery OS partition), as it autoscans stuff on boot and isn't hard-coded like grub.

    Hell, I remember pretty clearly that Grub had issues booting even with mdadm raid if you remove the drive it is usually booting from.
    OK, the issues here were:

    1. Autodefrag has issues too...

    Here you go: https://wiki.debian.org/Btrfs

    Look under the recommendations for rotational hard disks; you will find the following:
    "Consider revoking this recommendation, because autodefrag, like -o discard, can trigger buggy behaviour. Also consider revoking the compress=lzo recommendation for rotational disks, because while it increases throughput for sequentially written compressible data, it also magnifies fragmentation...which means lots more seeks and increased latency -- NicholasDSteeves"

    2. Mailing list post where Chris Murphy says that BTRFS does not have a concept of failed drives...

    Here you go: http://www.spinics.net/lists/linux-btrfs/msg57999.html
    ...and in all fairness - they are working on it... : http://www.spinics.net/lists/linux-btrfs/msg56741.html

    3. Mailing list post where a guy is struggling to boot with systemd after one disk has gone bad...

    Well, this is embarrassing, isn't it... I am not able to find that post, so I can't back this one up with any evidence. If you *really* want it I can try to look harder.
    I can't remember the exact details in that post, but this may absolutely have more to do with the glue around btrfs than btrfs itself.

  • starshipeleven
    replied
    Originally posted by duby229
    Why would you put the boot partition on an array?
    Because I don't have any good reason not to. The boot loader can read and load it just fine. I'm of course talking of software raid.

    As far as I'm aware it can only be done on RAID1
    In the distant past, yes, you needed --metadata=0.9 on a RAID1, and it was flimsy as I said above, because GRUB was hard-coded to a single drive unless you did something to let it boot from either of the drives in the array.

    GRUB 2 is fully RAID0/1/5/6/10, LUKS and LVM-aware https://wiki.gentoo.org/wiki/GRUB2#Extended_features

    rEFInd's btrfs driver can also read btrfs RAID1 arrays (like mine) fine, but rEFInd is otherwise non-RAID-aware. http://www.rodsbooks.com/refind/drivers.html

    and is rarely ever done.
    Wrong. If people can skip useless partitioning, they do.
    Even with --metadata=0.9 you can make an mdadm RAID1 of up to 2 TB, so it never made any sense to keep /boot on a single partition just for lulz.

    Seriously man, even my NAS with a Kirkwood SoC is booting off a --metadata=0.9 RAID1 system partition.
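
    For readers following along, here is a minimal sketch of what creating such an array could look like. It only builds the mdadm command line and does not run it. The device names are hypothetical, and `0.90` is the spelling the mdadm man page lists for the old superblock format that the post calls 0.9.

```python
import shlex


def mdadm_raid1_create_argv(md_device, members, metadata="0.90"):
    """Build (but do not run) an mdadm command that creates a RAID1 array.

    md_device and members are hypothetical device paths chosen for illustration.
    """
    if len(members) < 2:
        raise ValueError("RAID1 needs at least two member devices")
    return [
        "mdadm", "--create", md_device,
        "--level=1",
        f"--raid-devices={len(members)}",
        f"--metadata={metadata}",
        *members,
    ]


argv = mdadm_raid1_create_argv("/dev/md0", ["/dev/sda1", "/dev/sdb1"])
# Print the command so it can be reviewed before anyone runs it as root.
print(shlex.join(argv))
```

    Returning an argument list instead of executing anything keeps the sketch safe to run on any machine.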

  • Ardje
    replied
    Originally posted by duby229
    I think somehow you misunderstood the intention of my post. An array of disks is intended to keep a machine functional in case of drive failure. It has nothing to do with data backup and I didn't imply that at all. Otherwise I very much agree with the rest of your post.
    The quoting is off, I know ... It was not targeted at you, but more in general ;-). But the first post about btrfs derailed into a "you are doing it wrong, you should backup", while, as you and I know, redundancy discussions are not about backup, but about uptime.
    I do hope btrfs will get to a point that I can trust personal data to it (yeah, of course with a redundant machine setup). Because the most important thing I think is the checksumming of the file data. A scrub will check the integrity of the data. And if it is raid 1 it can also repair if necessary, which makes a redundant setup also more reliable.
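
    The scrub-and-repair idea can be sketched in a few lines. This is a toy model, not how btrfs is implemented: the "mirrors" are plain Python lists, the function names are made up for illustration, and `zlib.crc32` stands in for whatever checksum the filesystem really uses.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum for the sketch; a real filesystem picks its own algorithm.
    return zlib.crc32(block)


def scrub(mirror_a, mirror_b, checksums):
    """Check every block on both mirrors against its stored checksum.

    A copy that fails verification is overwritten from the copy that passes.
    Returns a list of (block_index, status) for blocks that needed attention.
    """
    report = []
    for i, expected in enumerate(checksums):
        ok_a = checksum(mirror_a[i]) == expected
        ok_b = checksum(mirror_b[i]) == expected
        if ok_a and ok_b:
            continue
        if ok_a:
            mirror_b[i] = mirror_a[i]
            report.append((i, "repaired copy B"))
        elif ok_b:
            mirror_a[i] = mirror_b[i]
            report.append((i, "repaired copy A"))
        else:
            report.append((i, "unrecoverable"))
    return report


data = [b"alpha", b"bravo", b"charlie"]
sums = [checksum(block) for block in data]
copy_a = list(data)
copy_b = list(data)
copy_b[1] = b"brXvo"  # simulate silent corruption on one mirror

report = scrub(copy_a, copy_b, sums)
print(report)
```

    Without the stored checksums, a plain mirror would see two differing copies and have no way to tell which one is correct.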

  • duby229
    replied
    Originally posted by starshipeleven
    Sorry but you cannot just throw this here without explaining. I'm curious. Please explain.

    I happen to be following the btrfs mailing list and I'm not seeing him claiming this, can you point out the mail?
    I know btrfs complains loudly on dmesg about I/O failures and spams logs about lost disks (which is imho ok) if a disk has been removed, but it does not make anything worse.

    Again, I cannot see this (any systemd-related issue) in the mailing list in like the last month. I get a guy with what seems like a GRUB config issue.

    I can boot fine if I remove a drive of a btrfs RAID1 array with rEFInd (for obvious reasons it's on another drive, a USB flash drive serving as UEFI partition and recovery OS partition), as it autoscans stuff on boot and isn't hard-coded like grub.

    Hell, I remember pretty clearly that Grub had issues booting even with mdadm raid if you remove the drive it is usually booting from.
    Why would you put the boot partition on an array? As far as I'm aware it can only be done on RAID1 and is rarely ever done. It doesn't make sense.

  • starshipeleven
    replied
    Originally posted by waxhead
    Oh yes, I have tried both with and without. Autodefrag has its own issues too.
    Sorry but you cannot just throw this here without explaining. I'm curious. Please explain.

    Even today there was a post on the BTRFS mailing list where Chris Murphy states that BTRFS (still) does not have a concept for failed drives, and this situation (one removed drive) can introduce more problems.
    I happen to be following the btrfs mailing list and I'm not seeing him claiming this, can you point out the mail?
    I know btrfs complains loudly on dmesg about I/O failures and spams logs about lost disks (which is imho ok) if a disk has been removed, but it does not make anything worse.

    There is a guy struggling to boot with systemd where one disk has gone bad, so as I have said in numerous posts: BTRFS is cool, but they really need to get more of the very basics working.
    Again, I cannot see this (any systemd-related issue) in the mailing list in like the last month. I get a guy with what seems like a GRUB config issue.

    I can boot fine if I remove a drive of a btrfs RAID1 array with rEFInd (for obvious reasons it's on another drive, a USB flash drive serving as UEFI partition and recovery OS partition), as it autoscans stuff on boot and isn't hard-coded like grub.

    Hell, I remember pretty clearly that Grub had issues booting even with mdadm raid if you remove the drive it is usually booting from.

  • waxhead
    replied
    Originally posted by starshipeleven
    Did you use the autodefrag mount option? It should avoid fragmentation by autostarting defragging when it detects a fragmented file.
    Oh yes, I have tried both with and without. Autodefrag has its own issues too. Even today there was a post on the BTRFS mailing list where Chris Murphy states that BTRFS (still) does not have a concept for failed drives, and this situation (one removed drive) can introduce more problems. There is a guy struggling to boot with systemd where one disk has gone bad, so as I have said in numerous posts: BTRFS is cool, but they really need to get more of the very basics working.

  • duby229
    replied
    Originally posted by Ardje
    It shocks me how many people confuse uptime of a machine with backup.
    You want disk redundancy so the machine keeps running. Depending on your backup solution, it might take a month to restore a backup. Yes, I've been there, advising people to forget about backups: if the restore time is longer than the lifetime of an object on the machine, and there is no budget for machine dependency, just forget about backups, and make sure that .
    Fortunately scale has increased to afford a CDN setup, but there are a lot of cases that uptime of a machine (and hence the number of redundant disks) is more important than the backup of the machine.
    And as inclined as you are to say that the backup solution must suck, in that case you are just not as experienced with the different mass data problems as I am ;-).
    So:
    1) backups are a "total disaster happened, we must restore functionality within the next 24 hours" solution, depending on the amount of data. You actually need to test a restore procedure to see if you can restore services within the window people expect.
    2) redundancy (disk- and machine-wise): we must prevent, as economically as possible, a total disaster from happening. You must take into account that if a disaster happens, you need part of the performance of the redundant setup to restore redundancy to a higher level.
    3) archives: archives are archives (of data). They are not backups. People tend to think that backups are a form of archiving. No, if you need an archive, you need to design that into your application infrastructure. You might want to back up an archive or just use redundancy, all depending on how important the archive is. Usually archives are mostly needed by law to index and retrieve your sales records.
    What most people also tend to forget is to stamp a volatility on their data. Usually you are required by law to forget all about a person after a certain time, except for financial records, which also should be forgotten, but only after the financial record laws have expired them.

    But please, never ever mention backup again when we are talking redundancy, because they are very different things. I might calculate for a redundancy of 3 disks. Btrfs does not support that. Too bad, no btrfs for me.
    To be clear: I use btrfs myself on bcache on FCOE vn2vn. I like btrfs. And if you are using a dirvish backup, btrfs is very handy as your backup (if you have a redundant backup) filesystem, as it can clone your rsync directory, which is so much more effective than hardlinking files.
    I think somehow you misunderstood the intention of my post. An array of disks is intended to keep a machine functional in case of drive failure. It has nothing to do with data backup and I didn't imply that at all. Otherwise I very much agree with the rest of your post.

  • Ardje
    replied
    Originally posted by duby229
    I fully agree with you. An array of disks is meant so that machine stays functioning on drive failure. Data backup is something totally different. It shocks me how many people implement RAID as their only means of data protection.
    It shocks me how many people confuse uptime of a machine with backup.
    You want disk redundancy so the machine keeps running. Depending on your backup solution, it might take a month to restore a backup. Yes, I've been there, advising people to forget about backups: if the restore time is longer than the lifetime of an object on the machine, and there is no budget for machine dependency, just forget about backups, and make sure that .
    Fortunately scale has increased to afford a CDN setup, but there are a lot of cases that uptime of a machine (and hence the number of redundant disks) is more important than the backup of the machine.
    And as inclined as you are to say that the backup solution must suck, in that case you are just not as experienced with the different mass data problems as I am ;-).
    So:
    1) backups are a "total disaster happened, we must restore functionality within the next 24 hours" solution, depending on the amount of data. You actually need to test a restore procedure to see if you can restore services within the window people expect.
    2) redundancy (disk- and machine-wise): we must prevent, as economically as possible, a total disaster from happening. You must take into account that if a disaster happens, you need part of the performance of the redundant setup to restore redundancy to a higher level.
    3) archives: archives are archives (of data). They are not backups. People tend to think that backups are a form of archiving. No, if you need an archive, you need to design that into your application infrastructure. You might want to back up an archive or just use redundancy, all depending on how important the archive is. Usually archives are mostly needed by law to index and retrieve your sales records.
    What most people also tend to forget is to stamp a volatility on their data. Usually you are required by law to forget all about a person after a certain time, except for financial records, which also should be forgotten, but only after the financial record laws have expired them.

    But please, never ever mention backup again when we are talking redundancy, because they are very different things. I might calculate for a redundancy of 3 disks. Btrfs does not support that. Too bad, no btrfs for me.
    To be clear: I use btrfs myself on bcache on FCOE vn2vn. I like btrfs. And if you are using a dirvish backup, btrfs is very handy as your backup (if you have a redundant backup) filesystem, as it can clone your rsync directory, which is so much more effective than hardlinking files.
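
    The restore-window point in 1) comes down to simple arithmetic, sketched below. The data size, throughput and window are made-up example numbers, not figures from this thread, and the function names are hypothetical. A real restore also has overheads (seeks, small files, verification) that this ignores.

```python
def restore_hours(data_tib: float, throughput_mib_s: float) -> float:
    """Hours needed to copy data_tib TiB back at a sustained rate in MiB/s."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    total_mib = data_tib * 1024 * 1024  # TiB -> MiB
    return total_mib / throughput_mib_s / 3600  # seconds -> hours


def fits_window(data_tib: float, throughput_mib_s: float, window_hours: float) -> bool:
    """True if the estimated restore finishes inside the promised window."""
    return restore_hours(data_tib, throughput_mib_s) <= window_hours


# Assumed example: 20 TiB of data, 100 MiB/s sustained restore, 24-hour window.
hours = restore_hours(20, 100)
print(f"estimated restore time: {hours:.1f} h")
print("fits 24 h window:", fits_window(20, 100, 24))
```

    An estimate like this is only a lower bound, which is why the restore procedure still needs an actual test run.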

  • dimko
    replied
    very interesting article. thank you.
