GRUB Now Supports Btrfs 3/4-Copy RAID1 Profiles (RAID1C3 / RAID1C4 On Linux 5.5+)


  • waxhead
    replied
    Originally posted by Zan Lynx
    RAID1 is worse for parallel reads because the chunks are not adjacent. If you are using 64 KB chunks and 4 stripes then disk 1 reads [0-64K], [256K-320K], disk 2 [64K-128K], [320K-384K], etc, etc. You are wasting 1/4th of your drive read bandwidth.
    Got it. For traditional RAID you are right, but in the BTRFS case you have to remember that it is a filesystem and volume manager in one package. This means that in theory nothing (except the offset on disk) stops you from "stripe reading" from a 4-copy RAID1 *if* the disks are idle, or from letting them service other things based on priority, such as writes and/or other reads, since BTRFS is a filesystem that knows what file it is reading.

    It all depends on the workload. For sequential reads RAID10 usually performs well, but with (BTRFS') RAID1c3/4 the filesystem could basically tune itself to boost either random reads/writes or sequential reads, as it is free to choose what to use the disks for depending on the workload. E.g. it could read more data from one drive and less from another depending on workload or even disk speed.
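
    To make the "read more from one drive and less from another" idea concrete, here is a minimal sketch of picking a mirror by estimated wait time. This is not how btrfs chooses a mirror; the `Mirror` class, its fields and the cost formula are all hypothetical and only illustrate the kind of decision a filesystem-level RAID1c3/4 could make.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    """One device holding a copy of the block (hypothetical model, not a btrfs structure)."""
    name: str
    queued_requests: int   # I/O requests already waiting on this device
    avg_latency_ms: float  # typical time to service one request


def estimated_wait_ms(mirror):
    # Assumed cost model: everything already queued plus our own request.
    return (mirror.queued_requests + 1) * mirror.avg_latency_ms


def pick_mirror(mirrors):
    """Return the copy expected to answer soonest under the assumed cost model."""
    return min(mirrors, key=estimated_wait_ms)


copies = [
    Mirror("busy-hdd", queued_requests=4, avg_latency_ms=8.0),
    Mirror("idle-hdd", queued_requests=0, avg_latency_ms=8.0),
    Mirror("busy-ssd", queued_requests=1, avg_latency_ms=0.1),
]
print(pick_mirror(copies).name)
```

    With more copies there are simply more candidates to choose from, which is where 3 or 4 mirrors would help.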

    Of course BTRFS is not that advanced (yet), but because it is not block-layer RAID it has endless possibilities, at least theoretically. Whether it will ever get features like this in practice remains to be seen. The same should (or possibly could) be true for similar filesystems such as bcachefs or ZFS.

    Yes, I understand the problem with adjacent blocks, but again BTRFS is in a position where it is free to choose an optimization strategy based on what it knows. And if I am not mistaken, most disks use address translation, which does not necessarily map logical addresses directly to physical addresses, so the chunks are probably not adjacent anyway. That would introduce a potential small seek delay individually for each disk in the pool.


  • starshipeleven
    replied
    Originally posted by Zan Lynx
    RAID1 is worse for parallel reads because the chunks are not adjacent. If you are using 64 KB chunks and 4 stripes then disk 1 reads [0-64K], [256K-320K], disk 2 [64K-128K], [320K-384K], etc, etc. You are wasting 1/4th of your drive read bandwidth.
    I'm not sure I understand what you are doing here.

    You have 4 copies of the same data (the 4 "stripes" mean this is a RAID1c4; a classic RAID1 has only 2 "stripes" when reading, afaik) with only 2 drives. That is superfluous, as you have no use for 2 copies of the data on the same drive in a multi-drive array.

    In the best case (btrfs detects this and reads only 2 stripes instead of 4) you are not really getting any performance change; in the worst case (btrfs does not detect this) you tank performance.

    I was implicitly assuming each drive in the array has AT MOST a single full copy of the array's data, possibly less.
    i.e. 2+ drives with a RAID1, 3+ drives with a RAID1c3, 4+ drives with a RAID1c4.
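
    The drive counts above follow directly from the number of copies each mirroring profile keeps. A small sketch, assuming equal-sized drives for the capacity figure; the helper names are made up for illustration:

```python
# Copies kept per block for each btrfs mirroring profile.
COPIES = {"raid1": 2, "raid1c3": 3, "raid1c4": 4}


def min_drives(profile):
    """Each copy must live on a different drive, so this equals the copy count."""
    return COPIES[profile]


def tolerated_losses(profile):
    """Drives that can fail while at least one copy of every block survives."""
    return COPIES[profile] - 1


def usable_fraction(profile):
    """Share of raw capacity left for data, assuming equal-sized drives."""
    return 1 / COPIES[profile]


for name in COPIES:
    print(name, min_drives(name), tolerated_losses(name), usable_fraction(name))
```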

  • Zan Lynx
    replied
    RAID1 is worse for parallel reads because the chunks are not adjacent. If you are using 64 KB chunks and 4 stripes then disk 1 reads [0-64K], [256K-320K], disk 2 [64K-128K], [320K-384K], etc, etc. You are wasting 1/4th of your drive read bandwidth.
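
    The offsets in that example can be reproduced with a few lines. This is only a sketch of a plain round-robin striped layout with the numbers from the post (64 KB chunks, 4 stripes); `stripe_ranges` is a made-up helper, not anything from btrfs.

```python
KIB = 1024


def stripe_ranges(disk, chunk_size=64 * KIB, stripes=4, rows=2):
    """Byte ranges served by `disk` (0-based) for the first `rows` stripe rows."""
    ranges = []
    for row in range(rows):
        # Chunks are dealt out round-robin: row 0 fills disks 0..3, row 1 starts over.
        start = (row * stripes + disk) * chunk_size
        ranges.append((start, start + chunk_size))
    return ranges


for disk in range(4):
    in_kib = [(start // KIB, end // KIB) for start, end in stripe_ranges(disk)]
    print(f"disk {disk + 1}: {in_kib}")
```

    Each disk touches one 64 KB chunk and then skips ahead 192 KB to its next one, which is the gap being described.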

  • waxhead
    replied
    Originally posted by starshipeleven
    Would be interesting to see a performance comparison. RAID10 will of course scale better as with each drive pair you add things will change, but for small and medium sized setups these two new profiles should be competitive.
    I actually think that BTRFS "RAID"1c4 could in many scenarios scale better than RAID10, whether that is BTRFS' native implementation or regular RAID10. In RAID10-like setups you have to make half the storage devices each read their part of the stripe at the same time, but with "RAID"1c3/4 you still only need to read from one device, leaving the other devices free to service other reads or writes. This could be especially useful for storage devices with higher latency, such as hard drives or even network block devices.

    Again, the workload would matter immensely, but theoretically it could rival RAID10 on performance and smoke it on reliability as well. Any RAID10 setup can potentially fail if you are unlucky enough to lose the "right" two disks.

  • starshipeleven
    replied
    Originally posted by Zan Lynx
    RAID10 already does multiple reads in btrfs.
    Yes, but we are talking about RAID1 and RAID1-like (i.e. pure mirroring) setups with more than 2 copies, which is a different setup from RAID10.

    RAID10 is mirroring and striping, and is "nested" RAID. Its fault tolerance is weird: it depends on the array size (the bigger the array, the more drives it can lose; you can always lose a single drive and be fine) and on some luck, while with a RAID1C4 profile you will always be able to lose up to 3 drives and be fine.
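
    The "some luck" part can be put into numbers. The sketch below assumes a classic RAID10 built from fixed two-drive mirror pairs and drives failing at random; btrfs' own raid10 profile allocates per chunk, so it may not behave the same way. Function names are made up for illustration.

```python
from fractions import Fraction
from math import comb


def raid10_survival(drives, failed):
    """Chance a classic RAID10 (fixed mirror pairs) survives `failed` random drive losses."""
    if drives < 4 or drives % 2:
        raise ValueError("expected an even drive count of at least 4")
    pairs = drives // 2
    if failed > pairs:
        return Fraction(0)  # some pair must have lost both halves
    # Survival needs every failed drive in a different pair:
    # pick which pairs are hit, then which half of each pair died.
    return Fraction(comb(pairs, failed) * 2 ** failed, comb(drives, failed))


def raid1c4_always_survives(failed):
    """RAID1C4 keeps 4 copies on 4 different drives, so up to 3 losses are always safe."""
    return failed <= 3


for drives in (4, 6, 8):
    odds = [str(raid10_survival(drives, k)) for k in (1, 2, 3)]
    print(f"{drives}-drive RAID10, 1/2/3 failures:", odds)
```

    Under these assumptions a six-drive RAID10 survives two failures in 4 out of 5 cases and three failures in 2 out of 5, while a single failure is always fine.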

    Would be interesting to see a performance comparison. RAID10 will of course scale better as with each drive pair you add things will change, but for small and medium sized setups these two new profiles should be competitive.

  • Zan Lynx
    replied
    RAID10 already does multiple reads in btrfs. A six drive RAID10 will read three drives at a time.

  • starshipeleven
    replied
    Originally posted by pal666
    it should be possible to distribute chunks (and reads) to all drives with any redundancy level
    Yes, but if you have 3 or 4 copies of the same data you can read from 3 or 4 drives at the same time instead of just 2 (as with normal RAID1).

    Because no matter how you distribute them, you will have only 2 copies of the data you can read from in RAID1.

  • pal666
    replied
    Originally posted by waxhead
    Apart from improved redundancy, the really cool thing about BTRFS RAID1c4, for example, will in time be that you get an impressive improvement for parallel reads once btrfs learns to distribute reads to the least busy/fastest device in the storage pool
    it should be possible to distribute chunks (and reads) to all drives with any redundancy level

  • waxhead
    replied
    Apart from improved redundancy, the really cool thing about BTRFS RAID1c4, for example, will be that you get an impressive improvement for parallel reads in the future, when btrfs learns to distribute reads to the least busy/fastest device in the storage pool.
    Last edited by waxhead; 07 December 2019, 12:54 PM.

  • starshipeleven
    replied
    There is also a "btrfs will write to a degraded array without creating SINGLE chunks" change in the works, so that when you replace the dead drive you don't need to do a balance before the scrub.
    https://patchwork.kernel.org/cover/11231915/
