4-Disk Btrfs Native RAID Performance On Linux 4.10


  • Zan Lynx
    replied
    Originally posted by starshipeleven View Post
    Unless Intel in their great wisdom decides to add the hotswap feature to the new PCIe revision. After all, we have already reached a level where the bandwidth is effectively ridiculous even for GPUs, so they do need a good reason to keep newer revisions interesting.
    PCIe already does support hot plugging. The problem is that it doesn't work well if you just yank it out. It's like a USB flash drive. If you yank it out while it's writing, who knows what will happen.

    The operating system needs to support it by providing some kind of GUI (I think there are already command-line tools in Linux and Windows) that shows the devices visually, so you pick the correct one, plus buttons to turn it off. A fancy server motherboard might have status lights and maybe a button or switch to notify the OS that you're about to pull a card.

    This has some details: https://electronics.stackexchange.co...rk-in-practice

    If you've ever played with a Microsoft Surface Book, you will have seen their clever hardware / software latch that connects the screen and the base. The base has an Nvidia GPU in it and it's connected with PCIe. The switch means that it won't physically disconnect until the software is ready for it.
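
    On the Linux side, by the way, the low-level part of that "turn it off first" step is just sysfs writes; roughly something like the following (only a sketch, the device address is made up - check lspci -D for your own, and it needs root):

        # Minimal sketch, not a polished tool: ask Linux to detach a PCIe device
        # before you pull it, then rescan the bus after plugging the new one in.
        # The sysfs paths are the kernel's standard PCI interface; the address
        # below is an example only.

        DEVICE = "0000:03:00.0"  # example address, find yours with `lspci -D`

        def pci_remove(addr: str) -> None:
            # Writing "1" to .../remove unbinds the driver and drops the device
            # from the PCI tree, so nothing is talking to it when you pull it.
            with open(f"/sys/bus/pci/devices/{addr}/remove", "w") as f:
                f.write("1")

        def pci_rescan() -> None:
            # Writing "1" to /sys/bus/pci/rescan re-enumerates the bus and
            # picks up whatever is plugged in now.
            with open("/sys/bus/pci/rescan", "w") as f:
                f.write("1")

        if __name__ == "__main__":
            pci_remove(DEVICE)
            input("Swap the card, then press Enter... ")
            pci_rescan()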



  • Zucca
    replied
    Originally posted by starshipeleven View Post
    Kernel 4.12 does not "exist" yet; it's still a pile of patches on top of 4.11.
    I kinda meant that there isn't a fix yet, unless you backport/patch it yourself.

    I can wait for 4.12, since I currently use btrfs-raid10.



  • starshipeleven
    replied
    Originally posted by Zucca View Post
    Isn't the bugfix supposed to be in 4.12 onwards?
    Kernel 4.12 does not "exist" yet; it's still a pile of patches on top of 4.11. https://www.kernel.org/

    He applied the same patches on top of the newest existing kernel.



  • Zucca
    replied
    Originally posted by starshipeleven View Post
    Seems like there is a guy on the mailing list reporting that btrfs on kernel 4.11-rc8 is able to correctly scrub and fix (manually induced) corruption on a RAID5 array. http://www.spinics.net/lists/linux-btrfs/msg64917.html
    Isn't the bugfix supposed to be in 4.12 onwards?



  • starshipeleven
    replied
    Seems like there is a guy on the mailing list reporting that btrfs on kernel 4.11-rc8 is able to correctly scrub and fix (manually induced) corruption on a RAID5 array. http://www.spinics.net/lists/linux-btrfs/msg64917.html
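
    If anyone wants to poke at the scrub side on their own array, it boils down to something like this (just a rough sketch; /mnt/raid5 is an example mount point, run it as root):

        # Rough sketch only: start a btrfs scrub, wait for it, print the summary.
        # /mnt/raid5 is an example mount point - point it at your own array.
        import subprocess

        MNT = "/mnt/raid5"

        # -B keeps the command in the foreground until the scrub has finished.
        subprocess.run(["btrfs", "scrub", "start", "-B", MNT], check=True)

        # Shows how many checksum/read errors were found and repaired.
        status = subprocess.run(["btrfs", "scrub", "status", MNT],
                                capture_output=True, text=True, check=True)
        print(status.stdout)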



  • starshipeleven
    replied
    Originally posted by Zan Lynx View Post
    You can get a PCIe 16x card that has slots for 4 M.2 cards.
    Oh boy, these things have "badass" written all over them. https://www.servethehome.com/the-del...urbo-quad-pro/
    The HP one retails for like 800$ for 512GiB (only 300$ for the 256GiB version) and I'm 100% sure that it won't work with SSDs that aren't from HP.

    Still, I think they do some weird trick like acting as a single device (i.e. they are basically a RAID card, and the mobo is talking to this card, not to the SSDs), as afaik it is a total PITA to split PCIe while still getting truly high performance. PLX chips are what is usually used to split PCIe, but an SSD isn't a GPU: an SSD loaded by company-grade applications will saturate the fuck out of its PCIe lanes for a long time, so the PLX switch must be truly badass to do true high-performance splitting with that kind of load. GPUs usually produce spiky loads.

    If you are not a company looking for a deep hole to dump money in, you'd better look for cheap server boards that have buckets of PCIe slots, like this http://www.asrockrack.com/general/pr...Specifications with 2 x PCIe 3.0 x16, 4 x PCIe 3.0 x8 and 1 x PCIe 2.0 x8, which retails for around 300$.
    That board can fit 7 PCIe SSDs, still has 10 SATA ports, and has the other usual server board features like a hardware backd... er, I mean lights-out remote management system, dual gigabit ethernet, registered ECC RAM and all that jazz.
    And it's not using PLX chips, those are all native ports.

    Hot swap bays for NVMe cards have to be designed for that.
    Easy, the "hotswap bay" must be a relatively smart device that intercepts these signals and fakes them good enough to close the connection safely or re-start it if a new device is connected. Basically the hardware/OS will be talking to this smart-ish device and not to the SSD itself.

    Buy a PCIe card to install another PCIe card so you can hotswap while you are hotswapping.

    Unless Intel in their great wisdom decides to add the hotswap feature to the new PCIe revision. After all, we have already reached a level where the bandwidth is effectively ridiculous even for GPUs, so they do need a good reason to keep newer revisions interesting.
    Last edited by starshipeleven; 05 February 2017, 07:11 PM.



  • Zan Lynx
    replied
    Originally posted by Zucca View Post
    *sigh*
    I'd prefer almost anything standardized, but not drives attached directly to the motherboard. Why? Because I like to have redundancy, meaning multiple drives in one PC that you can expand just by replacing (or even hotswapping) a drive and synchronizing/rebuilding/balancing the data afterwards.
    For now I use SATA only. M.2 could be used as a boot and/or cache drive, although I don't know if it's worth it as a cache when all my drives are already SATA SSDs.
    M.2 is great even for what you want. You can get a PCIe 16x card that has slots for 4 M.2 cards. There are even server designs intended for 48 NVMe hotswap drives. Supermicro makes one: https://www.supermicro.com/products/...028R-NR48N.cfm

    Doing hot swap on non-server hardware is a lot more difficult, because consumer-level desktops and workstations don't expect PCIe devices to come and go. In some cases it has been made to work, but it isn't common. Microsoft's Surface Book has a hot-swap PCIe bus for the Nvidia graphics in the base. The Razer Core Thunderbolt 3 graphics dock is another one. Power has to be safely disconnected. The OS has to be instructed to shut down the device, or, if it is randomly disconnected, it has to handle the stray signals that were in progress at disconnect, such as half-finished DMA transfers or interrupts that were sent but now have no device to respond to.

    Hot swap bays for NVMe cards have to be designed for that.



  • Zucca
    replied
    Originally posted by starshipeleven View Post
    If you want to cry now, I'll understand.
    *sigh*
    I'd prefer almost anything standardized, but not drives attached directly to the motherboard. Why? Because I like to have redundancy, meaning multiple drives in one PC that you can expand just by replacing (or even hotswapping) a drive and synchronizing/rebuilding/balancing the data afterwards.
    For now I use SATA only. M.2 could be used as a boot and/or cache drive, although I don't know if it's worth it as a cache when all my drives are already SATA SSDs.
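
    The replace-and-rebalance part I mean is roughly the following (just a sketch, the device names and mount point are made up; run as root):

        # Rough sketch of the swap-a-drive-and-resync workflow on btrfs.
        # Device names and mount point are examples only.
        import subprocess

        OLD, NEW, MNT = "/dev/sdb", "/dev/sdd", "/mnt/pool"

        def run(*cmd: str) -> None:
            print("+", " ".join(cmd))
            subprocess.run(cmd, check=True)

        # Rebuild the data from the old (or missing) device onto the new one;
        # -B stays in the foreground until the copy is done.
        run("btrfs", "replace", "start", "-B", OLD, NEW, MNT)

        # Optionally rebalance so data/metadata are spread evenly over all
        # devices again.
        run("btrfs", "balance", "start", MNT)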



  • starshipeleven
    replied
    Originally posted by Zucca View Post
    I hope NVMe is the future and that eventually we'll get ONE standard connector for it.
    The current situation with SATA, SAS, SATA Express, whatever - is a mess. Many protocols, many connectors. I haven't even bothered to count them all.
    FYI, NVMe runs over PCI Express lanes (the "e" means "Express"), so you get it either through normal PCI Express slots, or through M.2/NGFF ports that expose PCIe lanes (the M-keyed M.2 connector seems to be the de-facto standard in PCs and laptops because it provides x4 PCIe lanes).

    Or through Thunderbolt (as it exposes PCIe lanes too), and there is work on an optical-fibre interface for it as well.

    If you want to cry now, I'll understand.
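
    If you're curious what link your own NVMe drive actually negotiated, sysfs already exposes it; something like this (just a sketch, read-only):

        # Quick read-only sketch: print the PCIe link speed/width each NVMe
        # controller actually negotiated, straight from sysfs.
        import glob, os

        for ctrl in sorted(glob.glob("/sys/class/nvme/nvme*")):
            # "device" links the controller to its underlying PCI device node.
            pci_dev = os.path.realpath(os.path.join(ctrl, "device"))

            def read(attr: str) -> str:
                try:
                    with open(os.path.join(pci_dev, attr)) as f:
                        return f.read().strip()
                except OSError:
                    return "?"

            print(os.path.basename(ctrl),
                  "-", read("current_link_speed"),
                  "x" + read("current_link_width"))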



  • starshipeleven
    replied
    Originally posted by jacob View Post
    You just keep missing my point. This is all irrelevant.
    No, it is relevant. The block layer and the physical layer are disjoint: the block layer is a standardized abstraction, the physical layer depends on the actual storage technology.
    It's you who keeps thinking that the type of storage device has any effect on how its block layer works.
    1. When the block layer initiates a transfer of the 16 blocks, will the SSD indeed send 16 blocks or will it only send blocks A to A+3 (the first PHYSICALLY contiguous extent), after which the OS will have to submit a second DMA request for blocks A+4 to A+7 and so forth?
    It will act like any other block device. If the blocks are contiguous they will be transferred together; if they are not contiguous, they will need more DMA requests.

    But... it's the filesystem (the OS) that decides whether the blocks written at the block layer are contiguous or not when it writes stuff down, so again I don't understand why you think an SSD should act differently.

    2. If the answer to question 1 is yes, that is the SSD can transfer 16 blocks in one op even if they are stored on PHYSICALLY DISJOINT memory cells, is there a performance penalty in this case (slower throughput) compared to if they were all in physically contiguous cells?
    Dunno, what "random access memory" means to you? I'm not throwing it at random.

    It means seek times are the same for any read on any cell. So it doesn't matter whether the cells are contiguous or not: performance is the same.

    In fact, since flash chips themselves don't have terribly fast read/write speeds, you will actually get a performance penalty if the memory cells are contiguous (or disjoint but on the same chip). SSDs are fast because they spread the cells across different chips.
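
    If you want to see that contiguity decision for yourself, the filesystem will happily show you the extent layout it chose, and that list is exactly what determines how many requests a big read needs. Roughly (a sketch using filefrag from e2fsprogs; the default path is just an example):

        # Rough sketch: show how many contiguous extents a file was actually
        # written in. Each physically contiguous extent can go out as one
        # request; a fragmented file needs more of them - and that layout is
        # the filesystem's decision, whether the device is an SSD or an HDD.
        import subprocess, sys

        path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"  # example file

        out = subprocess.run(["filefrag", "-v", path],
                             capture_output=True, text=True, check=True).stdout
        print(out)

        # In `filefrag -v` output, each extent line starts with its index number.
        extent_lines = [line for line in out.splitlines()
                        if line.split(":")[0].strip().isdigit()]
        print(f"{path}: {len(extent_lines)} extent(s)")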

