21610SA for software raid?

  • #11
    Originally posted by npcomplete
    The thing about Windows benchmarks with fakeraid controllers is that the performance is highly dependent on the driver, and NVIDIA's and Intel's Windows software RAID5 implementations simply suck -- compare those to HighPoint's RocketRAID 2300, which is also fakeraid.

    Anyway, I'd try out the 21610SA from a place where it can be returned if you have problems. BTW, it was good to know this card existed... I wasn't aware such a card existed for this cheap. 16 ports on PCI-X 133MHz is probably OK IMO, since hard drives under anything but pure sequential I/O are still pretty darn slow.

    An alternative is two Supermicro AOC-SAT2-MV8 cards; definitely the card to get for PCI(-X). Each is $97 at Newegg.




    I've never used a real HW-based card on Linux, so I'm curious as to why this is. Is it an issue with Adaptec's driver or something else?
    Well, this is a real hardware RAID card, so the benchmarks (which I've read for both Unix and Windows) are purely hardware-based. I dug a bit deeper, and apparently it's that they use a pretty crappy RAID chip. I should qualify that with the fact that the performance wasn't horrid, just not great, and not better than software RAID. You get what you pay for, I guess.

    I saw that Supermicro card, which was my other option, but it requires about twice the money and twice the PCI slots.

    Thanks for your input!

    Comment


    • #12
      The truth of the matter is that with hardware RAID they are not using RAID-specific processors.

      They are using generic embedded-style processors in those controllers, and it's very unlikely that those sorts of chips will come close to the level of performance you'd get from a modern CPU.

      That is to say, Linux software RAID can readily outperform cheap hardware RAID. Get it?

      It's much more cost-effective to spend the money on a faster CPU or more CPU cores than it is to spend it on cheap hardware RAID.
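      To illustrate (just a minimal sketch -- the device names /dev/sdb1 through /dev/sde1 are placeholders, not anything from this thread), setting up Linux software RAID is basically a one-liner with mdadm, and the parity math runs on the host CPU:

        # Minimal software RAID5 example; substitute your own spare partitions.
        mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
        cat /proc/mdstat          # watch the initial build/resync
        mdadm --detail /dev/md0   # check array state and health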

      ---------------------------------------

      Also: there is no such thing as cheap hardware RAID. If it's cheap, it's fakeraid.

      ----------------------------------------

      Fakeraid is a Windows thing. You don't want it. It exists because people want to run RAID on cheap hardware, but Microsoft cripples its operating systems so that software RAID is disabled except on server versions. So manufacturers filled the need with proprietary software RAID of their own that gets installed as 'RAID drivers'.

      The biggest reason to want fakeraid support in Linux is if you need to access Windows file systems that live on a fakeraid array.

      ---------------------------------------------


      The real killer here with software RAID isn't CPU performance or overhead; it's I/O performance.

      x86 hardware is cheap shit. It has very fast CPUs, but otherwise it's all cheap and slow. And unreliable. That's what makes it cheap. That's why ext3 is designed the way it is... it's trying to make unreliable hardware somewhat reliable.

      Expensive hardware like IBM mainframes doesn't have this problem. Mainframes have relatively slow processors, because software licensing for mainframes is based on CPU performance for perverse historical reasons. So people buy processors as slow as they think they can get away with; in fact they will intentionally disable processors in their hardware to make software licensing and support costs cheaper. However, they have I/O performance on a massive scale. It's all about the I/O. All of that is very expensive, and it's one of the reasons people pay that much. (The other reasons include very high-quality virtualization -- at least 10-20 years ahead of VMware -- and the ability to support software that has been under active development for over 30 years.)


      Now... this I/O limitation in the cheap commodity hardware that we all use is the killer when it comes to software RAID.

      This is why nobody who knows what they are doing tries to stuff 20 hard drives into one chassis and expects good performance. You can have 4 CPU cores and that will probably be plenty, but your system will be bottlenecked on the PCI bus.

      One of the things that hardware RAID can do for you is limit the amount of data that travels over your PCI bus. All the calculations get performed locally on the card, so the bus only sees the I/O coming in and out of the RAID card, not the I/O needed to run the RAID itself.

      So that's nice: you can purchase real hardware RAID and stuff lots and lots of drives into a server without saturating your PCI bus with I/O.

      This is a positive aspect you're not going to see with fakeraid, BTW. Fakeraid is generally bad all around.

      Of course it's going to cost you. And cost you a lot. I don't expect that a decent RAID card is going to be cheaper than 250 bucks or so.

      It's all very expensive and irritating.

      ---------------------------------


      One of the nicer aspects of modern hardware is the PCI Express bus. It's a serial technology, so I/O won't saturate the bus, and it has a LOT more I/O speed than old PCI buses.

      So if I were insane and wanted to stuff as many drives into a single machine as cheaply as possible, I would probably look at purchasing a computer with a motherboard that has as many PCI Express slots as possible.

      You can get boards that have 3 or 4 PCI Express slots: 2 for graphics and one or two x1 slots, or whatever. I'd then stuff those with 4-port cards if the onboard graphics will allow it. So if you're lucky you can get 4 PCIe slots and 16 drives that way.

      Then lots of motherboards have lots of SATA ports. I've seen boards with as many as 7 SATA ports.

      Reasonably you can expect 4. And you should make sure that the drive controllers are PCI Express and that they support Linux.

      And you'll want 2 dual-core CPUs for 4 cores.

      Big bonus points if they support AHCI.

      So probably 20 drives.

      So... the configuration would go something like this:

      2x250GB drives for the OS and applications. They are a mirrored RAID, with careful attention paid to making sure that the bootloader will work on either drive.

      Then the rest of the drives are 1TB.

      2 drives are hot spares for failover. They are used automatically if another drive craps out.

      That leaves 16 1TB drives. They go in RAID 10, so you end up with 8TB worth of storage.
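      With Linux software RAID the big array might look roughly like this (a sketch only; the device names are invented for the example):

        # 18 drives total: 16 active RAID10 members plus 2 hot spares.
        mdadm --create /dev/md1 --level=10 --raid-devices=16 --spare-devices=2 /dev/sd[c-t]1
        cat /proc/mdstat   # spares sit idle until a member fails, then rebuild automatically

      With a real hardware RAID card the equivalent layout would be configured in the card's own management tool instead.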


      ---------------------------------

      I'd never do that, of course.

      Personally I would rather go with multiple commodity Linux boxes that are dedicated to storage than try to stuff a bunch of drives into one big expensive box.

      This way you can take advantage of redundancy to get HA working and get better network performance.

      You can have a dedicated storage network with gigabit Ethernet and high-quality switches that support trunking and jumbo frames. Then I'd share storage from the dedicated Linux storage boxes to the server boxes over iSCSI, using a cluster-aware FS like GFS or OCFS2.
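      The initiator side of that is pretty simple with open-iscsi (a sketch; the IP address and target name here are made up):

        # Discover the targets exported by a storage box and log in to one.
        iscsiadm -m discovery -t sendtargets -p 192.168.10.10
        iscsiadm -m node -T iqn.2008-01.com.example:storage.vol1 -p 192.168.10.10 --login
        # The LUN then shows up as an ordinary local block device, which you can
        # format with a cluster-aware FS like GFS or OCFS2 and mount on several servers.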

      Linux is unfortunately crippled by bad volume management, which is why people don't do this more.


      ----------------------------------


      You want to avoid very large volumes. You should use LVM, divide the storage up into slices, and leave lots and lots of unused space. This way you can dynamically allocate storage to volumes as it's needed.

      Unfortunately the reality is that large volumes take much longer to recover, so make them small at first and allocate space on an as-needed basis.
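      Something like this, as a sketch (the volume group and LV names are just examples):

        # Layer LVM on top of the array, hand out a small volume, keep the rest unallocated.
        pvcreate /dev/md1
        vgcreate storage /dev/md1
        lvcreate -L 200G -n projects storage
        # Grow only when the volume actually fills up, then grow the filesystem:
        lvextend -L +100G /dev/storage/projects
        # (xfs_growfs for XFS, resize2fs for ext3)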

      -----------------------------------

      Oh, and between storage capacity and backups, spend much more money on backups. Backups are much more important than capacity.

      At my work we had a very expensive IBM server. It had 2 hardware RAID5 arrays, mirrored, with a hot spare for failover. A total of 10 drives.

      You could lose a grand total of 4 drives without experiencing any data loss.

      One night the air conditioning failed. 8 drives died more or less simultaneously. 80% failure rate, 100% loss of data.

      There were backups, but they were in the form of MSQL backups and were protected by a key. The only person who had the key was on vacation. On a cruise. In the Caribbean. For a week.

      Not pretty.

      ------------------------------------



      Oh and a few notes on RAID, either hardware or software:

      RAID 5 is obsolete. For small stuff it's fine; for important stuff it's not. RAID 5 became popular because when you have small drives it takes relatively little time to rebuild an array, so it offered acceptable protection and was cheap.

      Modern drives are much, much larger, but not much faster. The time to rebuild a 5-drive RAID5 array with a single drive failure, if you're using 20GB SCSI drives, can be measured in minutes. The time to rebuild a RAID 5 array with five 1000GB SATA drives and a single failure can be measured in hours. The likelihood of a second drive failure during the rebuild is very high, and performance will be bad while it's being rebuilt.


      So you want RAID 10 or RAID 6. Much better.


      ------------------------------------



      This page is very cool:


      It lists SATA controllers and tells you which are fakeraid and which are real RAID. You want real RAID. It's a very valuable resource. Very good.


      This helps get the point across, too.

      Comment


      • #13
        Originally posted by drag
        Fakeraid is a Windows thing. You don't want it. It exists because people want to run RAID on cheap hardware, but Microsoft cripples its operating systems so that software RAID is disabled except on server versions.
        XP Pro and 2000 both had software RAID. End users tended to use the fakeraid controllers instead, so it was dropped from Vista.

        Comment


        • #14
          really thanks for sharing it~

          Comment


          • #15
            Originally posted by drag
            One of the nicer aspects of modern hardware is the PCI Express bus. It's a serial technology, so I/O won't saturate the bus, and it has a LOT more I/O speed than old PCI buses.
            What matters is that it's point-to-point, like an ethernet switch, not shared-bus like an old ethernet hub. (Of course, being serial makes this possible without way too many traces, and pins on the chipset, since the chipset has a separate link to each PCIe slot.)

            So you can get boards that have 3 or 4 PCI Express slots: 2 for graphics and one or two x1 slots, or whatever. I'd then stuff those with 4-port cards if the onboard graphics will allow it.
            Recent desktop chipsets typically have 6 SATA ports, BTW. Even G965 has 6 SATA ports, and that's 2 years old. I've seen server boards with 8 SAS ports (which can be used as SATA. You can connect SATA drives to SAS ports, but not vice versa.)

            A lot of boards have 3 or more PCIe x1 slots, and yes, you can put PCIe x4 (or anything) SATA controllers into PCIe x16 slots. Note that Intel's IGPs shut down if you use the main PCIe x16 slot, since they dual-purpose it as SDVO/ADD2 (non-PCIe signals that carry DVI to an add-in card which is not a video card but just a connector expansion for the IGP).

            16 SATA port cards that go in a PCIe x8 or x4 slot are available, although they might be more expensive than multiple x1 4 port cards. They would be hardware RAID and would have e.g. 256MB of onboard cache with (optional) battery backing. e.g. 3Ware makes these. http://www.3ware.com/products/serial_ata2-9650.asp. Also Dell's PERC6 (rebadged LSI) cards that they sell with their poweredge servers are just PCIe cards, although the PERC6/e has multilane SAS connectors for hooking up to an external MD1000 (3U, 15 disks). All these big HW RAID controllers will support JBOD mode, where the driver makes each drive visible to Linux, instead of combining them into RAID arrays that the driver tells Linux about.

            3ware's cards are PCIe x4 up to 8 SATA ports, and PCIe x8 for up to 12, 16, and 24 SATA ports. The larger cards have multi-lane connectors, so you might have to buy more expensive cabling. And BTW, 3ware has good Linux drivers last time I checked.

            The 16 port version is 1000$US, the 8 port is 600$US. The 24 port is 1600$US...

            A battery-backup module for any of those cards is 125$.


            newegg.com has a decent selection of cards to look at.
            The cheapest PCIe x4 8-port card is a HighPoint RocketRAID 2320, for 250$US. There are 24-port x8 cards for 900$, although that one only says it does RAID 0, 1, 5, 6, 0+1 -- not RAID10. RAID10 (lots of RAID1 pairs that you stripe) is much more reliable than 0+1 (two big RAID0s that you mirror). One drive failure in a RAID 0+1 reduces you to a plain RAID0: if any of the remaining drives fails, it's dead. And those are the only drives seeing any activity, because the dead RAID0 is dropped from the mirror.

            4 port SATA cards are mostly PCIe x4, but most cheap mobos only have multiple PCIe x1 slots. I did see one HighPoint x1 card:
            HighPoint RocketRAID 2300 PCI Express SATA II (3.0Gb/s) Controller Card (at Newegg)

            for 115$US. If it's not fakeraid, the x1 bottleneck might not be as bad if you use HW RAID1 and do MD RAID0 on top of that. The data would only have to go to/from the card once for each pair of drives, not once for each drive. (PCIe version 1 has 250MB/s (full duplex) per x1 lane. Intel chipsets usually have their x16 slot on the northbridge, PCIe version 2, but their other PCIe slots on the southbridge, PCIe version 1.)
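            That layered setup would look something like this (a sketch; it assumes the card exposes each mirrored pair as a plain block device, here /dev/sdb and /dev/sdc):

              # Stripe two hardware RAID1 pairs together with Linux MD RAID0.
              mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc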

            http://www.ncix.com/products/index.p...1215&po=0&ps=2 has 4 SATA PCI cards for 30$CAD. Even one of those would be very limited by the 133MB/s PCI bus. There are 2-port SATA PCIe cards for 25$. The cheapest 4-port SATA PCIe card there is an Adaptec PCIe x4 card, for 134$. There are PCIe x1 4 port SATA cards for 150$.


            If you're willing to modify your motherboard, you can cut out the back of the PCIe x1 slots and plug any PCIe card you want into them. PCIe devices are required to negotiate how many lanes are available. You don't gain any bandwidth, but you get around the physical incompatibility. You can use any PCIe card as an x1 card, even a video card (only recommended if you just want extra monitor outputs without much 3D performance, esp. since the x1 slots are usually PCIe version 1). I've even seen some mobos with open-ended PCIe slots, which would let you plug in anything without filing away the end of the slot. This is the same as using a PCIe x16 video card in a slot that's only x8 or x4 electrically.

            And you'll want 2 dual core CPUs for 4 cores.
            Yeah, probably if you're going to run software RAID5 you'll need all the memory bandwidth you can get. So get a server/workstation mobo with a 5400X chipset. They have independent buses for the two CPUs, and a snoop filter. (5000X had that too, but with slower memory allowed.)
            Actually, a desktop Nehalem board might work well for soft RAID, since they have wicked memory bandwidth.

            Or get a Core 2 Duo desktop-style system and spend the extra money on a 16-port HW RAID controller that can do RAID6 or RAID10 in hardware, with onboard cache and a battery backup. RAID6 is more processor-intensive than RAID5. If you do use RAID10, then memory bandwidth isn't such an issue, and neither is CPU power.

            So... the configuration would go something like this:

            2x250GB drives for the OS and applications. They are a mirrored RAID, with careful attention paid to making sure that the bootloader will work on either drive.
            You could just as easily use a partition of the big array for the OS + apps. Or, if you're running software RAID, make a RAID1 at the beginning of two of the drives for the root FS, another RAID1 at the beginning of another two drives for swap, another for /usr, another for /opt, or whatever you like to do. /opt and /usr will be bigger than the root FS, so maybe put the swap on the same pair as the root FS. You want to take the same amount of space from the start of every drive, so you'll have uniform-size partitions to combine in the RAID1.
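            For example (a sketch; the partition names and pairings are assumptions, not a recipe):

              # Pair up same-sized partitions from the start of each drive.
              mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # root FS
              mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2   # swap
              mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1   # /usr, /opt, ...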

            If you run HW RAID, it would be OK to put the root FS just on a partition of the main array. You can't combine partitions with different RAID levels on any HW RAID I know of, unlike software RAID. If you are planning to run much on the machine, it might be better to have a separate drive for the root + software, separate from the array you'll store data on. If not, then there won't be a lot of read or write activity to interfere with the RAID.

            You might instead want to use a single disk for the OS, and have a second disk installed to back it up to. It doesn't give you the same high availability of RAID1, but if your backup disk is a mirror (synced nightly instead of constantly) of your main disk, you can boot from it and use it if your main disk fails. More appropriate for a desktop than a server, really, though.

            For the bootloader, grub-install onto each of the component drives of the RAID1. On software RAID, use MD superblock format 1.0, not 1.1 or 1.2, since you need the MD superblock to be at the end of the partition so GRUB just sees a normal FS at the start of the partition.
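            Concretely, that comes down to something like this (device names assumed; the metadata format is the part that matters):

              # Superblock format 1.0 keeps the MD metadata at the end of the partition,
              # so GRUB just sees a normal filesystem at the start.
              mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
              # Install the bootloader onto both component drives so either one can boot.
              grub-install /dev/sda
              grub-install /dev/sdb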

            I highly recommend XFS for big filesystems. Make sure the start of your partition lines up with a full stripe boundary (since you can only tell XFS the stripe unit and the stripe width in data disks, not the offset the partition starts at). e.g. 8 disks in RAID6 with a 64k stripe unit = 6 * 64k stripe width = 384kB = 768 sectors. (I set up a PERC6/e that way, so I remember it.) I made a GPT partition label with parted (DOS partition tables wrap at 2TB). I made my first partition start at sector 768, so it was stripe-aligned. I made a second partition, also stripe-aligned, and put XFS on them with mkfs.xfs -d su=64k,sw=6 -l lazy-count=1. I mounted with -o noatime,logbsize=256k,inode64,nobarrier. (nobarrier because the PERC6/e has battery-backed cache.)
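            Put together, the commands for that setup look roughly like this (the device name /dev/sdb stands in for whatever your controller exposes; the numbers come straight from the 8-disk RAID6 example above):

              # GPT label, first partition starting at sector 768 so it sits on a stripe boundary.
              parted /dev/sdb mklabel gpt
              parted /dev/sdb mkpart primary 768s 100%
              # Tell XFS the stripe geometry: 64k stripe unit, 6 data disks wide.
              mkfs.xfs -d su=64k,sw=6 -l lazy-count=1 /dev/sdb1
              mount -o noatime,logbsize=256k,inode64,nobarrier /dev/sdb1 /mnt/data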

            See http://www.phoronix.com/forums/showt...4107#post54107 for more about XFS.

            Comment
