Fedora Installer Looks To Change Its BIOS/Fake RAID Handling
Hard to imagine the use case for hardware RAID; it loses on performance, reliability, reproducibility, and features.
But feel free to elaborate.
I'd like to see you do a 400TB+ (20+ x 20TB) RAID 60 using a software-only solution (be that mdraid, OpenZFS, etc.).
Without battery-backed write cache, without management (HP iLO / Dell iDRAC), and do it on *multiple* servers.
(Forgot to add: the RAID should be capable of 400MB/s sustained writes, for ~3 years, even while the RAID is being rebuilt.
Forgot to add 2: the main CPU(s), all 56-128 cores, are at ~70-80% load.)
Feel free to elaborate...
Last edited by gilboa; 08 December 2022, 11:31 AM.
... and keep in mind that a 20TB HDD costs well under 15 USD per TB (in bulk), while mixed-use NVMe drives (which may fail due to excessive writes) are capped at ~12TB and priced at a very reasonable 400-600 USD per TB (~20x).
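For a rough sanity check of the HDD side of that claim, here is a back-of-the-envelope sketch. The prices and capacities come from the post above; the exact 24-drive layout (two 12-drive RAID 6 groups striped into RAID 60) is an assumption, since the post only says "20+ x 20TB":

```python
# Drive-only cost for a ~400TB usable HDD RAID 60 build.
# Assumed layout: 24 x 20TB drives, two 12-drive RAID 6 groups
# (double parity each), striped together. Chassis, controller, and
# battery-backed cache are NOT included -- drives only.
drives = 24
tb_per_drive = 20
usd_per_raw_tb = 15          # "well under 15 USD per TB (in bulk)"
parity_drives = 2 * 2        # two RAID 6 groups, 2 parity drives each

raw_tb = drives * tb_per_drive                       # 480 TB raw
usable_tb = (drives - parity_drives) * tb_per_drive  # 400 TB usable
drive_cost = raw_tb * usd_per_raw_tb                 # $7,200
print(f"{usable_tb} TB usable, ${drive_cost} in drives, "
      f"${drive_cost / usable_tb:.0f}/usable TB")
# -> 400 TB usable, $7200 in drives, $18/usable TB
```

Even after adding a chassis, controllers, and enclosures on top of the $18/usable-TB drive cost, the HDD build stays far below NVMe per usable TB, which is the core of the cost argument here.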
Originally posted by gilboa:
I'd like to see you do a 400TB+ (20+ x 20TB) RAID 60 using a software only solution. (Be that mdraid, openzfs, etc).
Without battery backed write cache, without management (HP iLO / DELL iDRAC) and do it on *multiple* servers.
(Forgot to add: RAID should be capable of doing 400MB/s sustained writes, for ~3 years, even while RAID is being rebuilt.
Forgot to add 2: Main CPU(s), all 56-128 cores, are @~70-80% load).
Feel free to elaborate...
I can do 300TB RAID60 in a single chassis with IOPS that you cannot match. The limitation here, of course, is that I can't find a chassis with more than 24 slots.
Parts:
Whatever the current Dell 7425 equivalent is, configured for 24 NVMe U.2 slots
24x Micron 9300 15.36 TB NVMe (U.2)
That's it: no additional backplanes, RAID cards, etc
2 RAID 6 arrays (via mdadm) at 12 drives each = ~150TB per array, with both arrays striped together (RAID 60) for ~300TB usable.
IOPS are hard to predict, but are probably kernel/CPU limited, and probably somewhere north of 1 million IOPS before tuning, with under 10% CPU usage. Sequential bandwidth is very likely north of 40GB/s. I've done similar (half-sized) builds that hit 500k IOPS real-world, before any serious tuning. The total bill is going to be ~$80k per server, or ~$266 per TB usable.
Good luck finding a RAID card to do this: you can't, because a single PCIe slot does not have the bandwidth, and no dinky SoC with its dinky 1GB of DRAM is going to be able to keep up.
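The capacity and cost arithmetic for that build can be checked directly (all figures are taken from the post above; this is just a sketch of the math, not part of it):

```python
# Usable capacity and $/TB for the proposed 24 x 15.36TB NVMe build:
# two 12-drive mdadm RAID 6 arrays (double parity each) striped
# together as RAID 60.
groups = 2
drives_per_group = 12
parity_per_group = 2
tb_per_drive = 15.36
server_cost_usd = 80_000     # "~$80k / server" from the post

usable_tb = groups * (drives_per_group - parity_per_group) * tb_per_drive
# ~$260/TB on the full 307.2TB; the post's "$266/TB" rounds
# usable capacity down to an even 300TB.
print(f"{usable_tb:.1f} TB usable, ${server_cost_usd / usable_tb:.0f}/TB")
# -> 307.2 TB usable, $260/TB
```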
Originally posted by gilboa:
while mixed use NVME (which may fail due to excessive writes) are capped at ~12TB and priced at a reasonable price of 400-600USD per TB (~20x).
No, they aren't. Look up the MTFDHAL15T3TDP-1AT1ZAB: it's 15.36TB, has been around for nearly 5 years, and is rated for 1 DWPD over 5 years. It costs under $3k, so roughly $200/TB. They also have a MAX product (MTFDHAL12T8TDR-1AT1ZABYY) rated for 3 DWPD that's $4k for 12.8TB -- still well under your 400-600 estimate.
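Whether a 1 DWPD rating actually survives the workload specified earlier in the thread (400 MB/s sustained array writes for ~3 years) can be estimated. This is a hedged sketch: it assumes the 24-drive RAID 60 layout described above, full-stripe writes (so amplification is only the parity overhead; heavy small random writes would amplify further), and an even spread across drives:

```python
# Does 400 MB/s of sustained host writes exhaust 1 DWPD on 15.36TB
# drives? Assumptions: 24 drives in RAID 60 (two 12-drive RAID 6
# groups), full-stripe writes, so 12 drives are written for every
# 10 drives' worth of host data.
host_tb_per_day = 400e6 * 86_400 / 1e12      # 34.56 TB/day to the array
amplification = 12 / 10                      # RAID 6 parity overhead
device_tb_per_day = host_tb_per_day * amplification / 24  # per drive
dwpd_budget = 15.36                          # 1 DWPD on a 15.36TB drive

print(f"{device_tb_per_day:.2f} TB/drive/day vs "
      f"{dwpd_budget} TB/day budget "
      f"({device_tb_per_day / dwpd_budget:.0%} of 1 DWPD)")
```

At roughly 11% of the daily write budget per drive, the stated workload fits a 1 DWPD rating with a wide margin over a 3-year window, though sustained small random writes or an uneven write distribution would change the picture.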
And its massive, enormous advantage over an HDD is:
* No need for RAID card: it's right on the PCIe bus
* No need for mirrored RAID levels: rebuilds are so fast that you don't have to use RAID 1x or x1 modes that burn yet more disks to cope with the dangerous rebuild times of RAID 6 on rotational media
* No need for extensive caches and batteries: the write speed is so high that you can just commit straight to disk
You cut out a ton of architectural complexity; I'm surprised you have the guts to recommend a RAID 60 solution with HDDs. Your rebuild time after a failure in either array will be measured in weeks, especially if you experience a second failure during that time. And it's frankly dishonest to even make the comparison, given the massive differences in latency, throughput, and IOPS.
Last edited by ll1025; 19 December 2022, 02:22 PM.
I'm trying to understand how you can possibly compare a 20-30K USD solution to an 80+K USD solution, and this assumes a 1 DWPD SSD/NVMe will survive more than 1 year in a write-only environment (as noted above).
Moreover, how many machines (with said parameters) have you put into production? And for how long?
Last time I checked, I have dozens of R730xd / R740xd machines (with or without MD1400s) with 100-300TB of the dreaded and unreliable RAID 60, working 24/7 for years in a 99.9% write environment; some (the R730s) are 5 years old.
Oh yeah, and we kill far more mixed-use SSDs (3 DWPD, which usually last 2-3 years before crapping out) than HDDs.
NOTE: You make the wrong assumption that everybody needs the highest IOPS and/or highest BW in their big-storage system, and that's dead wrong. I don't need higher IOPS, I don't need higher BW. But raise the price by 50%, and I lose my pants.
Hence I use HDDs. Hence I use RAID controllers.
- Gilboa
Last edited by gilboa; 25 December 2022, 07:00 AM.
Originally posted by ll1025:
I can do 300TB RAID60 in a single chassis with IOPS that you cannot match. The limitation here, of course, is that I can't find a chassis with more than 24 slots.