Very interesting article, thank you.
Btrfs RAID vs. Linux Software RAID Benchmarks On Linux 4.7
-
Originally posted by duby229 View Post
I fully agree with you. An array of disks is meant so that the machine stays functioning on drive failure. Data backup is something totally different. It shocks me how many people implement RAID as their only means of data protection.
You want disk redundancy so the machine keeps running. Depending on your backup solution, it might take a month to restore a backup. Yes, I've been there, advising people to forget about backups: if the restore time takes longer than the lifetime of an object on the machine, and there is no budget for machine redundancy, backups just don't pay off.
Fortunately we have since scaled up enough to afford a CDN setup, but there are a lot of cases where the uptime of a machine (and hence the number of redundant disks) is more important than a backup of the machine.
And as inclined as you may be to say that such a backup solution must suck: in that case you are just not as experienced with the various mass-data problems as I am ;-).
So:
1) Backups are a "total disaster happened, we must restore functionality within the next 24 hours" solution, depending on the amount of data. You actually need to test the restore procedure to see whether you can restore services within the window people expect it to take.
2) Redundancy (disk- and machine-wise): we must prevent a total disaster from happening, as economically as possible. Take into account that if a disaster does happen, part of the array's performance is consumed by rebuilding back to the desired level of redundancy.
3) Archives: archives are archives (of data). They are not backups. People tend to think that backups are a form of archiving. No: if you need an archive, you need to design it into your application infrastructure. You might want to back up an archive, or just rely on redundancy, all depending on how important the archive is. Archives are usually mandated by law, e.g. so you can index and retrieve your sales records.
What most people also tend to forget is to stamp a volatility on their data. You are usually required by law to forget everything about a person after a certain time, except for financial records, which should also be forgotten, but only after the financial-record retention laws have expired them.
But please, never ever mention backup again when we are talking about redundancy, because they are very different things. I might, for example, calculate that I need three-way disk redundancy. Btrfs does not support that. Too bad, no btrfs for me.
To be clear: I use btrfs myself, on bcache on FCoE vn2vn. I like btrfs. And if you run a dirvish backup, btrfs is very handy as your backup filesystem (if you have a redundant backup), as it can clone your rsync directory, which is so much more effective than hardlinking files.
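The snapshot-instead-of-hardlinks trick can be sketched roughly like this (paths and the subvolume layout are hypothetical; assumes /backup is a btrfs filesystem and /backup/current is a subvolume):

```shell
# Classic dirvish/rsync rotation keeps history as hardlink farms:
#   cp -al /backup/2016-08-01 /backup/2016-08-02
# On btrfs you can snapshot the previous tree instead: it is near-instant,
# and later changes to either copy stay independent thanks to CoW.
btrfs subvolume snapshot /backup/current /backup/snap-$(date +%F)
rsync -a --delete /data/ /backup/current/
```

No per-file hardlink walk is needed, and each snapshot remains an independently browsable tree.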
-
Originally posted by Ardje View Post
It shocks me how many people confuse uptime of a machine with backup.
-
Originally posted by starshipeleven View Post
Did you use the autodefrag mount option? It should avoid fragmentation by autostarting defragging when it detects a fragmented file.
http://www.dirtcellar.net
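For reference, autodefrag is set at mount time; a minimal sketch (device and file paths are placeholders):

```shell
# Enable background defragmentation for this mount (or add the
# option to the fstab entry for the filesystem):
mount -o autodefrag /dev/sdb1 /mnt
# One-shot manual alternative for already-fragmented trees:
btrfs filesystem defragment -r -v /mnt
# filefrag reports the extent count, a rough fragmentation measure:
filefrag /mnt/images/vm.img
```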
-
Originally posted by waxhead View Post
Oh yes, I have tried both with and without. Autodefrag has its own issues too.
Even today there was a post on the BTRFS mailing list where Chris Murphy states that BTRFS (still) does not have a concept of failed drives, and this situation (one removed drive) can introduce more problems.
I know btrfs complains loudly in dmesg about I/O failures and spams the logs about lost disks if a disk has been removed (which is imho OK), but it does not make anything worse.
There is a guy struggling to boot with systemd where one disk has gone bad - so, as I have said in numerous posts, BTRFS is cool, but they really need to get more of the very basics working.
I can boot fine if I remove a drive of a btrfs RAID1 array with rEFInd (for obvious reasons it's on another drive, a USB flash drive serving as UEFI partition and recovery OS partition), as it autoscans stuff on boot and isn't hard-coded like GRUB.
Hell, I remember pretty clearly that GRUB had issues booting even with mdadm RAID if you removed the drive it usually boots from.
-
Originally posted by starshipeleven View Post
Sorry, but you cannot just throw this here without explaining. I'm curious. Please explain.
I happen to be following the btrfs mailing list and I'm not seeing him claim this - can you point out the mail?
I know btrfs complains loudly in dmesg about I/O failures and spams the logs about lost disks if a disk has been removed (which is imho OK), but it does not make anything worse.
Again, I cannot see this (any systemd-related issue) on the mailing list in the last month or so. I do see a guy with what seems like a GRUB config issue.
I can boot fine if I remove a drive of a btrfs RAID1 array with rEFInd (for obvious reasons it's on another drive, a USB flash drive serving as UEFI partition and recovery OS partition), as it autoscans stuff on boot and isn't hard-coded like GRUB.
Hell, I remember pretty clearly that GRUB had issues booting even with mdadm RAID if you removed the drive it usually boots from.
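For anyone following along: recovering a btrfs RAID1 with a dead member looks roughly like this (a sketch; device names and the devid are placeholders):

```shell
# A btrfs RAID1 missing a device will not mount by default;
# it needs the degraded flag:
mount -o degraded /dev/sda2 /mnt
# List member devices; the missing one keeps its devid (say, 2):
btrfs filesystem show /mnt
# Rebuild onto the replacement disk (-B stays in the foreground):
btrfs replace start -B 2 /dev/sdc2 /mnt
```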
-
Originally posted by duby229 View Post
I think somehow you misunderstood the intention of my post. An array of disks is intended to keep a machine functional in case of drive failure. It has nothing to do with data backup, and I didn't imply that at all. Otherwise I very much agree with the rest of your post.
I do hope btrfs gets to a point where I can trust personal data to it (yeah, of course with a redundant machine setup), because the most important thing, I think, is the checksumming of the file data. A scrub will check the integrity of the data, and if it is RAID 1 it can also repair it if necessary, which makes a redundant setup even more reliable.
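The scrub-and-repair workflow is just a couple of commands (mount point is a placeholder):

```shell
# Read all data and metadata and verify checksums; on RAID1
# profiles, blocks that fail are rewritten from the good copy.
btrfs scrub start /mnt
btrfs scrub status /mnt     # progress and error summary
btrfs device stats /mnt     # per-device error counters
```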
-
Originally posted by duby229 View Post
Why would you put the boot partition on an array?
As far as I'm aware it can only be done on RAID1
GRUB 2 is fully RAID0/1/5/6/10, LUKS and LVM-aware https://wiki.gentoo.org/wiki/GRUB2#Extended_features
rEFInd's btrfs driver can also read btrfs RAID1 arrays (like mine) fine, but rEFInd is otherwise non-RAID-aware. http://www.rodsbooks.com/refind/drivers.html
and is rarely ever done.
Even with --metadata=0.9 you can make an mdadm RAID1 of up to 2 TB in size, so it never made any sense to keep /boot on a single partition just for lulz.
Seriously man, even my NAS with a Kirkwood SoC is booting off a --metadata=0.9 RAID1 system partition.
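Setting that up is a one-liner (device names are placeholders; the filesystem choice is just an example):

```shell
# --metadata=0.9 puts the RAID superblock at the END of each member,
# so even an old, non-RAID-aware bootloader sees each partition as a
# plain filesystem and can read /boot from either disk.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --metadata=0.9 /dev/sda1 /dev/sdb1
mkfs.ext4 /dev/md0
# mdadm RAID1 also supports three-way mirrors: --raid-devices=3
```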
-
Originally posted by starshipeleven View Post
Sorry, but you cannot just throw this here without explaining. I'm curious. Please explain.
1. Autodefrag has issues too...
Here you go: https://wiki.debian.org/Btrfs
Look under the recommendations for rotational hard disks... you will find the following:
"Consider revoking this recommendation, because autodefrag, like -o discard, can trigger buggy behaviour. Also consider revoking the compress=lzo recommendation for rotational disks, because while it increases throughput for sequentially written compressible data, it also magnifies fragmentation...which means lots more seeks and increased latency -- NicholasDSteeves"
2. Mailing list post where Chris Murphy says that BTRFS does not have a concept of failed drives...
Here you go: http://www.spinics.net/lists/linux-btrfs/msg57999.html
...and in all fairness - they are working on it... : http://www.spinics.net/lists/linux-btrfs/msg56741.html
3. Mailing list post where a guy is struggling to boot with systemd where one disk has gone bad...
Well, this is embarrassing, isn't it... I am not able to find that post, so I can't back this one up with any evidence. If you *really* want it I can try to look harder.
I can't remember the exact details of that post, but this may very well have more to do with the glue around btrfs than with btrfs itself.
http://www.dirtcellar.net