Zstd 1.5.1 Released With Even More Performance Improvements


  • #31
    Originally posted by Eumaios View Post
    ... how do you check the data, to make sure that the backups that you make are not corrupted? That is, I can easily imagine backing up data for redundancy and finding out later that bit rot was creeping up even on my backups, so that I have no good data any more from which to restore.
    So what good did your checksum do? It did not fix the data, did it? So what is the difference between knowing your data might possibly be corrupted, and actually retrieving the data, using it, and seeing it for yourself? The difference for me is that the latter is something I have to do anyway to actually know, and thus I can just skip the checksum altogether.

    The point really is that a check without a reactive measure (to fix the possible data loss) is not very useful. It is like a security guard without a gun and no right to stop you. The usefulness comes from the reactive measure, i.e. the repair, not from the checksum itself. A checksum is a means to avoid repairing the data every single time you retrieve it, and to repair it only when a corruption has likely occurred. It is an optimization to avoid too much redundancy, but as a check on its own it is not all that helpful.
    Last edited by sdack; 21 December 2021, 08:07 PM.



    • #32
      Originally posted by useless View Post
      All in all, trouble-free checksumming allows regular users to notice corruption and take action. If you have automatic scrubs (which you should) you can prevent more damage soon enough and lose less recently generated data.
      Like I said, there is no such thing as "trouble-free" checksumming. A checksum is a trouble indicator, if anything. I still only see you wanting to be right, but I do not see you understanding when you do not need it. The fact that you seem to be using it all the time, as if one could never check too much or as if multiple checks could never be redundant, is not really helping you get your point across. You are just throwing a lot of checksumming around like it was chocolate sprinkles.
      Last edited by sdack; 21 December 2021, 08:11 PM.



      • #33
        Originally posted by microcode View Post

        No; zstd was released in 2015, and was developed by Yann Collet, a Facebook employee.
        My point was, zstd was released in Jan 2015, he started working at FB in June 2015. FB didn't invent anything, the author did.



        • #34
          Originally posted by timofonic View Post
          I never use Facebook nor Instagram or any of their products and consider them very evil (and fatal for teenagers), but Zstd is their only good product.

          Btrfs is a total failure.

          Congratulations, Facebook = META. You did something good, finally
          In case you don't know: there is a blog run by people who vehemently oppose zstd, because Facebook. I just don't get it. It's a good tool that benefits everyone.



          • #35
            Originally posted by zxy_thf View Post
            Btrfs is a nice product when used wisely.
            It saved me from memory failure by mandating checksums on data. Otherwise I would have suffered silent data corruption.
            Now I'm using it whenever possible due to the fears of future hardware problems. (But I never use its fancy but counter-intuitive features).
            Please elaborate on that. I am very interested in being protected against storage data corruption. It has happened to me once already; luckily I was also keeping dated backups and could find the original, uncorrupted file. (The corrupted file had made it into several of the dated backups, and of course into the "live" backup, the one I maintain as current with rsync.)

            How are you using Btrfs for that?

            I am not a super Linux expert. I have been using Linux for many years, but I just know enough of it to get my work done. I just learn what I need. So please go into detail, if you feel like doing it.

            Thanks in advance.



            • #36
              ATA-drives:
              package (Debian): smartmontools
              program: smartctl
              example test (long): smartctl /dev/sda --test=long
              example see log: smartctl /dev/sda -x
              help: man smartctl

              NVMe-drives:
              package (Debian): nvme-cli
              program: nvme
              example check (long): nvme device-self-test /dev/nvme0n1 -n 1 -s 2
              example see log: nvme self-test-log /dev/nvme0n1
              help: man nvme



              • #37
                Originally posted by AlexFonewn View Post
                How are you using Btrfs for that?
                BTRFS should have file checksums enabled by default, unless you have disabled them in your mount options (the relevant options are datasum and nodatasum).

                F2FS can use checksumming, but only as part of its file compression.

                EXT4 uses checksums for its superblock information, keeps it redundant, and repairs it automatically. It also uses a checksum for its journal to avoid replaying a possibly corrupted journal after a recovery. It does not, however, support a checksum for each and every file.

                Note that checksums do cost some CPU time and can slow down file access. If this is important to you, then you can still use selective checksumming tools that let you create individual checksums for files, so that a checksum is not computed for each and every file every single time. See the man page for cksum (GNU coreutils) for details.
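
                As a minimal sketch of that selective approach (using sha256sum, which, like cksum, ships with GNU coreutils and has a widely supported --check mode; the directory and file names here are just examples):

                ```shell
                # Record checksums once for the files you care about:
                sha256sum important/*.dat > checksums.sha256

                # Later, verify only when you suspect corruption,
                # instead of paying the cost on every access:
                sha256sum -c checksums.sha256
                ```

                A non-zero exit status from the verify step tells you that at least one file no longer matches its recorded checksum, and the per-file output shows which one.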

                Checksum errors can, however, already occur in memory and may not be caused by the drive itself. If you suspect that your drive is the cause of the corruption, then you should check it directly. If you are uncertain, then you should also check your memory and cabling, and run stress tests.
                Last edited by sdack; 22 December 2021, 11:14 AM.



                • #38
                  Originally posted by Eumaios View Post
                  I searched for smarttools but couldn't find it, though I found many links to smartmontools. Is that what you use, or is there another package called smarttools?
                  This is it; the confusion comes from the individual tools not having 'mon' in their names, such as 'smartd' and 'smartctl'.

                  As for BTRFS: I use it, but sparingly - mostly in places where mdadm isn't quite sufficient. I've been bitten in the last couple of years by an error in writing delayed extents (which resulted in six hours of writes not getting to disk - pretty bad for an online service), and a problem where additional meta blocks could not be allocated, nor could a balance clear up space from the removal of deleted files - this came about just by filling up the drive. These weren't related to use of any special functionality, although I've seen others have issues with combinations of that, too. SuSE staff has been working to fix some of these issues and they deserve a lot of credit for it.

                  Regarding scrubs: because they can take some time, I've found it helpful to have a cron job start one off and then pause/cancel and resume it in two-hour blocks every day, spreading the load across the week's quiet hours.
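
                  One possible shape for such a setup is a pair of cron entries like the following (a sketch only: the mount point and times are assumptions; btrfs scrub cancel saves the scrub's position, so a later resume continues where it stopped):

                  ```shell
                  # Hypothetical /etc/cron.d/btrfs-scrub fragment:
                  # at 01:00, resume an interrupted scrub, or start a new one
                  # if none is pending; at 03:00, interrupt it again so the
                  # next night's run picks up where this one left off.
                  0 1 * * *  root  btrfs scrub resume /mnt/data || btrfs scrub start /mnt/data
                  0 3 * * *  root  btrfs scrub cancel /mnt/data
                  ```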

                  zstd has been generally useful (most recently to compress backups via tar) and I've had no problems I can recall, yet.
                  Last edited by GreenReaper; 22 December 2021, 11:30 AM.



                  • #39
                    Originally posted by timofonic View Post
                    I never use Facebook nor Instagram or any of their products and consider them very evil (and fatal for teenagers), but Zstd is their only good product.

                    Btrfs is a total failure.

                    Congratulations, Facebook = META. You did something good, finally
                    Meh, btrfs works great if you know what you're doing.



                    • #40
                      Originally posted by AlexFonewn View Post

                      Please elaborate on that. I am very interested in being protected against storage data corruption. It has happened to me once already; luckily I was also keeping dated backups and could find the original, uncorrupted file. (The corrupted file had made it into several of the dated backups, and of course into the "live" backup, the one I maintain as current with rsync.)

                      How are you using Btrfs for that?

                      I am not a super Linux expert. I have been using Linux for many years, but I just know enough of it to get my work done. I just learn what I need. So please go into detail, if you feel like doing it.

                      Thanks in advance.
                      Look up the btrfsmaintenance tools: scripts written by the primary Btrfs developer. They are well worth installing whether or not your distro has them in its repos.

