Linux 5.1 Hit By A Data Loss Bug Due To Overly Aggressive FSTRIM


  • #11
    Well, my main laptop has a Samsung SSD and the 5.1 kernel on an encrypted root partition. But I'm using Btrfs, so I have no need for LVM. Also, I don't use discard on encrypted drives. On a drive that is otherwise filled with what appears to be random data, discard would go through and zero out the unused sections, revealing usage patterns.

    • #12
      I am pretty sure it is actually better to only trim periodically, both performance- and longevity-wise, anyway (and should be the default?)

      • #13
        Is this the third data corruption bug in the past several months? The Linux kernel is so darn stable.

        • #14
          Originally posted by [email protected]
          I am pretty sure it is actually better to only trim periodically, both performance- and longevity-wise, anyway (and should be the default?)
          Do you have any links for why that would be a good idea? I tend to use lower-end hardware (I'm not a gamer, building the software is enough of a game), and my preference for SSDs is SanDisk. I'm not sure how old my oldest SSD is, but I see that I've been using ',discard' in /etc/fstab for several years without any problems.

          OTOH, I'm reluctant to use Crucial drives (apparently highly rated by some PC magazines in this country, but running smartctl shows a lack of data on self-tests). So maybe it is a drive-specific (or technology-specific) problem?

          When I got my first Ryzen desktop in April last year, I accidentally forgot to add that. In June I was running some compile tests (checking power consumption with cpufreq disabled) and noted they were taking longer and longer. Then I ran fstrim and added the option to the fstab, and things settled down. So, my view is that "automatic" fstrim (on ext4, without LVM or dm-crypt) is the way to go.
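
For anyone unsure whether they "forgot to add that", here is a minimal sketch of how one might check which fstab entries carry the discard option. The fstab contents below are made up for illustration, and the helper name `mounts_with_discard` is hypothetical; on a real machine you would read /etc/fstab instead of the sample string.

```python
# Hypothetical fstab contents, invented for illustration only.
SAMPLE_FSTAB = """\
# <device>  <mount>  <type>  <options>           <dump> <pass>
/dev/sda1   /        ext4    defaults,discard    0      1
/dev/sda2   /home    ext4    defaults,noatime    0      2
/dev/sda3   /var     ext4    noatime,nodiscard   0      2
"""


def mounts_with_discard(fstab_text):
    """Return the mount points whose option field includes 'discard'."""
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry, ignore it in this sketch
        options = fields[3].split(",")
        # Exact match, so 'nodiscard' is not counted.
        if "discard" in options:
            found.append(fields[1])
    return found


print(mounts_with_discard(SAMPLE_FSTAB))
```

The options are split on commas and compared exactly, which is why the 'nodiscard' entry is not reported.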

          • #15
            Fixed by https://git.kernel.org/pub/scm/linux...c5f46072f7520d

            • #16
              Originally posted by birdie
              Is this the third data corruption bug in the past several months? The Linux kernel is so darn stable.
              Sounds to me more like yet another Samsung SSD firmware bug.

              • #17
                Originally posted by zerothruster

                Do you have any links for why that would be a good idea? I tend to use lower-end hardware (I'm not a gamer, building the software is enough of a game), and my preference for SSDs is SanDisk. I'm not sure how old my oldest SSD is, but I see that I've been using ',discard' in /etc/fstab for several years without any problems.

                OTOH, I'm reluctant to use Crucial drives (apparently highly rated by some PC magazines in this country, but running smartctl shows a lack of data on self-tests). So maybe it is a drive-specific (or technology-specific) problem?

                When I got my first Ryzen desktop in April last year, I accidentally forgot to add that. In June I was running some compile tests (checking power consumption with cpufreq disabled) and noted they were taking longer and longer. Then I ran fstrim and added the option to the fstab, and things settled down. So, my view is that "automatic" fstrim (on ext4, without LVM or dm-crypt) is the way to go.
                I read a while back that you don't need to trim/discard that often, and it can actually be better/more-efficient/lower-wear to only do it periodically. The space waiting to be trimmed/discarded is unavailable to use, but as long as you're not pushing the limits on the space on the drive it shouldn't matter. The suggestion I saw was to do it weekly in a cron job.

                I haven't actually done this myself. I have an SSD sitting on my desk waiting to be installed, but I decided to wait for 5.2 in order to pick up some F2FS improvements, and I've also been short on time. In the meantime I've done some research and kept my ears open.

                • #18
                  Originally posted by phred14
                  The space waiting to be trimmed/discarded is unavailable to use, but as long as you're not pushing the limits on the space on the drive it shouldn't matter.
                  It has nothing to do with the free space that the filesystem reports. Basically, when a block is no longer used, TRIM should occur on that block before data is written to it again. There are two reasons for this: the first is better performance, and the second is reduced wear on the SSD. If TRIM has not been run on a block before new data is written to it, the SSD ends up performing more writes internally.

                  But as I mentioned earlier, TRIM has the same effect as zeroing out blocks of data, which causes some concerns for encrypted drives. I doubt the security concerns are that significant, but I also don't think that the performance and wear costs of not using it are that significant either (at least in my use cases).
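
The "more writes internally" point can be illustrated with a deliberately simplified toy model. It assumes the drive has to relocate every page it still believes holds data before it can erase a flash block, and that without TRIM it cannot know that pages belonging to deleted (but not yet overwritten) files are garbage. Real flash translation layers are far more involved; the function name and the numbers are invented for illustration.

```python
def pages_copied_on_reclaim(page_states, trimmed):
    """Count pages the drive must relocate before erasing one block.

    page_states: one entry per page, either "live" (the file still exists)
                 or "deleted" (the filesystem has freed it).
    trimmed:     True if the filesystem told the drive about freed pages.
    """
    copied = 0
    for state in page_states:
        if state == "live":
            copied += 1  # real data always has to be preserved
        elif state == "deleted" and not trimmed:
            copied += 1  # the drive cannot tell this page is garbage
    return copied


# Hypothetical erase block: 64 pages, 48 of which belong to deleted files.
block = ["live"] * 16 + ["deleted"] * 48

print(pages_copied_on_reclaim(block, trimmed=False))  # 64
print(pages_copied_on_reclaim(block, trimmed=True))   # 16
```

In this toy case the untrimmed drive copies four times as many pages to reclaim the same block, which is the extra internal writing (and wear) described above.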

                  • #19
                    Originally posted by phred14

                    I read a while back that you don't need to trim/discard that often, and it can actually be better/more-efficient/lower-wear to only do it periodically. The space waiting to be trimmed/discarded is unavailable to use, but as long as you're not pushing the limits on the space on the drive it shouldn't matter.
                    Thanks. I've seen reports that running it periodically is adequate. But when I discovered that repeated compiles of the same packages were getting slower and slower (from never using fstrim on a newish drive), both the drive and the partitions on it appeared to have plenty of available space according to 'df'.

                    • #20
                      Originally posted by Chugworth
                      It has nothing to do with the free space that the filesystem reports. Basically, when a block is no longer used, TRIM should occur on that block before data is written to it again. There are two reasons for this: the first is better performance, and the second is reduced wear on the SSD. If TRIM has not been run on a block before new data is written to it, the SSD ends up performing more writes internally.

                      But as I mentioned earlier, TRIM has the same effect as zeroing out blocks of data, which causes some concerns for encrypted drives. I doubt the security concerns are that significant, but I also don't think that the performance and wear costs of not using it are that significant either (at least in my use cases).
                      Interesting. You are correct that the reported free space is not the same as what is available from the drive's point of view. As for the wear costs, none of us has any idea until a drive either dies or reports SMART problems (which reminds me of a reason not to buy Intel SSDs: they were said to eventually go read-only, then fail on the next boot). But for performance (my main activity is compiling), the results of not using trim were very apparent on my machine.

                      Fortunately (for me) I don't need encryption - this machine is unlikely to go outside my home. It sounds as if encryption is the critical item in what started this whole thread.
