EXT4 In Linux 3.5 Gets CRC32 Meta-Data

  • EXT4 In Linux 3.5 Gets CRC32 Meta-Data

    Phoronix: EXT4 In Linux 3.5 Gets CRC32 Meta-Data

    The pull request for the EXT4 file-system in the Linux 3.5 kernel has been sent, and there's one prominent new feature...

    http://www.phoronix.com/vr.php?view=MTExMTc

  • #2
    On-disk change, what about compatibility?


    • #3
      https://lkml.org/lkml/2012/6/1/331

      It's an on-disk change, but it's gated by a superblock "feature flag".
      So unless you actually activate the feature, you won't get it. If you
      do activate the feature, then you won't be able to switch between
      older and newer kernel versions --- at least not and be able to mount
      the file system read/write. (We have different feature flags that
      indicate whether or not the kernel is allowed to mount the file system
      read/write, read/only, or not at all, if it doesn't know about a bit
      in the COMPAT, COMPAT_RO, or INCOMPAT feature bitmask.)

      The e2fsprogs support for this feature is currently only in the
      (rewinding) proposed update branch, so it's not something that I
      recommend people use just yet. Even though it's been pretty well
      tested, there are probably still some bugs we still need to shake out.

      - Ted
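The three feature classes described above can be sketched as a small decision function. This is only an illustration of the rule, not kernel code: the function name, the enum, and the bit value `HYPOTHETICAL_CSUM_BIT` are assumptions made up for the example, and it assumes the checksum feature sits in the COMPAT_RO class, which is what the read/write remark above suggests.

```python
from enum import Enum


class MountMode(Enum):
    READ_WRITE = "read/write"
    READ_ONLY = "read/only"
    REFUSED = "not at all"


def allowed_mount_mode(fs_compat, fs_ro_compat, fs_incompat,
                       known_compat, known_ro_compat, known_incompat):
    """Decide how a kernel may mount a file system, given the feature bits
    set in the superblock (fs_*) and the bits the kernel understands (known_*)."""
    # Any unknown INCOMPAT bit: the kernel must refuse to mount.
    if fs_incompat & ~known_incompat:
        return MountMode.REFUSED
    # Any unknown COMPAT_RO bit: mounting is only safe read/only.
    if fs_ro_compat & ~known_ro_compat:
        return MountMode.READ_ONLY
    # Unknown COMPAT bits never restrict mounting, so fs_compat and
    # known_compat are deliberately not consulted.
    return MountMode.READ_WRITE


# Hypothetical bit value for the checksum feature, for illustration only.
HYPOTHETICAL_CSUM_BIT = 0x0400

# An older kernel that does not know the bit, and a newer one that does.
old_kernel = allowed_mount_mode(0, HYPOTHETICAL_CSUM_BIT, 0, 0, 0, 0)
new_kernel = allowed_mount_mode(0, HYPOTHETICAL_CSUM_BIT, 0,
                                0, HYPOTHETICAL_CSUM_BIT, 0)
print(old_kernel.value, "/", new_kernel.value)
```

Under these assumptions the older kernel falls back to read/only and the newer one mounts read/write, which matches the behaviour described in the mail.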


      • #4
        I was wondering about that too. It would be expected that the user would have to do something to get the new feature since it is not backwards compatible.

        I'd say this is one of the reasons why every once in a while one should do a full backup, reformat and copy everything back.


        • #5
          So it looks like we'll have to reformat to use this feature (in the future!), or will there be some conversion tooling? I know I could change the flag with tune2fs, but wouldn't the metadata need to be rewritten to fill in the crc32 value field?


          • #6
            Checksum

            What about CRC-64?

            What about using a secure hashing algorithm such as SHA-1 or Skein?


            • #7
              Originally posted by uid313
              What about CRC-64?

              What about using a secure hashing algorithm such as SHA-1 or Skein?
              Don't feed the troll, I'll bite anyhow.

              CRC is used to verify that the data inside the block is not corrupted. It's a redundancy check: is my super-block valid or not?

              CRC-64 is not that useful in this specific scenario.

              http://en.wikipedia.org/wiki/Cyclic_redundancy_check

              SHA-1 is completely useless as that is a cipher. The point was not to encrypt the metadata, just to verify it.
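A minimal sketch of the "is my block valid or not" check, using Python's `zlib.crc32`. Note the assumptions: `zlib.crc32` is the ordinary CRC-32, which is not necessarily the exact variant ext4 uses, and the `seal`/`verify` helpers and the 4-byte trailer layout are invented for the example rather than taken from the ext4 on-disk format.

```python
import zlib


def seal(block: bytes) -> bytes:
    """Append a 4-byte little-endian CRC-32 of the block (illustrative layout)."""
    return block + zlib.crc32(block).to_bytes(4, "little")


def verify(sealed: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare with the stored value."""
    block, stored = sealed[:-4], sealed[-4:]
    return zlib.crc32(block).to_bytes(4, "little") == stored


block = bytes(range(256)) * 4      # stand-in for a 1 KiB metadata block
sealed = seal(block)

corrupted = bytearray(sealed)
corrupted[100] ^= 0x01             # flip a single bit in the payload

print("intact block valid:   ", verify(sealed))
print("corrupted block valid:", verify(bytes(corrupted)))
```

Nothing here is secret or encrypted: anyone can recompute the value, which is fine when the goal is only to notice accidental corruption.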


              • #8
                Originally posted by oliver
                Don't feed the troll, I'll bite anyhow.

                CRC is used to verify that the data inside the block is not corrupted. It's a redundancy check: is my super-block valid or not?

                CRC-64 is not that useful in this specific scenario.

                http://en.wikipedia.org/wiki/Cyclic_redundancy_check

                SHA-1 is completely useless as that is a cipher. The point was not to encrypt the metadata, just to verify it.
                No, SHA-1 is not a cipher, it is a hash algorithm. It does not encrypt anything.

                SHA-1 is very suitable to verify the integrity of data. Not only does it verify that the data is not corrupted, it also verifies the integrity of the data, that it has not been tampered with.
                Last edited by uid313; 06-01-2012, 06:34 PM.


                • #9
                  Originally posted by uid313
                  No, SHA-1 is not a cipher, it is a hash algorithm. It does not encrypt anything.

                  SHA-1 is very suitable to verify the integrity of data. Not only does it verify that the data is not corrupted, it also verifies the integrity of the data, that it has not been tampered with.
                  There's a reason every FS and DB I've ever heard of that added checksums uses simple CRC hashing.

                  It's a simple algorithm that is well suited to the task.

                  You could replace it with SHA-1, but what would that give you? A much more complicated and CPU-taxing algorithm, and nothing else. Crypto hashes have to be a lot more complicated so that small changes to the data provide pseudo-random changes to the hash that can't be reverse-engineered. There isn't a single good reason to want that overhead in an integrity check.

                  If you want to verify that the data hasn't been tampered with, you should just encrypt the whole FS.
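To make the comparison concrete, here is a rough sketch that flips one bit in a block and looks at what each function does with it. Both detect the change; SHA-1 additionally scatters the difference across its 160-bit digest. The helper names and the all-zero 4 KiB block are assumptions for the example, and the timing helper only measures this machine's Python bindings, so its numbers should not be read as a statement about kernel performance.

```python
import hashlib
import timeit
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def time_checksums(block: bytes, repeats: int = 2000) -> dict:
    """Time both functions locally; results vary a lot by CPU and library build."""
    return {
        "crc32": timeit.timeit(lambda: zlib.crc32(block), number=repeats),
        "sha1": timeit.timeit(lambda: hashlib.sha1(block).digest(), number=repeats),
    }


block = bytes(4096)                # stand-in for a 4 KiB metadata block (all zero)
damaged = flip_bit(block, 12345)

crc_before, crc_after = zlib.crc32(block), zlib.crc32(damaged)
sha_before = hashlib.sha1(block).digest()
sha_after = hashlib.sha1(damaged).digest()

print("CRC-32 changed:", crc_before != crc_after)
print("SHA-1 changed: ", sha_before != sha_after)
print("SHA-1 bits flipped out of 160:", hamming_distance(sha_before, sha_after))
print("timings (seconds):", time_checksums(block))
```

For catching accidental corruption, the only thing that matters is that the stored value no longer matches, and both functions deliver that for this kind of error.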


                  • #10
                    Hopefully by the time F18/F19 is released, e2fsck will support an on-disk upgrade, giving an option, via anaconda, to switch an existing ext4 to CRC checksums.
                    As it stands, I doubt that I'll be using btrfs before 2014, and better error detection in an existing stable FS is always welcome.

                    - Gilboa
                    DEV: Intel S2600C0, 2xE52658V2, 32GB, 4x2TB + 2x3TB, GTX780, F21/x86_64, Dell U2711.
                    SRV: Intel S5520SC, 2xX5680, 36GB, 4x2TB, GTX550, F21/x86_64, Dell U2412..
                    BACK: Tyan Tempest i5400XT, 2xE5335, 8GB, 3x1.5TB, 9800GTX, F21/x86-64.
                    LAP: ASUS N56VJ, i7-3630QM, 16GB, 1TB, 635M, F21/x86_64.


                    • #11
                      Originally posted by smitty3268
                      There's a reason every FS and DB I've ever heard of that added checksums uses simple CRC hashing.

                      It's a simple algorithm that is well suited to the task.

                      You could replace it with SHA-1, but what would that give you? A much more complicated and CPU-taxing algorithm, and nothing else. Crypto hashes have to be a lot more complicated so that small changes to the data provide pseudo-random changes to the hash that can't be reverse-engineered. There isn't a single good reason to want that overhead in an integrity check.

                      If you want to verify that the data hasn't been tampered with, you should just encrypt the whole FS.
                      SHA-1 and other cryptographic hashes will give you a much higher error detection rate than a simple CRC-32. A CRC-32 can only guarantee detection of certain types of errors, and only up to a certain degree (for example, it catches burst errors, but not reliably once they are over X bits in length). Hashes do introduce a major overhead, but since this is for the meta-data only anyway, the performance impact shouldn't be that much.
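The burst-error point can be checked empirically. For a 32-bit CRC the "X" is 32: any single burst of up to 32 bits is guaranteed to change the checksum, while longer bursts slip through only with a probability of roughly 2^-32. The sketch below throws random bursts of 1 to 32 bits at a block and counts how many go unnoticed. The helper names, the 256-byte test block, and the use of `zlib.crc32` (plain CRC-32 rather than whatever variant ext4 uses) are assumptions for the example.

```python
import random
import zlib


def apply_burst(data: bytes, start_bit: int, pattern: int, length: int) -> bytes:
    """XOR a `length`-bit error pattern into data, starting at bit offset start_bit."""
    value = int.from_bytes(data, "big")
    shift = len(data) * 8 - start_bit - length
    return (value ^ (pattern << shift)).to_bytes(len(data), "big")


def count_undetected_bursts(data: bytes, max_len: int = 32,
                            trials: int = 2000, seed: int = 0) -> int:
    """Apply random burst errors of 1..max_len bits; count those CRC-32 misses."""
    rng = random.Random(seed)
    good = zlib.crc32(data)
    total_bits = len(data) * 8
    undetected = 0
    for _ in range(trials):
        length = rng.randint(1, max_len)
        if length == 1:
            pattern = 1
        else:
            # A burst of `length` bits has its first and last bit flipped;
            # the bits in between are random.
            pattern = (1 << (length - 1)) | rng.getrandbits(length - 1) | 1
        start = rng.randrange(total_bits - length + 1)
        if zlib.crc32(apply_burst(data, start, pattern, length)) == good:
            undetected += 1
    return undetected


block = bytes(range(256))          # stand-in for a small metadata block
print("undetected bursts of <= 32 bits:", count_undetected_bursts(block))
```

A cryptographic hash would also catch all of these, so the difference between the two only shows up for longer or more scattered damage, and there the CRC's miss rate is already very small.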


                      • #12
                        Originally posted by smitty3268
                        You could replace it with SHA-1, but what would that give you? A much more complicated and CPU-taxing algorithm, and nothing else. Crypto hashes have to be a lot more complicated so that small changes to the data provide pseudo-random changes to the hash that can't be reverse-engineered. There isn't a single good reason to want that overhead in an integrity check.
                        These days you have hardware-accelerated cryptography. I know modern Intel processors support AES-NI (the AES instruction set). I don't know if they do hardware-accelerated hashing, though.


                        • #13
                          Modern x86 (and IIRC SPARC) processors have hw accelerated CRC32C. "Luckily", the exact same polynomial is used in the btrfs and ext4 crc checksum, among others, allowing those to benefit from the hw acceleration.
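For reference, CRC32C differs from the ordinary CRC-32 only in its polynomial (the Castagnoli polynomial). A slow, bit-at-a-time software version is sketched below to show what the hardware instruction computes. The function name is made up for the example, the init and final XOR follow the common CRC-32C convention, and how ext4 or btrfs seed and store the value on disk is not covered here.

```python
import zlib

# Castagnoli polynomial 0x1EDC6F41 in bit-reversed (reflected) form.
CRC32C_POLY_REFLECTED = 0x82F63B78


def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C (reflected, init and final XOR of 0xFFFFFFFF)."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; if it was set, fold in the polynomial.
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


check = b"123456789"               # conventional CRC test vector
print("CRC-32C:", hex(crc32c(check)))
print("CRC-32: ", hex(zlib.crc32(check)))
```

The two results differ for the same input because of the different polynomial, which is why a file system has to pick the CRC32C polynomial specifically to benefit from the dedicated instruction.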


                          • #14
                            VIA's CPUs have supported hardware SHA-1/SHA-256 for years. Many ARM CPUs also support it in hardware.


                            • #15
                              Does it checksum metadata only? Is there checksum support for file data?

                              Also, CRC32 seems a little bit weak.
                              As far as I know, CRC32 is used internally by a lot of hardware for error detection.
                              When data is corrupted and that goes undetected at the hardware level (which means the hardware CRC32 test probably passed without error), I doubt how reliably ext4's CRC32 can detect the corruption.

                              At least an alternative implementation using MD5 should be provided. The overhead should be acceptable if the crypto libraries are written with SIMD instructions (MMX, SSE).
                              Last edited by unknown2; 07-21-2012, 08:35 PM.
