Ted Ts'o: EXT4 Within Striking Distance Of XFS


  • #31
    Originally posted by misiu_mp View Post
    Multimedia files usually tolerate being slightly corrupted. It might show as a small artefact in a movie or an image, or a crack in an audio stream.
    Applications such as video hosting would probably be great candidates for fast but less secure storage.
    I guess long-play DV tapes could be an example of the tradeoffs there: DV pushes the limits of the tapes even at standard recording rates, and switching to long-play gives you 50% more recording time but pretty much guarantees that you'll get dropouts. The codecs are good enough at detecting and hiding such dropouts that most people don't even realise they happened; for example, if a block of pixels is corrupted they typically just output the same block of pixels from the previous frame, so if the camera isn't moving much you probably won't even see the dropout.
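
A minimal sketch of the concealment idea described above, assuming a frame is just a list of blocks and that the decoder already knows which block indices failed their error check. The function name and the toy data are made up for illustration; real DV decoders are considerably more involved.

```python
def conceal_dropouts(prev_frame, cur_frame, bad_blocks):
    """Return a copy of cur_frame where every corrupted block is
    replaced by the block at the same position in prev_frame."""
    repaired = list(cur_frame)
    for index in bad_blocks:
        repaired[index] = prev_frame[index]
    return repaired


# Toy frames: each string stands in for a block of pixels (hypothetical data).
prev_frame = ["sky", "tree", "road", "car"]
cur_frame = ["sky", "????", "road", "car"]  # block 1 was lost to a dropout

repaired = conceal_dropouts(prev_frame, cur_frame, bad_blocks={1})
print(repaired)
```

If the scene is static, the substituted block is identical to what should have been there, which is why such a dropout tends to go unnoticed.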



    • #32
      Originally posted by smitty3268 View Post
      No, it's pretty clear you're the one who isn't understanding. Everyone agrees that in certain cases it's important to have data integrity. What you don't seem to get is that in certain situations it is perfectly acceptable to have data errors. Even if you don't know they are there. This has been explained to you, but you keep repeating the same stuff so I'm not sure if you're ignoring us or just don't understand the concept.
      It seems you are confused or don't understand the issue completely. kebabbert is right. If you save data, it is because you want to have it back later; otherwise you would not save it.

      Data safety is more important than speed, but you can get both with ZFS. Why do you think they use ZFS at the LHC and not anything else?



      • #33
        Originally posted by misiu_mp View Post
        Multimedia files usually tolerate being slightly corrupted. It might show as a small artefact in a movie or an image, or a crack in an audio stream.
        Applications such as video hosting would probably be great candidates for fast but less secure storage.
        Ok, so multimedia files could be put on a less secure filesystem?

        So... what if the metadata about a file is corrupted? Then you can not open the file. Is that acceptable?



        • #34
          Originally posted by kebabbert View Post
          So... what if the metadata about a file is corrupted? Then you can not open the file. Is that acceptable?
          Yes, if you have another copy. For example, from what I've read about large-scale NoSQL databases, they'll typically store multiple copies of data on different machines, so in theory if you tried to read the data from one machine and it was corrupt, you could ask a different machine instead.

          If you're already using redundant machines then you may not care whether an individual machine loses data, if that means you can improve disk performance. From what I've read about Google, they're used to entire servers failing and they just replace them when they get time because the data is duplicated elsewhere.
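
A rough sketch of the "ask a different machine" idea, under the assumption that a checksum of the data is stored separately from the copies, so a corrupt copy can be recognised. The machine names, the function names and the use of SHA-256 are illustrative choices, not a description of how any particular database does it.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used to recognise a corrupted copy."""
    return hashlib.sha256(data).hexdigest()


def read_with_fallback(replicas, expected_checksum):
    """Return (machine, data) for the first replica whose checksum matches.

    `replicas` is a list of (machine_name, data) pairs; raises OSError if
    every copy is corrupt.
    """
    for machine, data in replicas:
        if checksum(data) == expected_checksum:
            return machine, data
    raise OSError("all replicas are corrupt")


original = b"1234567890"
expected = checksum(original)  # assumed to be stored apart from the data

replicas = [
    ("machine-a", b"2234567890"),  # silently corrupted copy
    ("machine-b", b"1234567890"),  # intact copy
]

source, data = read_with_fallback(replicas, expected)
print(source, data)
```

Without the separately stored checksum, the first machine's answer would simply be accepted, so the redundancy only helps when something can tell a good copy from a bad one.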



          • #35
            Originally posted by movieman View Post
            Yes, if you have another copy. For example, from what I've read about large-scale NoSQL databases, they'll typically store multiple copies of data on different machines, so in theory if you tried to read the data from one machine and it was corrupt, you could ask a different machine instead.
            You are describing a safe solution.

            We talk about unsafe solutions, where corrupted data is allowed.



            • #36
              Originally posted by kebabbert View Post
              We talk about unsafe solutions, where corrupted data is allowed.
              I thought we were talking about filesystems which trade performance for reduced reliability?



              • #37
                Originally posted by movieman View Post
                I thought we were talking about filesystems which trade performance for reduced reliability?
                We are, whenever Kebabbert loses an argument he tries to redefine it.



                • #38
                  As mentioned before, Google is a good example of a company willing to trade data integrity for performance. I brought them up originally because I believe they are the ones who created the no-journaling patch for ext4.

                  In their case, they have multiple copies of the data scattered around the world and so whenever 1 copy gets corrupted they just take it offline and serve the data from somewhere else until it gets replicated back again. In fact, they have to do this no matter what FS they use, because they use so much hardware they are constantly getting defects going on. The disks physically stop working, and at that point if they don't have another copy they're screwed anyway.

                  Furthermore, if there is a small error, the chances of any customer actually seeing it are very slim. Most likely it will show up as a small speck on a video file, or perhaps 1 character wrong in a search result somewhere (buried 100 pages down in the results list). Compare that to performance, which affects every single user they have, all the time. They've done studies where slowing down the response time by only a few tens of milliseconds directly results in far fewer searches being performed, so that makes a big difference in terms of how much advertising revenue they bring in. So it makes sense that they would do this.

                  Now, obviously with something like finance it's not acceptable to mess up a number. That's a different situation, and probably even a more realistic situation for more people. It's just not the only one.



                  • #39
                    The comedian

                    Multimedia files don't glitch because of the bad bytes, it's the compression algorithm.

                    File-systems are fault-tolerant. You run RAID arrays to ensure data consistency. If your data is seriously important you run n mirrors, where n is a safe number you expect to contain the fault.

                    Ts'o's comparison between Football and Software Development is a case of playing Sideline Quarterback. Nobody knows what will work until they are playing the game. We've been playing for years and the coaches aren't listening.

                    Development needs to slow down. The Phoronix tests demonstrate an actual Football game. Scenarios are the practices most teams run in preparation for a big game. Stat the Quarterback's speed, food capacity, urine content, and IQ all you want; you'll never see what he's capable of unless you've got him catching snaps and slinging pigskin.

                    To measure performance, Kernel Developers are looking at raw data, which is WRONG! User experience is more important. Linus, Andrew, etc. can't seem to grasp that the majority don't have 6-core chips and 16 GB of RAM. So we perceive time relative to moving GB files and the mouse staggering across the screen. We have patches to fix that problem but Coach Linus isn't in a hurry to put the rookies' code on the grill.

                    Ts'o and everyone else, refer to the kernel tests over 5 years article. 2.6.14 nailed "176MB/s with the Linux 2.6.14 kernel."

                    Consider that with a responsive system as well. When you're pumping a hell of a lot of data on a responsive system you get a snazzy feeling. I suggest testing on an Acer Aspire.

                    I use a 600 MHz Pentium 3 and a 1 GHz Celeron for my least cases. You went to college; remember The Scientific Method. Remember The Scientific Method.

                    --
                    If you're going to do something then do it right or don't do it at all
                    If doing it the right way ends up breaking things; then do it the wrong way
                    squirrl



                    • #40
                      Originally posted by smitty3268 View Post
                      We are, whenever Kebabbert loses an argument he tries to redefine it.
                      When have I lost an argument? Can you please link to a post that shows me losing an argument or redefining it?

                      On the other hand, I can show links where you lie. For instance, here you claim that you have proved me wrong, and I ask you to show that. You never showed links where you "prove me wrong", because there are no such links. False claims about me.
                      http://www.phoronix.com/forums/showp...&postcount=106

                      Here I show that you lie again.
                      http://www.phoronix.com/forums/showp...6&postcount=88

                      In fact, I suspect you have lied in other posts as well. For instance, you claimed that New York Stock Exchange are very happy now:
                      http://www.phoronix.com/forums/showp...&postcount=159
                      Originally posted by smitty3268 View Post
                      Didn't the NYSE recently switch from Unix to Red Hat, and that's about as mission critical as things come. From everything I've heard, they've been extremely happy with Linux since the switch.
                      I doubt you know people at NYSE, and from your earlier well-known track record I suspect you made this one up too. Because I work in finance, and I have heard the opposite. As has frantaylor, who explains that NYSE is very, very cautious about their Linux switch:
                      http://phoronix.com/forums/showpost....4&postcount=64

                      So, again, show me links where I lose an argument, or where I redefine the argument. Most probably you cannot show those links because there are no such links, so this is probably your normal FUD, just as usual.



                      • #41
                        Originally posted by smitty3268 View Post
                        In their case, they have multiple copies of the data scattered around the world and so whenever 1 copy gets corrupted they just take it offline and serve the data from somewhere else until it gets replicated back again.
                        It seems you don't really understand what I am talking about.

                        How can Google notice if there is a corruption in a file? Many storage solutions (filesystems, hw raid, etc) can not detect all corruptions, especially not Silent Corruption.



                        • #42
                          Originally posted by kebabbert View Post
                          You don't understand what Data Integrity is.

                          It is not about a disk crashing or something similar. It is about retrieving the same data you put on the disk. Imagine you put this data on disk: "1234567890" but a corruption occurred, so you got back "2234567890". And the hardware does not even notice the data got corrupted. This is called Silent Corruption and it occurs all the time.
                          What stupidity and nonsense! You clearly prove that you don't understand BER, silent corruption, CRC, convolutional codes, or information theory!

                          Please don't speak of things you don't know or understand.



                          • #43
                            Originally posted by Jimbo View Post
                            What stupidity and nonsense! You clearly prove that you don't understand BER, silent corruption, CRC, convolutional codes, or information theory!

                            Please don't speak of things you don't know or understand.
                            Then, explain it to us, please.



                            • #44
                              Originally posted by kebabbert View Post
                              When have I lost an argument? Can you please link to a post that shows me losing an argument or redefining it?
                              Sure, how about this one from just a few posts up?

                              You are describing a safe solution.

                              We talk about unsafe solutions, where corrupted data is allowed.
                              Skipping over the rest of the stuff where you accuse me of lying, because frankly I'm not even interested in going over this again...

                              I doubt you know people at NYSE, and from your earlier well-known track record I suspect you made this one up too.
                              No, I don't. I read an article about it in the Wall Street Journal, and have heard news reported from other sources as well. If you're claiming some super-secret inside knowledge, then OK. But everything I've heard publicly reported was that they were very happy.



                              • #45
                                @KDesk

                                You don't get that type of error! If a small corruption occurs, let's say 1 bit, a convolutional code can repair it and you get your original data back; if a big error occurs, the CRC detects it and you get a read error. If you put 400 in your Excel sheet, you don't read back 800 when there is a read error.
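
To make the two mechanisms concrete, here is a toy sketch. It swaps in a Hamming(7,4) block code for the convolutional code mentioned above, purely because it is short enough to show; it is not what any particular drive uses. The CRC part uses `zlib.crc32`, and the values 400 and 800 are taken from the example in the post.

```python
import zlib


def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword: p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error was seen
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Small corruption: one flipped bit is repaired transparently.
data = [1, 0, 1, 1]
received = hamming74_encode(data)
received[4] ^= 1  # simulate a single-bit error on the medium
decoded = hamming74_decode(received)
print(decoded == data)

# Big corruption: the stored CRC no longer matches, so the read is rejected.
stored_crc = zlib.crc32(b"400")
crc_mismatch = zlib.crc32(b"800") != stored_crc
print(crc_mismatch)
```

Note that this toy code on its own would miscorrect a two-bit error, which is the reason for pairing the correcting code with a separate detection check.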

                                What is CERN talking about?

                                The errors CERN reported are a special corner case of RAID 5 arrays, where the firmware (not the file system) of the RAID controller introduces an error due to a malfunction: it writes data to the wrong places, and then you get this type of corruption. It's a corner case that occurs on RAID 5 arrays, and only a few controllers are affected. ZFS can work around this rare error; other file systems can't.

                                On RAID controllers compliant with the T10 Data Integrity Field standard, this bug doesn't exist, so you can set up your RAID 5 safely using ext4.
