LZO & LZ4 Security Vulnerabilities Disclosed


  • LZO & LZ4 Security Vulnerabilities Disclosed

    Phoronix: LZO & LZ4 Security Vulnerabilities Disclosed

    The latest open-source security issues uncovered affect LZO and LZ4 and the issues run back years...

    http://www.phoronix.com/vr.php?view=MTczMTQ

  • #2
    Calm down.

    Calm down, that is grossly overblown.



    • #3
      Originally posted by Karl Napf View Post
      Calm down, that is grossly overblown.
      That's right in the sense that it is not likely to cause massive pwnage. But still...
      1) LZO has been around for a while and is used by many programs. A reverse lookup in the Ubuntu repos gives 30+ dependants, and I guess there are far more lurking around, so it's hard to tell exactly what can be affected. Fortunately, the bug mostly allows causing a crash and seems hard to upgrade to code execution, since it's not possible to supply meaningful code to the process; you can just corrupt some memory. However, historically some exploits have been able to (ab)use "just" memory corruption to trigger other undesired activity, so it is better not to underestimate these bugs.
      2) In LZ4 it's somewhat worse: it looks like the bug allows an attacker to supply meaningful code as part of the data and then execute it, so it can be upgraded into code execution. Fortunately, the LZ4 author is right that for most programs using LZ4 this would not work due to their smaller block sizes; 16M chunks is a heck of a lot. However, there is no exhaustive list of all users of the library, and technically it's possible to use LZ4 in ways that allow exploiting the bug. Since it allows code execution, it could be unpleasant.
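      The wraparound behind both points can be sketched in a few lines (a deliberately simplified illustration of the 32-bit overflow pattern, with made-up numbers; this is not the actual LZO/LZ4 source):

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned counter

def output_offset_32bit(run_lengths):
    """Accumulate claimed copy lengths the way a 32-bit counter would."""
    offset = 0
    for n in run_lengths:
        offset = (offset + n) & MASK32  # silently wraps past 2**32
    return offset

# Enough crafted length fields and the counter wraps, so a later check
# like `offset <= output_size` passes while the write actually lands
# outside the intended buffer.
runs = [0xFFFF] * 65538            # claims > 4 GiB of output in total
assert sum(runs) > MASK32          # the true total overflows 32 bits
print(hex(output_offset_32bit(runs)))  # a small, "valid-looking" offset
```

      Once the counter has wrapped, the sanity check compares against a small "valid-looking" number, which is how "just" a length field can turn into an out-of-bounds write.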

      So it is a really wise idea to update these two libs before blackhats figure out what could be pwned and how to do it. After all, updating these libs has no disadvantages.

      And the curious fact is that LZO and similar algos have been around for about 20 years, but it has only come to attention recently that some properties of compressed data streams could, in some cases, be abused in really uncommon ways to fool the decompression engine.
      Last edited by 0xBADCODE; 06-28-2014, 01:16 PM.



      • #4
        Originally posted by 0xBADCODE View Post
        So it is really wise idea to update these 2 libs. Before blackhats will figure out what could be pwned and how to do it. After all, updating these libs gives no disadvantages.
        Oh, of course. But this is not a heartbleed kind of bug, just something that -- as far as we know at this time -- could potentially be used to cause a bit of mischief with programs that use those algorithms in ways that the default implementation does not support.

        Of course ingenious people might find ways to turn that mischief into a real exploit, but I actually doubt that.

        Originally posted by 0xBADCODE View Post
        And curious fact is that LZO and similar algos are here for about 20 years. But somehow it has only came to attention recently that some properties of compressed data streams could in some cases be abused in really uncommon ways to fool decompression engine.
        To me this feels more like a deliberate design decision than an actual exploit. Memory was expensive back then, so people used to go for the shortest int possible in any situation. This is actually an example of that: the integer does not overflow when the block size is in the allowed range (<=8MiB). Any bigger blocks should already have been rejected before they are passed on to this algorithm.

        This is more an expression of how things have changed in the last 20 years... away from "save as much RAM as possible" to "as secure as possible". I actually think that shift is a good thing, but I also find the media coverage hugely out of proportion to the issue at hand. Some articles I read make it sound like this is a bug that lets hackers pwn the Curiosity rover on Mars!




        • #5
          The theoretical risk would be Firefox > gstreamer > ffmpeg > root shell, but it seems impossible

          Originally posted by Karl Napf View Post
          Oh, of course. But this is not a heartbleed kind of bug, just something that -- as far as we know at this time -- could potentially be used to cause a bit of mischief with programs that use those algorithms in ways that the default implementation does not support.

          Of course ingenious people might find ways to turn that mischief into a real exploit, but I actually doubt that.
          My guess is the useful exploit would be to post a boobytrapped video file that contained the special blocks, presumably porn if the targets are presumed male. The video plays in Firefox via gstreamer-ffmpeg, hence reaching the target library through ffmpeg. The attacker would then require a privilege escalation attack to get a remote connection and a root shell. There's just one problem:

          http://fastcompression.blogspot.de/2...-bug-myth.html

          Implies that this attack would be nearly impossible, and states that on any 64-bit operating system (like most of mine) the vulnerability does not exist at all. It roughly boils down to this:

          It would require the targeted program to feed at least 16MB blocks to LZO or LZ4, but most programs using these cut off at 8MB and error out on oversized blocks. In essence, a form of external bounds checking is being used. OK, now the attack requires socially engineering someone into opening some kind of self-extracting tarball using custom code that calls LZO directly and does not error out on oversized blocks. Possibly the files would claim to contain a directory full of celebrity nudes?

          The article states that the main reason to fix this vulnerability is exactly that: the possibility that custom code in the future might not limit block sizes and would transform this from a theoretical to a real vulnerability. Example: we don't know what kind of video codecs might be written in the future, possibly for 4K video, possibly as a result trying larger block sizes.
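          That "external bounds checking" can be sketched as a guard in front of the codec (the 8MB cap and the decompress callback are illustrative stand-ins, not the real LZ4 framing code):

```python
MAX_BLOCK = 8 * 1024 * 1024  # 8MB cap typical container formats enforce

def safe_decompress(block, decompress):
    """Refuse oversized blocks instead of handing them to the codec."""
    if len(block) > MAX_BLOCK:
        raise ValueError("block exceeds format limit; refusing to decode")
    return decompress(block)
```

          With a guard like this in front, the vulnerable >16MB code path in the library is never reached; custom formats that skip it are exactly the hypothetical attack surface discussed above.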



          • #6
            Originally posted by Karl Napf View Post
            Oh, of course. But this is not a heartbleed kind of bug,
            Sure, it would not be as harmful as that. And as far as heartbleed goes, it had to be expected, and there are likely some other "funny" bugs. SSL and TLS are horribly overengineered, trying to solve all kinds of humankind's problems at once, which makes them hard to implement without bugs. To make things worse, they also come with a ton of legacy/compatibility stuff, which allows all sorts of nasty tricks. And then, OpenSSL is written by people who are not security experts at all. This leads to some silly shortcomings (in terms of crypto usage). So it has a weird API, questionable defaults and dangerous shortcomings. What do you expect? For it to be secure? LOL! You can only use TLS or SSL in a secure way if you're a cryptography guru, especially if you're about to use OpenSSL, which makes it really easy to shoot your own legs off. Obviously, the authors of most apps are not crypto gurus, so you can expect even more "funny" things to happen here and there.

            Fortunately, LZ-based algos are different. They are simple. That's what makes it really uncommon and noteworthy to encounter bugs in LZ stream parsing, and even more unusual that they've been lurking around for 20 years.

            Of course ingenious people might find ways to turn that mischief into a real exploit, but I actually doubt that.
            There have been examples of "indirect" attacks, where the attacker mounts a proper playground in indirect ways, with limited tools, by tricking legitimate code into doing something that helps the attacker's plan. So I would consider such bugs unsafe.

            away from "save as much RAM as possible" to "as secure as possible".
            Interestingly, it has nothing to do with RAM usage. The bug mostly comes down to insufficient input data validation, and the fix wouldn't increase RAM usage. Btw, LZO (and IIRC "usual" LZ4) can be decompressed "without memory" at all, in the sense that it only takes the input data and a place to store the output; no other memory is required (except maybe a few bytes for variables that don't fit in CPU registers). Also, the LZO and LZ4 compressors have compression levels with modest memory requirements (several kilobytes). So these algos are used even in small embedded systems with tight memory constraints, especially the decompressors.

            The problem is different. Earlier, many computers were not directly connected to networks or had limited connectivity. The data sets they were processing were mostly fixed and came from (more or less) trusted sources. So devs could "just write code": code had to implement the desired logic, without bothering about corner cases and strange input. Now everything is networked and humans exchange huge amounts of data. And there is trouble on the way: this data can't be trusted anymore...

            So now it's not your best friend giving your program data to chew on; it's your worst enemies, seeking free resources and valuable data. And they will feed your programs all sorts of weird crap if it helps achieve their goals. So devs should change their way of thinking. The external world has proven to be hostile; you can expect the absolute worst. It's not a compressed stream, it's a tool for attackers to control your decompressor: should it succeed, the attacker wins. It's not "just a picture", it's a format parser and compression decoder: should the parser or decoder let itself be tricked into doing something wrong, the attacker wins. It's not "just a password" the user enters into your web page; it can be an SQL injection! Or something else intended to fool the internals of the program processing this input. It's no longer text you type into a forum message; it can be JS code to redirect users to another server, or an SQL injection, or something else. It's no longer "just a file" a user uploads to your server; it can be a script trying to integrate with your CMS by fooling the CMS into executing this crap. Should it succeed, the attacker will execute a "remote shell" script and gain some free resources and valuable data. You see, you can't trust incoming data anymore, and those who thought otherwise will face hard times.
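            The SQL injection mentioned above can be made concrete with Python's stdlib sqlite3 (a toy sketch; the table and payload are made up): splice untrusted input into the SQL text and it becomes logic, pass it as a parameter and it stays plain data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

hostile = "' OR '1'='1"  # classic injection payload typed as a "password"

# Unsafe: string formatting lets the payload rewrite the WHERE clause,
# so the query matches every row.
leaked = conn.execute(
    "SELECT name FROM users WHERE password = '%s'" % hostile).fetchall()

# Safe: a placeholder keeps the payload an ordinary string value.
safe = conn.execute(
    "SELECT name FROM users WHERE password = ?", (hostile,)).fetchall()

print(leaked)  # [('alice',)] -- the payload bypassed the check
print(safe)    # []           -- the payload matched nothing
```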

            but I also find the media coverage hugely out of proportion to the issue at hand.
            These days the world is full of automated activity seeking free resources and data to steal. Just set up a web server or read your sshd logs and you will see what I mean. These things work without sleep or rest; they know no mercy. You and your computer are just some free resources and valuable data to them. So every bug which can be abused will be abused to get even more free resources and valuable data. This seriously increases the requirements for external data validation and the ability of programs to deal with uncommon corner cases.



            • #7
              If the block size needs to be bigger than the 8MB the LZ4 file format specifies as allowed in order for the exploit to work, why is this even considered a bug?

              I could just as well say every encrypted archive is compromised since people may use 1234...



              • #8
                Originally posted by Luke View Post
                My guess is the useful exploit would be to post a boobytrapped video file that contained the special blocks, presumably porn if the targets are presumed male. The video plays in Firefox via gstreamer-ffmpeg, hence reaching the target library through ffmpeg. The attacker would then require a privilege escalation attack to get a remote connection and a root shell. There's just one problem:

                http://fastcompression.blogspot.de/2...-bug-myth.html

                Implies that this attack would be nearly impossible, and states that on any 64-bit operating system (like most of mine) the vulnerability does not exist at all. It roughly boils down to this:

                It would require the targeted program to feed at least 16MB blocks to LZO or LZ4, but most programs using these cut off at 8MB and error out on oversized blocks. In essence, a form of external bounds checking is being used. OK, now the attack requires socially engineering someone into opening some kind of self-extracting tarball using custom code that calls LZO directly and does not error out on oversized blocks. Possibly the files would claim to contain a directory full of celebrity nudes?
                I think that if you get the user to execute some code (a self-extraction program), it's not very useful to use that code to exploit some vulnerability that would only let you run code with the same privileges.



                • #9
                  Originally posted by c117152 View Post
                  If the block size needs to be bigger than the 8MB the LZ4 file format specifies as allowed in order for the exploit to work, why is this even considered a bug?
                  Because data compression is one topic: the lib takes data and compresses (or decompresses) it. The encapsulation/transport format and block sizes are another topic. Technically it's possible and okay to use some custom transport format with larger blocks; in some cases larger blocks can work better and you can be unhappy with the 8Mb limit. Then there is nothing wrong with asking the compression lib: "please decompress this data, 20Mb". The compressed stream on its own allows this. Unfortunately, it turned out that in such a case the decompression code would go boom in an uncommon way and could even execute arbitrary code from the supplied data, which isn't something you would expect when asking it to decompress "these 20Mb of data". So from a formal point of view it is a bug in the decompression code.
                  Last edited by 0xBADCODE; 06-29-2014, 09:10 PM.



                  • #10
                    Yeah, it's a theoretical risk that is better to plug, just in case such a situation happens in the future.

                    However, the big headline on all those web sites pushing the same pre-formatted news suggests that it is an immediate risk, with immediate consequences for everybody. Now people will think that they have a dangerous, exploitable bug on their machine. It's great advertisement for the security firm which "disclosed" the issue in a spectacular way, but it's simply untrue. The problem is totally blown out of proportion. We are experiencing a perfect example of crying wolf.

                    For more info, read: http://fastcompression.blogspot.fr/2...s-move-on.html

                    Phoronix should know better.



                    • #11
                      Originally posted by 0xBADCODE View Post
                      ... And then, OpenSSL is written by people who are not security experts at all. ...
                      Spending a big amount of time working on a crypto library kind of makes you an expert :-) Maybe not a good programmer, but an expert nonetheless.

                      Originally posted by 0xBADCODE View Post
                      Fortunately, LZ-based algos are different. They are simple. That's what makes it really uncommon and noteworthy to encounter bugs in LZ stream parsing, and even more unusual that they've been lurking around for 20 years.
                      That was no bug 20 years ago. Back then it was a "32-bit int is enough here, as that cannot overflow with the allowed block sizes". In fact, I have trouble calling this a bug today.

                      Originally posted by 0xBADCODE View Post
                      Interestingly, it has nothing to do with RAM usage.
                      Oh, it does. The problem is a 32-bit int; the fix is using a 64-bit int. That is twice the size :-)

                      Originally posted by 0xBADCODE View Post
                      The bug mostly comes down to insufficient input data validation. The fix wouldn't increase RAM usage.
                      A block size > 8MiB is not supported, and nobody has mentioned any implementation that fails to enforce that, so where is the input validation missing? Yes, theoretically there may be some, and that is why upstream improved the library to not fail in such cases anymore. Great, let's move on.

                      Originally posted by 0xBADCODE View Post
                      The problem is different. Earlier, many computers were not directly connected to networks or had limited connectivity. The data sets they were processing were mostly fixed and came from (more or less) trusted sources. So devs could "just write code": code had to implement the desired logic, without bothering about corner cases and strange input. Now everything is networked and humans exchange huge amounts of data. And there is trouble on the way: this data can't be trusted anymore...
                      I consider LZO/LZ4 to be a good example of making sure the input is sane, considering that I have not seen anybody show a concrete implementation that is broken.

                      Originally posted by 0xBADCODE View Post
                      These days the world is full of automated activity seeking free resources and data to steal. Just set up a web server or read your sshd logs and you will see what I mean. These things work without sleep or rest; they know no mercy. You and your computer are just some free resources and valuable data to them. So every bug which can be abused will be abused to get even more free resources and valuable data. This seriously increases the requirements for external data validation and the ability of programs to deal with uncommon corner cases.
                      Right, but what does this have to do with this issue? You have to get a user to execute a vulnerable piece of code, which you most likely need to write yourself, since nobody has mentioned broken users of LZO/LZ4 yet. If your users run code random people mail to them, then LZO/LZ4 is the least thing you need to worry about.



                      • #12
                        Originally posted by Karl Napf View Post
                        Right, but what does this have to do with this issue? You have to get a user to execute a vulnerable piece of code, which you most likely need to write yourself, since nobody has mentioned broken users of LZO/LZ4 yet. If your users run code random people mail to them, then LZO/LZ4 is the least thing you need to worry about.
                        Fundamentally:
                        - LZO and LZ4 are compression methods highly tuned for compression/decompression speed. They are designed to go as fast as possible. As such they are often used locally, where speed matters most: real-time filesystem compression, cache & RAM compression, very fast boot-loaders in embedded machines.

                        - The web is about trying to save bandwidth: sacrifice as much computing power as you like, if afterwards your data is smaller and transferred using fewer bytes. Things like 7z, AdvZip, or even Zopfli exist which are slow as hell, but they let you shave a few bytes, which, once you factor in how many times a piece of data will get downloaded, costs less in data transfers. The dominant lossless algorithms are Deflate (for ubiquity) and LZMA (the best general-purpose contender).


                        You're not going to find LZ4 or LZO on the web much. They trade compression ratio for speed: they run much faster than Deflate, at the cost of worse compression. That's totally opposed to the web's principle of data-volume economy.
                        As a result you don't find LZ4/LZO in any web standard; they are simply not exposed to the web.

                        - Yes, in theory, you could do remote execution using the bug.
                        - In practice, you won't be able to find an interface to send your exploit to. The attack surface is very limited.

                        The only decompression algorithm to which you could send malicious data is Deflate (data-link compression, PNG, etc.; lots of parts of the web use it).
                        LZ4/LZO are inaccessible from the outside, as they sit between the kernel and real-time transparent filesystem compression (and if an attacker is able to corrupt the way data is physically stored on a disk partition, you already have bigger problems).



                        • #13
                          Originally posted by Karl Napf View Post
                          Spending a big amount of time working on a crypto library kind of makes you an expert:-) Maybe not a good programmer, but an expert non the less.
                          Wrong. It is your knowledge and skills that make you an expert in some area. In cryptography and security it also takes a special way of thinking: good cryptography experts and security gurus should be paranoid enough to avoid the common pitfalls. Unfortunately, the OpenSSL devs have proven they lack this skill, which is absolutely mandatory if you're doing something security-sensitive. It is not even about heartbleed, it is about overall project management and how they deal with "potentially unsafe areas". As a blatant example of incompetence: when asked for hardware crypto acceleration, OpenSSL is dumb enough to use the hardware RNG directly. No, they do not mix the hardware RNG with other entropy sources; they just take the RNG output and feed it directly to apps using OpenSSL. Should there be a backdoor in the RNG, so that it is not as random as it seems, all generated keys will be predictable. Just as already happened once in Debian due to a "small optimization", which created the need to urgently rekey thousands of machines. It is such lame and blatant incompetence that even just good devs who haven't lost their ability to think can pinpoint it. For example, the Linux kernel devs were scared by the whole idea of using just the hardware RNG as a single entropy source, and they do not even call themselves experts in security and crypto. But they are able to understand such basic things, unlike the "experts" from OpenSSL. And it's not a joke: grab a recent source of Tor and read the changelog. Tor is security-sensitive stuff for sure, and the Tor devs were forced to use their own workarounds to fix OpenSSL's idiocy. Cool, isn't it? Do you honestly think such a solution from "experts" could be secure at all? Bah, that's very unlikely. Complicated protocol? Tons of legacy cruft? Lame devs? Set sail for fail.

                          That was no bug 20 years ago. Back then it was a "32-bit int is enough here, as that cannot overflow with the allowed block sizes". In fact, I have trouble calling this a bug today.
                          You see, it only takes 16MiB of input data to cause the wraparound. By modern standards, a 16Mb chunk of data isn't something terribly large, yet it is enough to wrap around a 32-bit register.

                          Oh, it does. The problem is a 32-bit int; the fix is using a 64-bit int. That is twice the size :-)
                          It is twice the size bitwise, but it can hold 2^32 times larger numbers. Needless to say, 4G * 16M of input is a waaaaaaaay larger thing, never used in practice, and troublesome to even transmit at all.

                          A block size > 8MiB is not supported and nobody mentioned any implementation that does not do so,
                          Once again, the decompression algo is one thing and the format used to deliver the data is another. So it is a completely valid idea to ask the decompression engine to decompress that 16Mb chunk, but it is not valid if things go boom instead. A decompression engine like this is supposed to be generic enough.

                          so where is the input validation missing?
                          In the decompression engine: the LZ stream parsing failed to evaluate what happens if there is more than 16M of specially crafted data.

                          I do consider LZO/LZ4 to be a good example of making sure the input is sane, considering that I did not see anybody showing a concrete implementation that is broken.
                          It's rather a really noteworthy example of how the "better safe than sorry" principle was disregarded and forced some people to get nervous (ffmpeg/libav even released patches to plug the hole, if I remember correctly).

                          Originally posted by Karl Napf View Post
                          a vulnerable piece of code, which you most likely need to write yourself since nobody mentioned broken users of LZO/LZ4 yet.
                          From what I remember, ffmpeg was potentially affected by the LZO issue. While it's not easy to exploit in practical ways, they had to release patches.

                          If your users run code random people mail to them, then LZO/LZ4 is the least thing you need to worry about.
                          Sure, but if you're writing some code dealing with data compression, the last thing you want to encounter is an unexpected failure in the decompression engine.



                          • #14
                            That's beyond ugly and invokes a possible NSA risk

                            Originally posted by 0xBADCODE View Post
                            As a blatant example of incompetence: if OpenSSL asked for hardware crypto acceleration, OpenSSL is dumbass enough to use hardware RNG directly. No, they do not mix hardware RNG with other entropy sources. They just take RNG output and directly fed it to apps using OpenSSL. Should there be backdoor in RNG, so it not as random as it seems, all generated keys will be predictable.
                            I have heard suspicions that Intel may have made it possible to use nonrandomness in their RNG to export the CPU serial number over SSL. That, in turn, allows direct identification of a user who posted something they thought was private, if they used a credit card to buy their computer and the store recorded the computer's serial number. If the computer was bought with cash and Windows never activated, it would still permit identification after the fact if the poster is arrested with the computer, assuming the case is hot enough for the NSA to be willing to admit to this in open court.

                            This makes OpenSSL the worst possible place to use the RNG directly, and suggests blacklisting AES-NI to prevent such an export. There are many good reasons /dev/random and /dev/urandom XOR the RNG output with the traditional kernel RNG, and after the Snowden revelations many said "I told you so" to those who favored Intel's recommendation to map Intel's RNG directly to /dev/random. Now I hear OpenSSL bypasses /dev/random and /dev/urandom for direct use of an untrusted, possibly NSA backdoored RNG?
                            Last edited by Luke; 07-03-2014, 02:23 PM.



                            • #15
                              Originally posted by Luke View Post
                              I have heard suspicions that Intel may have made it possible to use as nonrandomness in their RNG to export the CPU serial number over SSL
                              I would rather be afraid that the RNG could turn out to be not-so-random. It can follow some function which is known to the NSA but unknown to you. In the worst case it can allow a third party to re-compute the "random" numbers within a very narrow time frame. So if you used such "random" numbers to create symmetric encryption keys or to derive the private key of an asymmetric scheme, the attacker can turn out to be "very lucky" and "guess" your private keys in a limited timeframe, using knowledge of the possible outputs to reduce the number of attempts from an impractical full bruteforce down to a limited and practical number of key computations. Once the attacker has "guessed" the correct private key, it doesn't matter how cool your encryption algo is. This means it is possible to forge certificates, decrypt data, fake host identity, gain unauthorised access via a duplicated key and so on.

                              This already happened once, when a Debian developer decided to "optimize" some little things around OpenSSL. The OpenSSL devs acked it as a "safe" change. However, the "safe" change meant that all private keys generated by such an OpenSSL were predictable: there were far fewer distinct private keys than there were supposed to be. Needless to say, once it was discovered, it caused an urgent need to update and then re-key all Debian/Ubuntu machines: SSH host identities could be forged, someone could use a duplicated key to log into a host allowing key-based auth, a ton of SSL certs were revoked/reissued as possibly forgeable, etc. Yet some people fail to learn from past mistakes and are about to step into a similar pitfall with potentially non-random keys one more time.

                              So yes, even the Linux kernel devs are aware it is not a good idea to use a hardware RNG directly. It's not a big deal to mix such an RNG with other entropy sources: in the worst case the entropy does not improve, in the best case it does, so you lose nothing either way. But if you use it exclusively, it's all-or-nothing, which isn't fun at all. Somehow even the Linux devs understand this while not being crypto gurus; OTOH the OpenSSL guys are not aware of such a simple fact while creating cryptographic libs! This is EPIC FAIL, to say the least. So: an overcomplicated protocol with a ton of legacy cruft, incompetent devs, a huge lib with a strange API which takes a degree in rocket science to use in secure ways... do you honestly think this crap could be secure?
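                              The mixing idea can be sketched in a few lines (a toy illustration: hw_rng below is a stand-in for a possibly backdoored hardware generator, and real kernels fold sources through a hash or DRBG rather than a raw XOR, but the principle is the same): XORing an untrusted stream with an independent one is at least as unpredictable as the stronger source alone.

```python
import os

def hw_rng(n):
    # stand-in for a hardware RNG we do not fully trust; imagine
    # this output being predictable to an attacker
    return bytes([0xAA] * n)

def mixed_random(n):
    """XOR the hardware stream with an independent entropy source."""
    a = hw_rng(n)
    b = os.urandom(n)  # independent source: the kernel pool
    return bytes(x ^ y for x, y in zip(a, b))

# Even if hw_rng is fully predictable, recovering the result still
# requires predicting os.urandom's bytes; the mix never makes it worse.
key = mixed_random(16)
```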

