LZO & LZ4 Security Vulnerabilities Disclosed


  • #11
    Originally posted by 0xBADCODE View Post
    ... And then, OpenSSL is written by people who are not security experts at all. ...
    Spending a large amount of time working on a crypto library kind of makes you an expert :-) Maybe not a good programmer, but an expert nonetheless.

    Originally posted by 0xBADCODE View Post
    Fortunately, LZ-based algos are different. They are simple. That's what makes it really uncommon and noteworthy to encounter bugs in LZ stream parsing. Even more unusual if they're lurking around for 20 years.
    That was no bug 20 years ago. Back then it was a "32-bit int is enough here, as that cannot overflow with the allowed block sizes". In fact I have trouble calling this a bug today.

    Originally posted by 0xBADCODE View Post
    Interestingly, it has nothing to do with RAM usage.
    Oh, it does. The problem is a 32-bit int, and the fix is using a 64-bit int. That is twice the size :-)

    Originally posted by 0xBADCODE View Post
    The bug mostly comes down to insufficient input data validation. A fix wouldn't increase RAM usage.
    A block size > 8 MiB is not supported, and nobody has mentioned an implementation that accepts larger blocks, so where is the input validation missing? Yes, theoretically there may be one, and that is why upstream improved the library so it no longer fails in such cases. Great, let's move on.
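    For reference, the overflow class we are arguing about looks roughly like the sketch below. It is a minimal illustration, not the actual LZO/LZ4 source; every name in it is made up, and the trailing comment shows the overflow-free way of writing the same check.

    Code:
    /* Simplified literal-run copy, for illustration only -- NOT real LZO/LZ4 code. */
    #include <stdint.h>
    #include <string.h>

    /* Run lengths are extended 255 at a time by 0xFF continuation bytes,
     * so roughly 16 MiB of 0xFF bytes pushes the result toward 2^32. */
    static uint32_t read_run_length(const uint8_t **in, const uint8_t *in_end)
    {
        uint32_t len = 0;
        while (*in < in_end && **in == 0xFF) { len += 255; (*in)++; }
        if (*in < in_end) len += *(*in)++;
        return len;
    }

    int copy_literals(const uint8_t **in, const uint8_t *in_end,
                      uint8_t **out, uint8_t *out_end)
    {
        uint32_t len = read_run_length(in, in_end);

        /* On a 32-bit build these pointer additions can wrap around the
         * address space once len approaches 2^32... */
        uint8_t *oend = *out + len;
        const uint8_t *iend = *in + len;
        if (oend > out_end || iend > in_end)
            return -1;          /* ...so wrapped pointers slip past the checks */

        memcpy(*out, *in, len); /* and a huge copy tramples memory */
        *out = oend;
        *in  = iend;
        return 0;
    }

    /* Overflow-free version of the check: compare lengths against remaining
     * space instead of adding them to pointers, and/or reject declared block
     * sizes above the format's limit (8 MiB here) before decoding starts:
     *
     *     if (len > (size_t)(out_end - *out) || len > (size_t)(in_end - *in))
     *         return -1;
     */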

    Originally posted by 0xBADCODE View Post
    The problem is different. Earlier, many computers were not directly connected to networks, or had limited connectivity. The data sets they processed were mostly fixed and came from (more or less) trusted sources. So devs could "just write code": code had to implement the desired logic, without worrying about corner cases and strange input. Now everything is networked and humans exchange huge amounts of data. And there is trouble on the way: this data can't be trusted anymore...
    I do consider LZO/LZ4 to be a good example of making sure the input is sane, considering that I did not see anybody showing a concrete implementation that is broken.

    Originally posted by 0xBADCODE View Post
    These days the world is full of automated activity seeking free resources and data to steal. Just set up a web server or read your sshd logs and you will see what I mean. These things work without sleep or rest. They know no mercy. You and your computer are just free resources and valuable data to them. So every bug which can be abused will be abused to get even more free resources and valuable data. This seriously raises the requirements for external data validation and for a program's ability to deal with uncommon corner cases.
    Right, but what does this have to do with this issue? You have to get a user to execute a vulnerable piece of code, which you most likely need to write yourself, since nobody has mentioned a broken consumer of LZO/LZ4 yet. If your users run code that random people mail to them, then LZO/LZ4 is the least of your worries.



    • #12
      Originally posted by Karl Napf View Post
      Right, but what does this have to do with this issue? You have to get a user to execute a vulnerable piece of code, which you most likely need to write yourself, since nobody has mentioned a broken consumer of LZO/LZ4 yet. If your users run code that random people mail to them, then LZO/LZ4 is the least of your worries.
      Fundamentally:
      - LZO and LZ4 are compression methods highly tuned for compression/decompression speed. They are designed to go as fast as possible, so they are often used locally where speed matters most: real-time filesystem compression, cache & RAM compression, very fast boot-loaders on embedded machines.

      - The web is about trying to save bandwidth: sacrifice as much computing power as you like, as long as the data ends up smaller and is transferred using fewer bytes. Things like 7z, Advzip, or even Zopfli exist which are slow as hell, but they shave off a few bytes, and once you factor in how many times that piece of data will be downloaded, that means less data transferred overall. The dominant lossless algorithms there are Deflate (for ubiquity) and LZMA (the best general-purpose contender).


      You're not going to find LZ4 or LZO on the web much. They trade compression ratio for speed: they run much faster than Deflate, at the cost of worse compression. That is the exact opposite of the web's principle of saving data volume.
      As a result you won't find LZ4/LZO in any web standard. They are simply not exposed to the web.

      - Yes, in theory, you could achieve remote code execution using the bug.
      - In practice, you won't find an interface to send your exploit to. The attack surface is very limited.

      The only decompression algorithm to which you can realistically send malicious data is Deflate (data-link compression, PNG, etc.; lots of parts of the web use it).
      LZ4/LZO are inaccessible from the outside, as they sit between the kernel and real-time transparent file-system compression (and if an attacker is able to corrupt the way data is physically stored on a disk partition, you already have bigger problems).
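      That said, if an application does end up feeding attacker-controlled data to LZ4, the library itself already draws the trust line at the API level. A small sketch, assuming a reasonably recent liblz4 and its documented simple API in lz4.h; the buffer sizes and payload are arbitrary:

      Code:
      #include <stdio.h>
      #include <string.h>
      #include <lz4.h>

      int main(void)
      {
          const char src[] = "an example payload, an example payload, an example payload";
          char compressed[256];
          char restored[256];

          int csize = LZ4_compress_default(src, compressed,
                                           (int)sizeof(src), (int)sizeof(compressed));
          if (csize <= 0)
              return 1;

          /* LZ4_decompress_safe() validates the stream and never writes past
           * the stated capacity -- the entry point meant for data of unknown origin. */
          int dsize = LZ4_decompress_safe(compressed, restored,
                                          csize, (int)sizeof(restored));
          if (dsize < 0) {
              fprintf(stderr, "corrupt stream rejected\n");
              return 1;
          }

          /* By contrast, LZ4_decompress_fast() trusts a caller-supplied original
           * size and skips those checks, which only makes sense for data you
           * produced yourself -- i.e. exactly the local, trusted use cases above. */
          return memcmp(src, restored, sizeof(src)) != 0;
      }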



      • #13
        Originally posted by Karl Napf View Post
        Spending a large amount of time working on a crypto library kind of makes you an expert :-) Maybe not a good programmer, but an expert nonetheless.
        Wrong. It is your knowledge and skills that make you an expert in some area. In cryptography and security it also takes a special way of thinking: good cryptography experts and security gurus have to be paranoid enough to avoid the common pitfalls. Unfortunately, the OpenSSL devs have proven they lack this skill, which is absolutely mandatory if you're doing something security-sensitive. It is not even about Heartbleed, it is about overall project management and how they deal with "potentially unsafe areas".

        As a blatant example of incompetence: if OpenSSL is asked to use hardware crypto acceleration, it is dumbass enough to use the hardware RNG directly. No, they do not mix the hardware RNG with other entropy sources. They just take the RNG output and feed it directly to apps using OpenSSL. Should there be a backdoor in the RNG, so that it is not as random as it seems, all generated keys become predictable. Just as already happened once in Debian due to a "small optimization", which made it necessary to urgently re-key thousands of machines.

        It is such lame and blatant incompetence that even merely good devs who haven't lost their ability to think can pinpoint it. For example, the Linux kernel devs were scared by the whole idea of using just the hardware RNG as the single entropy source. Granted, they do not even call themselves experts in security and crypto, yet they understand such basic things, unlike the "experts" from OpenSSL. And it's not a joke: grab a recent Tor source and read the changelog. Tor is security-sensitive stuff for sure, and the Tor devs were forced to add their own workarounds for OpenSSL's idiocy. Cool, isn't it? Do you honestly think such a solution from these "experts" could be secure at all? Bah, that's very unlikely. Complicated protocol? Tons of legacy cruft? Lame devs? Set sail for fail.

        That was no bug 20 years ago. Back then it was a "32-bit int is enough here, as that cannot overflow with the allowed block sizes". In fact I have trouble calling this a bug today.
        You see, it only takes about 16 MiB of input data to cause a wraparound. By modern standards a 16 MiB chunk of data isn't terribly large, yet it is enough to wrap a 32-bit counter.

        Oh, it does. The problem is a 32-bit int, and the fix is using a 64-bit int. That is twice the size :-)
        It is twice the size bit-wise, but it can hold 2^32 times larger numbers. Needless to say, a 4G * 16M input is waaaaaaaay larger, never used in practice, and troublesome even to transmit at all.

        A block size > 8 MiB is not supported, and nobody has mentioned an implementation that accepts larger blocks,
        Once again, the decompression algorithm is one thing and the format used to deliver the data is another. So it is a completely valid idea to ask the decompression engine to decompress that 16 MiB chunk. What is not valid is for things to go boom instead. A decompression engine like this is supposed to be generic enough.

        so where is the input validation missing?
        In the decompression engine: the LZ stream parsing failed to consider what happens when there is more than 16 MiB of specially crafted data.
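        To put a rough number on it (a back-of-the-envelope figure, not the exact constant from the code): each 0xFF continuation byte in the run-length encoding adds 255 to the length, so pushing a 32-bit accumulator past 2^32 takes about 2^32 / 255 ≈ 16.8 million such bytes, i.e. a bit over 16 MiB of crafted input. That is where the 16 MiB figure comes from, rather than something absurd like 4 GiB.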

        I do consider LZO/LZ4 to be a good example of making sure the input is sane, considering that I did not see anybody showing a concrete implementation that is broken.
        It's rather a noteworthy example of how the "better safe than sorry" principle was disregarded and forced some people to get nervous (ffmpeg/libav even released patches to plug the hole, if I remember correctly).

        a vulnerable piece of code, which you most likely need to write yourself, since nobody has mentioned a broken consumer of LZO/LZ4 yet.
        From what I remember, ffmpeg was potentially affected by the LZO issue. While it is not easy to exploit in practical ways, they had to release patches.

        If your users run code that random people mail to them, then LZO/LZ4 is the least of your worries.
        Sure, but if you just write some code that deals with compressed data, the last thing you want to encounter is an unexpected failure in the decompression engine.



        • #14
          That's beyond ugly and raises a possible NSA risk

          Originally posted by 0xBADCODE View Post
          As a blatant example of incompetence: if OpenSSL is asked to use hardware crypto acceleration, it is dumbass enough to use the hardware RNG directly. No, they do not mix the hardware RNG with other entropy sources. They just take the RNG output and feed it directly to apps using OpenSSL. Should there be a backdoor in the RNG, so that it is not as random as it seems, all generated keys become predictable.
          I have heard suspicions that Intel may have made it possible to use nonrandomness in their RNG to export the CPU serial number over SSL. That, in turn, allows direct identification of a user who posted something they thought was private, if they used a credit card to buy their computer and the store recorded the computer's serial number. If the computer was bought with cash and Windows was never activated, it would still permit identification after the fact if the poster is arrested with the computer, assuming the case is hot enough for the NSA to be willing to admit to this in open court.

          This makes OpenSSL the worst possible place to use the RNG directly, and suggests blacklisting AES-NI to prevent such an export. There are many good reasons /dev/random and /dev/urandom XOR the RNG output with the traditional kernel RNG, and after the Snowden revelations many said "I told you so" to those who favored Intel's recommendation to map Intel's RNG directly to /dev/random. Now I hear OpenSSL bypasses /dev/random and /dev/urandom for direct use of an untrusted, possibly NSA backdoored RNG?
          Last edited by Luke; 03 July 2014, 02:23 PM.



          • #15
            Originally posted by Luke View Post
            I have heard suspicions that Intel may have made it possible to use nonrandomness in their RNG to export the CPU serial number over SSL
            I would rather be afraid of the RNG turning out to be not so random. It could follow some function which is known to the NSA but unknown to you. In the worst case that allows a third party to re-compute the "random" numbers within a very narrow time frame. So if you used such "random" numbers to create symmetric encryption keys or to derive the private key of an asymmetric cipher, an attacker can turn out to be "very lucky" and "guess" your private keys in a limited time frame, using knowledge of the possible outputs to reduce the number of attempts from an impractical full brute force down to a limited and practical number of key computations. Once the attacker has "guessed" the correct private key, it does not matter how cool your encryption algorithm is. It becomes possible to forge certificates, decrypt data, fake host identities, gain unauthorised access via a duplicated key, and so on.

            This already happened once, when a Debian developer decided to "optimize" some little things around OpenSSL. The OpenSSL devs acked it as a "safe" change. However, that "safe" change meant that all private keys generated by such an OpenSSL were predictable, and there were far fewer distinct private keys than there were supposed to be (the PRNG ended up seeded by little more than the process ID, so only a few tens of thousands of distinct keys were possible). Needless to say, once it was discovered it caused an urgent need to update and then re-key all Debian/Ubuntu machines: SSH host identities could be forged, someone could use a duplicated key to log into a host that allows key-based auth, a ton of SSL certs were revoked and reissued as possibly forgeable, and so on. Yet some people fail to learn from past mistakes and are about to step into a similar pitfall with potentially non-random keys one more time.
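            To illustrate how little "brute force" is left once the seed space collapses like that, here is a toy sketch. The "key generation" below is a dummy deterministic PRNG invented for the example; nothing in it is real OpenSSL code. The point is only that with a 15-bit effective seed (a PID), recovering the private key is a loop over 32768 candidates:

            Code:
            #include <stdint.h>
            #include <stdio.h>

            struct keypair { uint64_t priv; uint64_t pub; };

            /* Dummy deterministic "key generation": same seed -> same keypair. */
            static struct keypair keygen(uint32_t seed)
            {
                uint64_t x = seed + 0x9E3779B97F4A7C15ULL;   /* toy mixer, not crypto */
                x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;
                x ^= x >> 27; x *= 0x94D049BB133111EBULL;
                x ^= x >> 31;
                struct keypair kp = { x, x * 6364136223846793005ULL + 1ULL };
                return kp;
            }

            int main(void)
            {
                /* Victim generated a key while the PRNG was effectively seeded by its PID. */
                struct keypair victim = keygen(31337);

                /* Attacker: enumerate every possible PID and compare public keys --
                 * seconds of work instead of an astronomically large brute force. */
                for (uint32_t pid = 1; pid <= 32768; pid++) {
                    struct keypair guess = keygen(pid);
                    if (guess.pub == victim.pub) {
                        printf("seed %u recovered, private key %016llx\n",
                               pid, (unsigned long long)guess.priv);
                        return 0;
                    }
                }
                return 1;
            }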

            So yes, even the Linux kernel devs are aware that it is not a good idea to use the hardware RNG directly. It's not a big deal to mix such an RNG with other entropy sources: in the worst case the entropy does not improve, in the best case it does, so you lose nothing either way. But if you use it exclusively, it's all-or-nothing, which isn't fun at all. Somehow even the Linux devs understand this without being crypto gurus, while the OpenSSL guys are unaware of such a simple fact when writing a cryptographic library! This is an EPIC FAIL, to say the least. So: an overcomplicated protocol with a ton of legacy cruft, incompetent devs, a huge lib with a strange API that takes a degree in rocket science to use securely... do you honestly think this crap could be secure?
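            As a rough illustration of that mixing idea, here is a toy sketch. It assumes an x86-64 compiler building with -mrdrnd and a Linux-style /dev/urandom; it is only meant to show the XOR construction, not how the kernel or OpenSSL actually implement their pools:

            Code:
            #include <stdio.h>
            #include <stdint.h>
            #include <immintrin.h>          /* _rdrand64_step(), build with -mrdrnd */

            /* XOR the hardware RNG with bytes from the kernel pool: if RDRAND is
             * backdoored, the result is still as strong as /dev/urandom; if the
             * kernel pool is weak, RDRAND can only help. You lose nothing. */
            static int mixed_random_u64(uint64_t *out)
            {
                unsigned long long hw = 0;
                uint64_t pool = 0;

                if (!_rdrand64_step(&hw))   /* hardware RNG (possibly untrusted) */
                    return -1;

                FILE *f = fopen("/dev/urandom", "rb");
                if (!f)
                    return -1;
                size_t n = fread(&pool, 1, sizeof pool, f);   /* kernel entropy pool */
                fclose(f);
                if (n != sizeof pool)
                    return -1;

                *out = (uint64_t)hw ^ pool; /* at least as unpredictable as the
                                               stronger of the two sources */
                return 0;
            }

            int main(void)
            {
                uint64_t r;
                if (mixed_random_u64(&r) != 0)
                    return 1;
                printf("%016llx\n", (unsigned long long)r);
                return 0;
            }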



            • #16
              Originally posted by 0xBADCODE View Post
              I would rather be afraid of the RNG turning out to be not so random.
              Hehe, funny you wrote that. Back when this thread was active I wanted to write more or less the same kind of answer (including the same Debian example), but lost my draft after a connection crash. Good that you wrote it.

              In short: the risk that the RNG exposes a unique identifier <<< the risk that the RNG isn't that random and is in fact predictable.

              In the first case, an eavesdropper could get a bit of information about you. Motivated attackers at several websites, if they coordinated, could assert that several transactions were presumably made by the same person (or at least from the same computer).

              In the second case, your privacy and identity are pretty much completely hosed. Anyone in the know about how the RNG is less random can brute-force any randomly generated key you might have (and the DSA family of keys and algorithms is awfully sensitive to the quality of randomness), impersonate you, decrypt all your communications (including ephemeral ones), etc. Basically, you're owned to the bone and all your base are belong to the NSA, the FSB and the MSS.

