Libbeauty: Another Reverse-Engineering Tool

  • Libbeauty: Another Reverse-Engineering Tool

    Phoronix: Libbeauty: Another Reverse-Engineering Tool

    Libbeauty is another open-source decompiler and reverse-engineering tool...

    http://www.phoronix.com/vr.php?view=MTU1MTU

  • #2
    This sounds like fun.



    • #3
      That could prove very useful for a variety of things. I'd love to see if this is more useful than ordinary means of reverse engineering various things like proprietary streaming formats, network protocols- hey, maybe even things like Windows APIs for WINE? Well, maybe that's going too far, but this definitely seems interesting.



      • #4
        Originally posted by scionicspectre
        That could prove very useful for a variety of things. I'd love to see if this is more useful than ordinary means of reverse engineering various things like proprietary streaming formats, network protocols- hey, maybe even things like Windows APIs for WINE? Well, maybe that's going too far, but this definitely seems interesting.
        That hardly sounds like clean-room RE, so I don't think WINE will use it, at least not for taking code directly (maybe for documentation, if a third party volunteers to write it after looking at the generated code, although there is source code available in other programs that could be documented instead). But for abandoned game engines, I don't see any company suing. I can think of a few community patches that could benefit from this kind of tool.



        • #5
          It will bring many lawsuits along the lines of "oops, this company has used my GPL code" and the like... funny times ahead indeed.



          • #6
            Originally posted by mrugiero
            That hardly sounds like clean-room RE, so I don't think WINE will use it, at least not for taking code directly (maybe for documentation, if a third party volunteers to write it after looking at the generated code, although there is source code available in other programs that could be documented instead). But for abandoned game engines, I don't see any company suing. I can think of a few community patches that could benefit from this kind of tool.
            Who do you think could benefit from that kind of software in the open source community?

            The author started this program in order to reverse engineer some x86 driver blob for a TV card he was using (If I remember correctly). He uses some clever tricks to allow supporting different input architectures (by translating the input to an intermediate machine code representation). If I'm correct, he was quite close to having useable C output already.
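            To illustrate the intermediate-representation idea, here is a minimal Python sketch. It is not libbeauty's actual design: the `IRInstr` type, the `lift_x86`/`lift_arm` front ends and the `emit_c` back end are all hypothetical names, and the "instruction sets" are toy two-instruction subsets. It only shows why lifting every input architecture to one common form lets a single back end serve all of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRInstr:
    """One architecture-neutral operation (hypothetical IR): dst = op(srcs...)."""
    op: str
    dst: str
    srcs: tuple


def lift_x86(mnemonic, operands):
    """Toy x86 front end: two-operand form, the destination is also a source."""
    dst, src = operands
    if mnemonic == "mov":
        return [IRInstr("copy", dst, (src,))]
    if mnemonic == "add":
        return [IRInstr("add", dst, (dst, src))]
    raise NotImplementedError(mnemonic)


def lift_arm(mnemonic, operands):
    """Toy ARM front end: three-operand form for arithmetic."""
    if mnemonic == "mov":
        dst, src = operands
        return [IRInstr("copy", dst, (src,))]
    if mnemonic == "add":
        dst, a, b = operands
        return [IRInstr("add", dst, (a, b))]
    raise NotImplementedError(mnemonic)


def emit_c(instr):
    """Shared back end: render one IR instruction as a C-like statement."""
    if instr.op == "copy":
        return f"{instr.dst} = {instr.srcs[0]};"
    if instr.op == "add":
        return f"{instr.dst} = {instr.srcs[0]} + {instr.srcs[1]};"
    raise NotImplementedError(instr.op)


# Two different input architectures end up as the same kind of IR operation.
x86_ir = lift_x86("add", ("eax", "ebx"))
arm_ir = lift_arm("add", ("r0", "r0", "r1"))
for instr in x86_ir + arm_ir:
    print(emit_c(instr))
```

            Supporting a new architecture then means writing one more `lift_*` function; the C-emitting side stays untouched.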

            I talked to him on several occasions, having started a project similar in nature, to reverse engineer pieces of nvidia firmware.
            My own project lives at https://github.com/rhn/edeco and although it doesn't output C-code yet, it has other interesting features, like support of incomplete ISAs and backend stubs for x86, xtensa and some other lesser-known architectures. It also works in a different way, as it does its best to rebuild the nesting structure of underlying code. In theory, that should allow me to eventually support output in *any* C-like language with a simple extension.

            Unfortunately, it wasn't mature enough to be useful when it was needed by the nouveau devs, so I got distracted with other things. That's why I'm interested in who else might want it.
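            As a rough idea of what "rebuilding the nesting structure" can start from, here is a small Python sketch that finds loop back edges in a control-flow graph with a depth-first search. This is a common textbook first step and not necessarily how edeco works; the `find_back_edges` function and the block names in `cfg` are made up for the example.

```python
def find_back_edges(cfg, entry):
    """Return edges (src, dst) whose target is still on the DFS stack.

    Each such edge jumps back to a block that is currently being explored,
    which is the usual signature of a loop.
    """
    back_edges = []
    state = {}  # node -> "active" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "active"
        for succ in cfg.get(node, []):
            if state.get(succ) == "active":
                back_edges.append((node, succ))
            elif succ not in state:
                visit(succ)
        state[node] = "done"

    visit(entry)
    return back_edges


# Hypothetical CFG of a simple while loop: head tests, body jumps back to head.
cfg = {
    "entry": ["head"],
    "head": ["body", "exit"],
    "body": ["head"],
    "exit": [],
}
loops = find_back_edges(cfg, "entry")
print(loops)
```

            Once a back edge such as body -> head is known, the blocks between its target and its source can be wrapped in a loop construct instead of being printed with gotos.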



            • #7
              Originally posted by rhn_mk1
              Who do you think could benefit from that kind of software in the open source community?
              IMO, mostly people working on projects related to either hardware whose manufacturer doesn't care about your platform, or software whose authors no longer care about it. For example, the NMA Fallout community, who fix most of the bugs in the older Fallout games, would greatly benefit from having C code for the engine, as engine bugs are the only ones they are largely unable to fix (though they have fixed a lot of them anyway).
              For example, I'm mostly incompetent in reverse engineering and assembly language (I mean, I kind of skimmed through the basics, but I wouldn't be able to read real-life code), but I'm fluent in C, to the point of being able to document it and, after I do that, fix bugs. It sometimes takes me quite some time, but I'm able to do that.
              Another example would be ScummVM (actually, there is an engine probably nobody else cares about that I'd like to play with, but again, I'm incompetent in reverse engineering and I only have binaries for it). Most of the games it supports are kind of abandonware. They are still on sale (so, if you want to play them, you should buy them), but the authors don't care anymore about the code that makes them run, as long as you buy the game.

              There is always a slight chance of someone out of their minds suing you, but compared to using it with any current MS product those chances are really marginal.



              • #8
                Anyone want to bet that libbeauty will be removed from the face of the earth within a year? I certainly hope not, but it'll definitely be an interesting situation.



                • #9
                  I don't see why anyone would have grounds for suing the maker of such software. Programs like that already exist, for example the Hex-Rays Decompiler and IDA Pro.
                  https://www.hex-rays.com/products/de...er/index.shtml

                  These programs don't produce C-code that's immediately useful to a programmer. A *lot* of information is lost when compiling and it's impossible to recover later. The output has no variable names, the structures used in it may be incomplete or split up, there might be a lot of GOTOs and awkward constructs -- especially if what you're trying to decompile was not written in C.

                  I don't see anyone using a decompiled piece of C code and just putting it in their program. This would be just marginally more useful than ripping the assembly or even binary code and including it. Therefore I don't think someone could sue the author saying "hey, I saw someone use your program to copy my code, prepare to die".

                  I think Hex-Rays Decompiler doesn't even produce pure/correct C output...



                  • #10
                    Originally posted by rhn_mk1
                    I don't see why anyone would have grounds for suing the maker of such software. Programs like that already exist, for example the Hex-Rays Decompiler and IDA Pro.
                    https://www.hex-rays.com/products/de...er/index.shtml

                    These programs don't produce C-code that's immediately useful to a programmer. A *lot* of information is lost when compiling and it's impossible to recover later. The output has no variable names, the structures used in it may be incomplete or split up, there might be a lot of GOTOs and awkward constructs -- especially if what you're trying to decompile was not written in C.

                    I don't see anyone using a decompiled piece of C code and just putting it in their program. This would be just marginally more useful than ripping the assembly or even binary code and including it. Therefore I don't think someone could sue the author saying "hey, I saw someone use your program to copy my code, prepare to die".

                    I think Hex-Rays Decompiler doesn't even produce pure/correct C output...
                    But if the code was generated by LLVM, wouldn't it be then possible to recover some info from decompilation?



                    • #11
                      Originally posted by DeepDayze
                      But if the code was generated by LLVM, wouldn't it be then possible to recover some info from decompilation?
                      No. At most, it could retain a more accurate representation of the control flow (as applying the inverse of how LLVM treats it should suffice, reducing the amount of heuristics needed), but local variable names are lost when compiling, regardless of the compiler used. Only debug builds retain those.
                      Decompilation serves mostly to have something to analyze. Building will likely not work without some analysis, as it probably couldn't guess included headers, and for bugfixing you need to know what you are looking for and to understand the code.



                      • #12
                        This would be excellent. A free decompiler that produced C. Even without loops and with awkward gotos, it would be much better than just pure disasm.

                        I should also note that last I checked, in the EU you're legally allowed to RE for interoperability purposes using any means necessary. So if my memory is correct, clean room is not needed here.



                        • #13
                          Originally posted by curaga
                          I should also note that last I checked, in the EU you're legally allowed to RE for interoperability purposes using any means necessary. So if my memory is correct, clean room is not needed here.
                          Indeed. But other jurisdictions might require clean room.

                          So the usual workflow is:
                          - take developers in an RE-friendly country (Russia is an example)
                          - RE the fuck out of your target using any possible means (a decompiler such as libbeauty, for example)
                          - use this code to analyse the workflow
                          - use pieces of the code to compile small proof-of-concept tests (see the few open-source Skype projects in Russia)
                          - document how these tests work.

                          - take a second team of developers.
                          - have these developers read the documentation produced by the previous team (but not the actual decompiled code, to avoid tainting)
                          - have these developers try to code their own re-implementation of the same functionality from scratch.
                          (add in some exchange between the two teams to clear up points where the documentation is lacking, unclear or ambiguous)
                          - now you have your own new implementation, which you can release worldwide as GPL or BSD licensed code.
                          (Note: beware of patents. If necessary, try implementing the same functionality using a different approach. It helps if the patented algorithm is a special case of a more generic approach which wasn't patented, e.g. the patented arithmetic coding is a special type of the unpatented range coding, where the range is defined as 0:1 using real numbers)


                          Speaking of decompilation, this brings back fond memories of old-school DOS-era disassemblers, which not only mapped machine code to assembler mnemonics (like any debugger does) but also tracked memory locations, the exact APIs used (INT calls such as INT 21h, INT 10h, INT 13h) and the ports used (recognising quite a bunch of hardware components), and tried to add useful comments and meaningful variable names. I managed to learn how to output WAV samples to the PC speaker simply by analyzing such output (the comments were that useful).

                          Perhaps, with some API tracking, libbeauty could manage some of the same.
                          (name some variables after the API in which they are used, e.g. "char *format" instead of "char *str_1398" if that string pointer is used as the format in subsequent fprintf calls).
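                          A minimal Python sketch of that naming idea, under stated assumptions: the `KNOWN_SIGNATURES` table and the `suggest_names` helper are hypothetical and not part of libbeauty, and the call list stands in for whatever a real decompiler would extract from the binary. It just renames auto-generated variables after the parameter position they occupy in calls to known library functions.

```python
# Hypothetical table: known library function -> parameter names by position.
KNOWN_SIGNATURES = {
    "fprintf": ["stream", "format"],
    "fopen": ["path", "mode"],
    "strlen": ["s"],
}


def suggest_names(calls):
    """Map auto-generated variable names to API parameter names.

    `calls` is a list of (function_name, [argument_variable, ...]) pairs.
    The first suggestion for a variable wins; arguments beyond the known
    parameters (e.g. variadic extras) and unknown functions are left alone.
    """
    names = {}
    for func, args in calls:
        params = KNOWN_SIGNATURES.get(func, [])
        for var, param in zip(args, params):
            names.setdefault(var, param)
    return names


# Calls as a decompiler might see them, with placeholder variable names.
calls = [
    ("fopen", ["str_1200", "str_1204"]),
    ("fprintf", ["ptr_0042", "str_1398", "var_7"]),
]
renames = suggest_names(calls)
print(renames)
```

                          A real tool would also have to handle name clashes and variables used in several roles; this sketch simply keeps the first suggestion.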
