PathScale Open-Sources The EKOPath 4 Compiler Suite


  • #61
    Mesa

    I'm guessing Mesa won't benefit too much from this, although it would be nice to be proved wrong.

    But a lot of the performance-sensitive code is already being generated by LLVM and, like hand-written assembly, it won't be affected by compiling Mesa with a different compiler.

    Perhaps the Intel driver, especially on hardware without T&L support, would see a nice boost. Or perhaps the shader compiler performance?



    • #62
      std::map and std::set performance worse

      I have run some tests which show that std::map and std::set perform worse with EKOPath than with gcc. The following test runs in 8.8s when compiled with gcc, but takes 11.7s when compiled with EKOPath: http://pastebin.com/YhXxb3km
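For readers who cannot open the pastebin link, here is a minimal sketch of the kind of std::map / std::set timing test being discussed. It is not the poster's test: the workload, the key generator and the names `BenchResult` and `run_map_set_bench` are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical workload (NOT the pastebin test): insert n pseudo-random keys
// into a std::map and a std::set, then look every key up again.
struct BenchResult {
    std::size_t map_size;  // number of distinct keys stored in the map
    std::size_t set_size;  // number of distinct keys stored in the set
    std::size_t hits;      // successful lookups across both containers
    double seconds;        // wall-clock time for inserts plus lookups
};

BenchResult run_map_set_bench(std::size_t n) {
    const auto start = std::chrono::steady_clock::now();

    std::map<std::uint32_t, std::uint32_t> m;
    std::set<std::uint32_t> s;

    std::uint32_t key = 12345u;
    for (std::size_t i = 0; i < n; ++i) {
        key = key * 1664525u + 1013904223u;  // simple LCG, deterministic keys
        m[key] = static_cast<std::uint32_t>(i);
        s.insert(key);
    }

    // Replay the same key sequence and count how many lookups succeed.
    std::size_t hits = 0;
    key = 12345u;
    for (std::size_t i = 0; i < n; ++i) {
        key = key * 1664525u + 1013904223u;
        hits += m.count(key) + s.count(key);
    }

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {m.size(), s.size(), hits, elapsed.count()};
}
```

Calling `run_map_set_bench` with a few million keys from a small `main`, building the same file with each compiler and the same flags, and comparing the `seconds` field gives a rough like-for-like comparison. Absolute numbers will differ from the ones quoted above.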



      • #63
        Originally posted by ChrisXY
        What I wonder is:

        Where are the actual speedups in ekopath? I mean, if they are in the backend - could they not be ported to gcc or even partly replace the gcc backend?
        Sure you can port optimizations, but not every ball of yarn is created equal. Compilers are, generally speaking, extremely complicated pieces of software, and the effectiveness of an optimization can depend on many things. EKOPath has been engineered for performance and we're in a good position to stay ahead in the areas we focus on.



        • #64
          Huge thanks to PathScale for open sourcing this. I was thinking the benchmark results indicated this was related to GPGPU, but seeing they're 'only' CPU-based makes them very impressive indeed.

          Judging by some of the (admittedly limited) benchmarks available, it could certainly make a huge difference in a lot of heavily computational tasks (compressing, encoding, rendering, emulation etc).

          In one word, AWESOME!



          • #65
            Originally posted by zester
            git clone git://github.com/path64/compiler.git
            This isn't all of it; more sources will be available soon.



            • #66
              Hmm... First test run the compiler crashes.

              Code:
              /<snip>/bin/pathcc -march=native -DC99_INLINE -c -o target.o target.c
              Signal: Segmentation fault in Lightweight Inliner phase.
              Error: Signal Segmentation fault in phase Lightweight Inliner -- processing aborted
              *** Internal stack backtrace:
                  /<snip>/lib/4.0.10/x8664/inline() [0x4aec5e]
                  /<snip>/lib/4.0.10/x8664/inline() [0x4aeab1]
                  /<snip>/lib/4.0.10/x8664/inline() [0x4ada27]
                  /<snip>/lib/4.0.10/x8664/inline(ErrMsg_Report+0x44) [0x4afd7c]
                  /<snip>/lib/4.0.10/x8664/inline(ErrMsgLine+0xb1) [0x4afc77]
                  /<snip>/lib/4.0.10/x8664/inline() [0x4aeec0]
                  /lib/x86_64-linux-gnu/libc.so.6(+0x33d80) [0x2b7349a4fd80]
                  /<snip>/lib/4.0.10/x8664/be.so(+0x49fbaf) [0x2b7348e7fbaf]
                  /<snip>/lib/4.0.10/x8664/be.so(ErrMsg_Report+0x28) [0x2b7348e80c94]
                  /<snip>/lib/4.0.10/x8664/be.so(ErrMsg+0xbc) [0x2b7348e80c6a]
                  /<snip>/lib/4.0.10/x8664/be.so(+0x45d494) [0x2b7348e3d494]
                  /<snip>/lib/4.0.10/x8664/be.so(Configure_Target+0x43) [0x2b7348e3d901]
                  /<snip>/lib/4.0.10/x8664/be.so(Configure+0x109) [0x2b7348d4c719]
                  /<snip>/lib/4.0.10/x8664/inline() [0x47876f]
                  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xff) [0x2b7349a3aeff]
                  /<snip>/lib/4.0.10/x8664/inline() [0x427679]
              pathcc INTERNAL ERROR: /<snip>/lib/4.0.10/x8664/inline died due to signal 4
              make: *** [pajek/in.o] Error 1
              It also couldn't compile <errno.h>, but once I tweaked my code enough it compiled without -march. The resulting math-heavy test case (sorry, can't share code) was 5-6% faster than gcc at both -Os and -O2, but resulted in 15% larger binaries with -Os.



              • #67
                Originally posted by allquixotic
                Actually, I don't think that includes any of their proprietary components. This is just their open source fork of Open64.

                I don't know all the details, but here's my GUESS:

                1. They forked Open64
                2. They made changes to Open64 sources, and had to release those changes to comply with the GPL
                3. They made some additional changes that they deemed "mere aggregation" and were able to keep those proprietary
                4. Since the "path64" github branch has been around a while and hasn't received a commit in 2 days, it's safe to assume that it doesn't include the proprietary addons

                So, likely only the binaries are available for now.
                baaah.. you people drive me crazy about open64!!

                1) We "forked" pro64 about 8 years ago and have imported nothing from open64
                2) Open64 imported heavily from PathScale tarballs that were released previously, so in reality it's a fork of us! (Check their early commit logs to see what I mean)
                3) More "stuff" is coming as open source and will be available at our pathscale github account. Path64 won't get any more sources added to it directly.



                • #68
                  Originally posted by Otus
                  It also couldn't compile <errno.h>, but once I tweaked my code enough it compiled without -march. The resulting math-heavy test case (sorry, can't share code) was 5-6% faster than gcc at both -Os and -O2, but resulted in 15% larger binaries with -Os.
                  Binary size doesn't matter for performance as much as locality does. We will work on binary size; it's a known issue, especially for C++. Our -Os needs significant improvement, so don't expect us to win there currently.

                  Try -O3 or -Ofast if the code is loop intensive.
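To illustrate the locality point, here is a minimal sketch, assuming "locality" refers to how memory is accessed (the reply may equally have meant instruction locality). Both functions compute the same sum over a row-major matrix, but the second strides through memory and on large inputs typically runs slower, regardless of binary size. The function names are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row-major, walking memory contiguously.
long long sum_row_major(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += a[r * cols + c];  // consecutive addresses
    return total;
}

// Same sum, but consecutive accesses are cols elements apart,
// which tends to waste most of each cache line on large matrices.
long long sum_col_major(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += a[r * cols + c];  // stride of cols elements
    return total;
}
```

Timing both functions on a matrix much larger than the CPU cache is one way to see the effect; whether a given compiler interchanges the loops on its own at -O3 or -Ofast is not something this sketch assumes.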



                  • #69
                    Originally posted by XorEaxEax
                    Huge thanks to PathScale for open sourcing this. I was thinking the benchmark results indicated this was related to GPGPU, but seeing they're 'only' CPU-based makes them very impressive indeed.

                    Judging by some of the (admittedly limited) benchmarks available, it could certainly make a huge difference in a lot of heavily computational tasks (compressing, encoding, rendering, emulation etc).

                    In one word, AWESOME!
                    This release is all CPU and nothing GPU related.



                    • #70
                      This is probably a stupid question but just to be sure - what exactly was compared in the tests? A benchmark program A compiled with GCC against the same benchmark program A compiled with PathScale, with the rest of the system being exactly the same?



                      • #71
                        Originally posted by Pahanilmanlintu
                        This is probably a stupid question but just to be sure - what exactly was compared in the tests? A benchmark program A compiled with GCC against the same benchmark program A compiled with PathScale, with the rest of the system being exactly the same?
                        Yes, including maintaining all of the same compile flags.
                        Michael Larabel
                        http://www.michaellarabel.com/



                        • #72
                          Another question...

                          So the improvement in speed won't be that magnificent with an old CPU, right?
                          But nonetheless there should be an improvement in computational applications?
                          So, what about archiving tools like rar, 7z, etc.?

                          I've got a laptop here, running (nearly at light speed) on an "AMD Turion(tm) 64 Mobile Technology ML-37 (1 cpu cores)" @2GHz (2 gigs RAM).

                          Extracting huge archives takes a lot of time. Is an improvement to be expected just for compression, or rather for decompression, or both?
                          As I really do work on that laptop, even 5% (think of 20 minutes for extracting bigger archives) would be fantastic!

                          Thanks in advance, and thank you Phoronix for the wonderful time I was able to spend enjoying your articles, and thank you PathScale as well for your great decision!
                          Last edited by mastah; 06-13-2011, 03:17 PM.



                          • #73
                            Hello,

                            I tested the benchmark given above, which exercises the STL (http://pastebin.com/YhXxb3km), and I found how to make EKOPath faster than Clang and GCC:

                            1: g++ -O3 -fwhopr -o test test.cpp
                            2: clang++ -O3 -o test test.cpp
                            3: .../pathcc -Ofast -c -o test.o test.cpp -IPA -apo && .../pathcc -o test test.o .../lib/4.10.0/x8664/64/*.a -lpthread -ldl -ipa
                            4: .../pathcc -Ofast -c -o test.o test.cpp -IPA -apo -stl_not_threadsafe && .../pathcc -o test test.o .../lib/4.10.0/x8664/64/*.a -lpthread -ldl -ipa

                            The EKOPath command is split into separate compile and link steps because that was the only way I could get the program to build.

                            And the results:

                            1: 20.080s
                            2: 21.652s
                            3: 19.989s
                            4: 17.720s (the fastest)

                            We can see that disabling STL thread-safety can pay off for single-threaded programs, and that even without this option EKOPath is a little bit faster. It is slower, however, if we don't use its advanced options such as interprocedural optimization (the equivalent of GCC's LTO) and vectorization.

                            A good compiler, even if it produces very long assembly code full of rare instructions like ldmxcsr, fldcw and fnstcw that I had never encountered before.
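As background on those instructions: fnstcw and fldcw store and load the x87 control word, and ldmxcsr loads the SSE control/status register (MXCSR). Both registers hold, among other things, the floating-point rounding mode. The sketch below is only an illustration of code that legitimately touches that state, using the standard `<cfenv>` interface. On x86-64 the library typically implements these calls with the instructions named above, but that is an assumption about the platform, and nothing here claims this is why EKOPath emits them for the benchmark. The helper name `round_with_mode` is made up.

```cpp
#include <cfenv>
#include <cmath>

// Round x to an integral value under the given rounding mode
// (FE_UPWARD, FE_DOWNWARD, FE_TONEAREST, FE_TOWARDZERO),
// then restore whatever mode was active before the call.
double round_with_mode(double x, int mode) {
    const int saved = std::fegetround();  // remember the current mode
    std::fesetround(mode);                // switch the rounding mode

    // volatile keeps the compiler from folding the rounding at compile
    // time or moving it across the mode changes.
    volatile double input = x;
    volatile double result = std::nearbyint(input);

    std::fesetround(saved);               // restore the previous mode
    return result;
}
```

For example, `round_with_mode(2.5, FE_UPWARD)` rounds towards positive infinity while `round_with_mode(2.5, FE_DOWNWARD)` rounds towards negative infinity, and the caller's rounding mode is unchanged afterwards.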



                            • #74
                              Should I expect a speedup with C++ as well? I tried the compiler and my path tracer took almost twice as long with EKOPath (-Ofast) as with GCC (-O2) or ICC (no extra flags). Nice to see it go open source though.



                              • #75
                                Originally posted by marwi509
                                Should I expect a speedup with C++ as well? I tried the compiler and my path tracer took almost twice as long with EKOPath (-Ofast) as with GCC (-O2) or ICC (no extra flags). Nice to see it go open source though.
                                -Ofast isn't always the best flag. Also, did you mean compile time or the speed of the resulting binary? Please follow up with support@pathscale.com to give more details. Slow compile times are a known issue and we're working on it.
