AMD FX-8150 With The Open64 5.0 Compiler

  • AMD FX-8150 With The Open64 5.0 Compiler

    Phoronix: AMD FX-8150 With The Open64 5.0 Compiler

    The Open64 5.0 compiler was released earlier this month with many changes; among the prominently noted items were greater optimizations for AMD's Bulldozer CPUs. This article offers a first look at Open64 5.0 performance compared to the earlier release, as tested on an AMD FX-8150 eight-core "Bulldozer" processor.

  • #2
    Up to 30% improvement!

    Would be interesting to see if the new compiler improves for Intel CPUs in that benchmark too.

    • #3
      Not much improvement for POV-Ray floating point work.

      A lot of the whining against Bulldozer has come from Windows users and has been about its lack of floating point performance. Of course, with technologies like OpenCL, it's far better to be doing floating point math on GPUs than CPUs since they're over 1000x faster at it. A lot of game companies still do WAYYY too much floating point work on the CPU. Games such as Bad Company 2 and the like do all their physics calculations on the CPU when they really should be done on the GPU, since GPUs are designed for those types of massively parallel floating point calculations and CPUs really aren't.

      Thankfully there are physics engines out such as Havok that run under OpenCL, which means game companies don't have any excuse to continue using the CPU for so much floating point work. I think that makes these Bulldozer chips a good choice long-term, since they can beat Intel's more expensive chips in integer performance. Of course, the open source Linux drivers are a long way from supporting OpenCL, though not many serious gamers (Crysis 3, Battlefield 3, etc.) run those drivers anyway. OpenCL is going to become much more important in the future as AMD shifts the focus of floating point away from its CPU cores and towards its APU / GPU cores, which run OpenCL for floating point work.
      Last edited by Sidicas; 25 November 2011, 08:36 AM.

      • #4
        Thankfully there are physics engines out such as Havok that run under OpenCL, which means game companies don't have any excuse to continue using the CPU for so much floating point work.
        Hm, I thought "available market" was a good excuse. If your minimum requirement is a mid-range HD6k or GTX4xx card, that's cutting out a lot of people.

        • #5
          Originally posted by sabriah
          Up to 30% improvement!

          Would be interesting to see if the new compiler improves for Intel CPUs in that benchmark too.
          That's exactly the problem with almost all benchmarking sites, Phoronix notwithstanding.

          Tom's Hardware concluded that the 6-core Sandy Bridge-E was 30% faster than the FX-8150. How much of that came down to compiler optimizations, especially since a rather suspicious number of benchmarks are compiled with Intel's own ICC compiler? There is no mainstream AMD-biased compiler to balance out the results. Of the 30 benchmarks, the average user will use between 0 and 3 of those applications in real life, yet they will be "recommended" to buy the Intel CPU based on a useless aggregate score that is distorted by synthetic benchmarks like Futuremark, which always favor Intel by an unrealistic amount compared to real life.

          30% isn't that big of a difference in real life anyway (assuming it's even really 30%), especially since most CPUs are in idle/power-saving mode most of their lifetime. If AMD would market Bulldozer as a quad-core with superior hyperthreading, then it suddenly becomes the world's fastest consumer-grade CPU, since that 6-core SB-E CPU requires 30% more die size, 2 more cores, and costs 4x as much to achieve only 30% more performance.

          *Posted from my screaming fast FX-8120*
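
The value argument in this post can be worked through as a quick calculation. The sketch below only plugs in the figures claimed in the post (30% more performance at 4x the price, with the FX-8150 as the 1.0 baseline); those numbers are the poster's, not measurements, and the helper name `perf_per_price` is made up for illustration.

```python
def perf_per_price(relative_perf, relative_price):
    """Performance per unit of price, relative to a 1.0 / 1.0 baseline."""
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_perf / relative_price


# Figures as claimed in the post, with the FX-8150 as the baseline.
sbe_value = perf_per_price(relative_perf=1.30, relative_price=4.0)
fx_value = perf_per_price(relative_perf=1.0, relative_price=1.0)

print(f"SB-E performance per dollar relative to FX-8150: {sbe_value / fx_value:.3f}")
```

Taken at face value, those figures put the Sandy Bridge-E at roughly a third of the FX-8150's performance per dollar. The result is only as good as the two inputs.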

          • #6
            Offloading FP to GPU should mean lower prices than current CPUs

            Offloading FP to GPU should mean lower prices than current CPUs since it means the CPU was designed to do less than current CPUs.

            One thing traditional benchmarks don't make clear is what the capabilities of the CPUs actually are; benchmarks should identify those capabilities and then test them. That way you get an idea of what you are paying for rather than what the quality of software X is. Software X may be written by a crappy programmer.

            Has BD 8150 been compared to previous Phenom IIs? That way we would have an idea of performance compared to past CPUs and if the price is justified.

            • #7
              Originally posted by linux5850
              Has BD 8150 been compared to previous Phenom IIs? That way we would have an idea of performance compared to past CPUs and if the price is justified.
              Yup, and unless you are doing something like cryptography you are better off with an X6 at this point.

              • #8
                Originally posted by deanjo
                Yup, and unless you are doing something like cryptography you are better off with an X6 at this point.
                The comparisons that have been done against the Phenom IIs have used applications compiled without Bulldozer optimizations, under an OS with a thread scheduler that doesn't understand Bulldozer's modular design (2 cores per module with some shared resources).

                So no, the comparisons against the Phenom II aren't fair at all.

                • #9
                  Originally posted by sabriah
                  Up to 30% improvement!

                  Would be interesting to see if the new compiler improves for Intel CPUs in that benchmark too.
                  If you compile those applications with the Bulldozer optimizations, the compiled binaries don't run on Intel CPUs, so I don't see how such a comparison could be made. I think there is a way to compile a binary so that it only enables the Bulldozer optimizations if you have a Bulldozer CPU, but I'm not sure if Open64 does this or what options need to be set to do it. Even in that situation, the Intel chips wouldn't get *ANY* of the Bulldozer optimizations anyway. So I'd say it's a pretty safe bet that the Bulldozer optimizations are only applicable to Bulldozer CPUs and would not help Intel CPUs at all, since Intel CPUs currently don't even support FMA3 (coming 2013 for Intel), let alone FMA4.

                  Keep in mind, Bulldozer runs FMA4 while future Intel CPUs will run FMA3. The two instruction sets are not interchangeable, though I'm sure there are some tricks to get a binary to run FMA4 on Bulldozer CPUs and FMA3 on Intel CPUs (different compiled paths). Certainly a lot of the performance boosts in this new Open64 compiler revolve around using FMA4. It's the only compiler out there besides GCC that has FMA4 accelerations on the drawing board.
                  Last edited by Sidicas; 25 November 2011, 10:42 PM.
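
The "different compiled paths" idea in this post can be sketched roughly as follows. This is only an illustration of the selection logic a dispatcher might apply, written in Python and assuming a Linux system where `/proc/cpuinfo` reports `fma4` for FMA4 and `fma` for FMA3. The function names `read_cpu_flags` and `pick_fma_path` are hypothetical, and nothing here describes how Open64 or GCC actually handle dispatch.

```python
from pathlib import Path


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the set of CPU feature flags listed in a Linux cpuinfo file.

    Returns an empty set if the file is missing or has no "flags" line.
    """
    try:
        text = Path(cpuinfo_path).read_text()
    except OSError:
        return set()
    for line in text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme de ... fma4 ..."
            return set(line.partition(":")[2].split())
    return set()


def pick_fma_path(flags):
    """Name the code path a dispatcher would select for these CPU flags."""
    if "fma4" in flags:
        return "fma4"     # four-operand form, as on Bulldozer
    if "fma" in flags:
        return "fma3"     # three-operand form
    return "generic"      # no fused multiply-add reported


if __name__ == "__main__":
    print(pick_fma_path(read_cpu_flags()))
```

If a CPU reports both flags, this sketch arbitrarily prefers FMA4; a real dispatcher would pick based on measured performance. A binary built this way carries all three code paths and checks once at startup, which is why an Intel CPU would simply fall through to a path without any FMA4 code.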

                  • #10
                    Originally posted by deanjo
                    Yup, and unless you are doing something like cryptography you are better off with an X6 at this point.
                    Wow, what a sweeping generalization... and a very misleading one at that.

                    How about something like:

                    "Unless you're building a PC to run the Cinebench single threaded benchmark, you're better off with a Core 2 Duo."

                    Bulldozer did have some regressions, mostly in single threaded benchmarks. However, it's also faster than the Phenom II X6 in many single threaded benchmarks, and almost universally faster in well threaded benchmarks.

                    I'm posting from an FX-8120, and it feels faster than any Sandy Bridge, Nehalem or Phenom II I've ever used. I have the following windows open:

                    Eclipse (EPIC-Perl)
                    NetBeans (PHP)
                    Firefox
                    A VirtualBox VM running an Apache/PHP/PostgreSQL test server
                    A VirtualBox VM running an SVN server
                    pgAdmin3
                    Several terminals
                    Gedit
                    ...and a few more random windows

                    And never a hint of lag, despite running 2 craptastic Java-based IDEs at the same time. I can even do something CPU intensive like creating a TrueCrypt volume or compiling the Linux kernel, and still no slowdown whatsoever. I hate to break it to you, but a quad core Sandy Bridge cannot do all of those things and still be perfectly responsive, especially if you're using its IGP.
