AMD FX-8150 With The Open64 5.0 Compiler


  • bachinchi
    replied
    Originally posted by Sidicas View Post
    The Bulldozer chips do very well against the Intel chips in integer performance... You certainly get your money's worth there. Things like compiling apps and general desktop multitasking are helped a lot by integer performance, and it's why Intel tries to cram Hyperthreading into their CPUs wherever they can (e.g. LGA-2011). You can see that in the Dhrystone benchmarks done by every review site. The Bulldozer FX-8150 chips are no more than 10% slower in integer performance than a 2600K, and the 2600K is 18% more expensive ($50).


    What most review sites show Bulldozer lacking in is its floating point performance. Bulldozer is a bit weak in floating point because of the shared FPUs. To make up for that shortcoming, AMD built these Bulldozer chips to support FMA4 instructions, which aren't used until apps are compiled with them enabled. Benchmarks that show Phenom II anywhere near the performance of Bulldozer are running apps that haven't been compiled with FMA4, it's as simple as that. When FMA4 is compiled into the binary, floating point on Bulldozer goes up a solid 30% across the board, leaving Phenom II CPUs a long way behind. You're not always going to see that 30% comparing the compiled binaries of Open64 v5 to Open64 v4, but it shows up when you compare Open64 v4 to -O2 GCC or Open64 v5 to -O2 GCC for floating point apps. In Open64 v4, POV-Ray jumped up a solid 30%, and in Open64 v5 you can see some other floating point apps that didn't jump up 30% in Open64 v4 get their 30% boost instead.

    If people want to pay 18% more money for <10% more performance, then that's up to them. Intel has been targeting the enthusiast market for a long time and they continue to do so. With AMD, you continue to get more bang for your buck, as has been true for many years. You might have to jump through a hoop or two to get that floating point performance on these Bulldozer chips up (recompiling floating point heavy apps with FMA4), but really there's not much there to argue about. Especially considering that, with OpenCL, floating point apps should be pushing their floating point work to the GPU, as it's over 1000x faster at it. Even the Fusion integrated GPUs (e.g. Radeon 6550D) are dozens of times faster at floating point than the fastest $999 Intel CPUs are. The days of doing floating point on the CPU are coming to an end. A modern GPU has got several hundred shader "cores" that can all process floating point calculations in parallel; there's no reason to run them on a 4 or even 8 core CPU.
    That's so true. But we won't see those optimizations soon, I guess. I really hope the next version of GCC (and other compilers, including VS) includes better optimizations for Bulldozer.



  • WillyThePimp
    replied
    Originally posted by leeenux View Post

    No, the point was that it's not that hard to make all of your real and fake threads hit 100%. I think for your testing to have any real value, you'd have to try it on something other than the perpetually-broken KDE. However, I do stand by my point, I'd rather have 8 real-ish cores than 4 real and hyperthreading.
    A proven lower-performing and more power-hungry 8 cores is better than 4 energy-efficient and much faster ones? What kind of sorcery is this?



  • Tgui
    replied
    Originally posted by leeenux View Post
    Fair enough. It looks like the Core i7s get VT-x and VT-d, and that the Core i5s get just VT-x. Still, AMD gives you full virtualization acceleration up and down all of their product lines, not just on CPUs north of $200.
    Are you capable of doing any basic research on your part before joining a discussion?

    Core i5 2400, VT-x VT-d
    http://ark.intel.com/products/52207/...he-3_10-GHz%29
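
For anyone who would rather check their own box than the spec sheet: on Linux the CPU flags in /proc/cpuinfo list `vmx` for Intel VT-x and `svm` for AMD-V. Below is a minimal sketch of that check. The function name and the sample text are made up for illustration, and note that VT-d is an IOMMU feature, so it does not show up as a CPU flag and this check says nothing about it.

```python
def virtualization_flags(cpuinfo_text: str) -> set:
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text.

    'vmx' means Intel VT-x, 'svm' means AMD-V. VT-d is not a CPU flag,
    so it cannot be detected this way.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only the "flags" line carries the feature list; skip everything else.
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Hypothetical, shortened sample of what an Intel CPU might report.
sample = "processor\t: 0\nflags\t\t: fpu vme sse2 vmx avx\n"
print(virtualization_flags(sample))
```

On a real machine you would pass in `open("/proc/cpuinfo").read()` instead of the sample string. An empty result can also mean the feature is switched off in the BIOS, not only that the CPU lacks it.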



  • leeenux
    replied
    Originally posted by schmalzler View Post
    This is plain wrong!
    http://ark.intel.com/products/52213/...he-3_40-GHz%29
    Look down to the "Advanced Technologies".
    Fair enough. It looks like the Core i7s get VT-x and VT-d, and that the Core i5s get just VT-x. Still, AMD gives you full virtualization acceleration up and down all of their product lines, not just on CPUs north of $200.

    Originally posted by schmalzler View Post
    No, the demonstration just was that those CPU-intensive tasks made KWin's compositing perform badly. Everything runs fine without compositing. I think the problem is that the iGPU does not have its own memory (your 5450 has its own), and if the CPU pushes quite a lot of data through the memory bus, the iGPU will suffer. Managing textures for many windows and of course rendering to a pixmap for each cursor blink will probably go slower, hence the "lag". Turn off compositing and everything is fine, remember? Even the animation of Yakuake goes fine: with compositing turned on, Yakuake uses KWin for the animation, which is not that smooth when the CPU is under extreme load; without compositing, it has to calculate it on its own - on the CPU! And that goes as smoothly as if there were no task running! So the "lag" is not a problem of bad CPU design, but of the GPU not having its own fast memory.
    No, the point was that it's not that hard to make all of your real and fake threads hit 100%. I think for your testing to have any real value, you'd have to try it on something other than the perpetually-broken KDE. However, I do stand by my point, I'd rather have 8 real-ish cores than 4 real and hyperthreading.

    Originally posted by schmalzler View Post
    BTW, I decided to go with SB because Bulldozer consumes way too much power under load. If the power consumption were a few percent above Sandy Bridge and performance 20% below, I would have bought a Bulldozer (even if I would have had to go mATX or ATX instead of mITX as I did with Sandy Bridge) - but with THAT power consumption... Even if BD performed better than Sandy Bridge, I would not buy it :/
    That's really a poor rationale unless the machine you're running it on does rendering tasks 24/7 and will actually stay at that TDP constantly. If that's how you feel about it, why not take it one step further and get a 65 W dual core? If, hypothetically speaking, my Bulldozer and your Sandy both spend 5 or 10% of the day at 100% load, and the rest of the time at idle, your power savings will be negligible at best. 95 W is Intel marketing speak for "of course TDP went down since we did a die shrink, but we still refuse to throw you a couple more cores." BTW, AMD also has 95 W 6-cores from this generation and even the 45 nm generation.
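
To put a rough number on the duty-cycle argument, here is a back-of-the-envelope sketch. It rests on assumptions that are not in the post: the 125 W and 95 W TDP ratings stand in for full-load draw (measured draw can differ a lot from TDP), and both machines are assumed to idle at the same power. The function name is made up for illustration.

```python
def extra_kwh_per_day(load_watts_a: float, load_watts_b: float,
                      load_fraction: float, hours: float = 24.0) -> float:
    """Extra energy per day that chip A uses over chip B, in kWh.

    Assumes both chips idle at the same power, so only the time spent
    at full load (load_fraction of the day) contributes to the difference.
    """
    return (load_watts_a - load_watts_b) * load_fraction * hours / 1000.0


# Assumed figures: TDP ratings used as stand-ins for real full-load draw.
for fraction in (0.05, 0.10):
    per_day = extra_kwh_per_day(125.0, 95.0, fraction)
    print(f"{fraction:.0%} of the day at load: "
          f"{per_day:.3f} kWh/day, {per_day * 365:.1f} kWh/year")
```

Under those assumptions the gap is a few dozen kWh per year. If the real full-load difference is larger than the 30 W TDP gap, the result scales up in proportion.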



  • PsynoKhi0
    replied
    Originally posted by schmalzler View Post
    BTW, I decided to go with SB because Bulldozer consumes way too much power under load. If the power consumption were a few percent above Sandy Bridge and performance 20% below, I would have bought a Bulldozer (even if I would have had to go mATX or ATX instead of mITX as I did with Sandy Bridge) - but with THAT power consumption... Even if BD performed better than Sandy Bridge, I would not buy it :/
    Oh well... This thread is off-topic to the point of no return already, so: whatever Intel comes up with, I really doubt I'll ever buy anything from them, due to the strong-arming of OEMs that was going on during the Athlon 64 days. No matter which camp you root for, that kind of backstage fuss benefits no one (aside from Intel themselves, of course...).



  • schmalzler
    replied
    Originally posted by Sidicas View Post
    You know Intel doesn't put VT-x or other virtualization accelerations into their desktop CPUs anymore?
    This is plain wrong!
    http://ark.intel.com/products/52213/...he-3_40-GHz%29
    Look down to the "Advanced Technologies".

    Originally posted by leeenux
    I cannot foresee a real world scenario that could bring my Bulldozer to its knees, whereas you demonstrated that 3 simultaneous CPU-intensive tasks on your Intel CPU can grind the rest of your desktop to a halt.
    No, the demonstration just was that those CPU-intensive tasks made KWin's compositing perform badly. Everything runs fine without compositing. I think the problem is that the iGPU does not have its own memory (your 5450 has its own), and if the CPU pushes quite a lot of data through the memory bus, the iGPU will suffer. Managing textures for many windows and of course rendering to a pixmap for each cursor blink will probably go slower, hence the "lag". Turn off compositing and everything is fine, remember? Even the animation of Yakuake goes fine: with compositing turned on, Yakuake uses KWin for the animation, which is not that smooth when the CPU is under extreme load; without compositing, it has to calculate it on its own - on the CPU! And that goes as smoothly as if there were no task running! So the "lag" is not a problem of bad CPU design, but of the GPU not having its own fast memory.

    BTW, I decided to go with SB because Bulldozer consumes way too much power under load. If the power consumption were a few percent above Sandy Bridge and performance 20% below, I would have bought a Bulldozer (even if I would have had to go mATX or ATX instead of mITX as I did with Sandy Bridge) - but with THAT power consumption... Even if BD performed better than Sandy Bridge, I would not buy it :/



  • Sidicas
    replied
    Originally posted by deanjo View Post
    Heck, I usually have a couple of VMs (in VMware) going while doing video encoding to H.264 while doing HD editing in another application, with numerous other applications running in the background, with no "lag".
    Are you running a Sandy Bridge chip? You know Intel doesn't put VT-x or other virtualization accelerations into their desktop CPUs anymore? So it's pretty much a given that AMD CPUs would be better for virtualization compared to the Intel equivalents.



  • Sidicas
    replied
    The Bulldozer chips do very well against the Intel chips in integer performance... You certainly get your money's worth there. Things like compiling apps and general desktop multitasking are helped a lot by integer performance, and it's why Intel tries to cram Hyperthreading into their CPUs wherever they can (e.g. LGA-2011). You can see that in the Dhrystone benchmarks done by every review site. The Bulldozer FX-8150 chips are no more than 10% slower in integer performance than a 2600K, and the 2600K is 18% more expensive ($50).


    What most review sites show Bulldozer lacking in is its floating point performance. Bulldozer is a bit weak in floating point because of the shared FPUs. To make up for that shortcoming, AMD built these Bulldozer chips to support FMA4 instructions, which aren't used until apps are compiled with them enabled. Benchmarks that show Phenom II anywhere near the performance of Bulldozer are running apps that haven't been compiled with FMA4, it's as simple as that. When FMA4 is compiled into the binary, floating point on Bulldozer goes up a solid 30% across the board, leaving Phenom II CPUs a long way behind. You're not always going to see that 30% comparing the compiled binaries of Open64 v5 to Open64 v4, but it shows up when you compare Open64 v4 to -O2 GCC or Open64 v5 to -O2 GCC for floating point apps. In Open64 v4, POV-Ray jumped up a solid 30%, and in Open64 v5 you can see some other floating point apps that didn't jump up 30% in Open64 v4 get their 30% boost instead.
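
For readers who haven't met FMA before: a fused multiply-add computes a*b+c as one operation with a single rounding at the end, instead of a multiply (rounded) followed by an add (rounded again). The sketch below only illustrates that rounding difference in plain Python. It emulates the fused result with exact fractions, uses no FMA4 hardware, and says nothing about speed. The function names are made up for illustration.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the add."""
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b+c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# The exact product here is 1 - 2**-60, which a double cannot hold,
# so the separate multiply rounds it to 1.0 and the small term is lost.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(mul_then_add(a, b, c))   # product already rounded to 1.0
print(fused_mul_add(a, b, c))  # keeps the -2**-60 term
```

On real hardware the gain the post is talking about comes from issuing one instruction where two were needed before. Compilers only emit it when told the target supports it, for example via GCC's `-mfma4` option.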

    If people want to pay 18% more money for <10% more performance, then that's up to them. Intel has been targeting the enthusiast market for a long time and they continue to do so. With AMD, you continue to get more bang for your buck, as has been true for many years. You might have to jump through a hoop or two to get that floating point performance on these Bulldozer chips up (recompiling floating point heavy apps with FMA4), but really there's not much there to argue about. Especially considering that, with OpenCL, floating point apps should be pushing their floating point work to the GPU, as it's over 1000x faster at it. Even the Fusion integrated GPUs (e.g. Radeon 6550D) are dozens of times faster at floating point than the fastest $999 Intel CPUs are. The days of doing floating point on the CPU are coming to an end. A modern GPU has got several hundred shader "cores" that can all process floating point calculations in parallel; there's no reason to run them on a 4 or even 8 core CPU.
    Last edited by Sidicas; 11-26-2011, 09:37 PM.
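
The bang-for-buck claim in the post above can be checked with its own two numbers (10% slower, 18% more expensive). A minimal sketch follows; the function name is made up for illustration, and the inputs are the post's figures taken at face value, not measurements.

```python
def perf_per_dollar_ratio(relative_perf: float, price_premium: float) -> float:
    """Performance per dollar of the cheaper chip relative to the pricier one.

    relative_perf: cheaper chip's performance as a fraction of the pricier
                   chip's (0.90 means 10% slower).
    price_premium: how much more the pricier chip costs (0.18 means 18% more).
    """
    return relative_perf * (1.0 + price_premium)


ratio = perf_per_dollar_ratio(0.90, 0.18)
break_even = 1.0 / (1.0 + 0.18)  # relative performance at which the two tie

print(f"perf per dollar advantage: {ratio:.3f}x")
print(f"break-even relative performance: {break_even:.3f}")
```

With those inputs the cheaper chip comes out roughly 6% ahead per dollar, and it would have to fall to about 85% of the pricier chip's performance before the two tie. Platform costs and power are left out of this.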



  • leeenux
    replied
    Originally posted by deanjo View Post
    No, it is not a "sweeping generalization", it is just plain fact. Your workload is nothing spectacular and shouldn't show "lag" on any recent processor unless you have a real configuration problem. Heck, I usually have a couple of VMs (in VMware) going while doing video encoding to H.264 while doing HD editing in another application, with numerous other applications running in the background, with no "lag".
    Your conclusion of Bulldozer is squarely at odds with the review done on this very website.

    This is starting to make a lot of sense:

    http://www.bbc.co.uk/news/technology-15869683



  • deanjo
    replied
    Originally posted by leeenux View Post
    Wow, what a sweeping generalization... and a very misleading one at that.

    How about something like:

    "Unless you're building a PC to run the Cinebench single threaded benchmark, you're better off with a Core 2 Duo."

    Bulldozer did have some regressions, mostly in single threaded benchmarks. However, it's also faster than the Phenom II X6 in many single threaded benchmarks, and almost universally faster in well threaded benchmarks.

    I'm posting from an FX8120, and it feels faster than any Sandy Bridge, Nehalem or Phenom II I've ever used. I have the following windows open:

    Eclipse(EPIC-Perl)
    Netbeans(PHP)
    Firefox
    A Virtualbox VM running an Apache/PHP/Postgresql test server
    A Virtualbox VM running a SVN server
    PGAdmin3
    Several terminals
    Gedit
    ...and a few more random windows

    ...and never a hint of lag, despite running 2 craptastic Java-based IDEs at the same time. I can even do something CPU intensive like creating a TrueCrypt volume or compiling the Linux kernel, and still no slowdown whatsoever. I hate to break it to you, but a quad core Sandy Bridge cannot do all of those things and still be perfectly responsive, especially if you're using its IGP.
    No, it is not a "sweeping generalization", it is just plain fact. Your workload is nothing spectacular and shouldn't show "lag" on any recent processor unless you have a real configuration problem. Heck, I usually have a couple of VMs (in VMware) going while doing video encoding to H.264 while doing HD editing in another application, with numerous other applications running in the background, with no "lag".

