
Thread: Anyone with HD5870 or HD5850 using recent opensource driver and kernel?

  1. #1
    Join Date
    Apr 2010
    Posts
    1,946

    Default Anyone with HD5870 or HD5850 using recent opensource driver and kernel?

    I was wondering what the current realistic performance of this card is with the latest stack?
    Lightsmark and Xonotic are of most relevance for me.. 1600, 1920 or similar; high or ultra.

    I couldn't find any information on OpenBenchmarking or via Google. The only thing that comes close is this

    Anyone?

  2. #2
    Join Date
    Aug 2012
    Posts
    315

    Default

    Quote Originally Posted by crazycheese View Post
    I was wondering what the current realistic performance of this card is with the latest stack?
    Lightsmark and Xonotic are of most relevance for me.. 1600, 1920 or similar; high or ultra.

    I couldn't find any information on OpenBenchmarking or via Google. The only thing that comes close is this

    Anyone?
    To my knowledge the HD6970 series has the best open-source performance on the AMD side right now.
    That's because of the shader compiler: the HD6970 has the most advanced and most "easy to program" VLIW shader core.
    It's still a complete joke next to the HD7000 series technically, but the HD7000 series is not ready to use right now.

    In my benchmark research the HD4000 series is one of the worst performers and the HD6970 is the best right now.
    People with much weaker hardware than your old HD4770 get much better benchmark results with the 6000 series. It's just 3-4 times faster on similar hardware specifications. The performance difference is "criminal"; they just betrayed their HD4000 customers.

    I personally would not waste my money on a dead horse like VLIW ... you'll regret it, because AMD will make sure you regret it.

  3. #3
    Join Date
    Apr 2010
    Posts
    1,946

    Default

    Yes, I know, but I need some real-life FPS numbers. Not much has changed since the 5870.
    GCN support is in heavy development right now, unusable, true... still, extrapolating from how long Evergreen took from launch to its current support level, it will take at least 1.5 years. A GCN card bought right now will already be outdated in 1.5 years. This is why, for open source, GCN is excluded from the list.

    Also, Evergreen still has, hardware-wise, good energy consumption (which would become even better if they finally switched to dynamic power profiles), good 3D performance, and the ability to drive multiple displays, which works; and some OpenCL work is going on.

    Yes, this is exactly like 3 years ago, when AMD consumers were forced to purchase old cards if they wanted open source, but at least the hardware has stopped sucking now.

    A possible candidate is the 5870 Eyefinity edition with 2 GiB of VRAM.

  4. #4
    Join Date
    Aug 2012
    Posts
    315

    Default

    Quote Originally Posted by crazycheese View Post
    Yes, I know, but I need some real-life FPS numbers. Not much has changed since the 5870.
    GCN support is in heavy development right now, unusable, true... still, extrapolating from how long Evergreen took from launch to its current support level, it will take at least 1.5 years. A GCN card bought right now will already be outdated in 1.5 years. This is why, for open source, GCN is excluded from the list.

    Also, Evergreen still has, hardware-wise, good energy consumption (which would become even better if they finally switched to dynamic power profiles), good 3D performance, and the ability to drive multiple displays, which works; and some OpenCL work is going on.

    Yes, this is exactly like 3 years ago, when AMD consumers were forced to purchase old cards if they wanted open source, but at least the hardware has stopped sucking now.

    A possible candidate is the 5870 Eyefinity edition with 2 GiB of VRAM.
    "Not much has changed since the 5870."

    That's wrong. The 5870 has 1 big (complex) and 4 little (simple) VLIW shader units per cluster.
    A 6970 has 4 big complex shader units, without the little simple ones.
    In reality the open driver just uses the 1 big shader unit and ignores the little simple shaders.
    This means the HD6970 is easier to write a compiler for, because you don't have to care about this.
    Also, utilization per shader cluster is higher with 4D VLIW than with 5D VLIW; the average is ~3.5.
    This means you lose 5 - 3.5 = 1.5 slots with 5D VLIW and only 0.5 with 4D VLIW.

    In other words, it's foolish to think an HD5870 is an option instead of an HD6000 card.

    If a complex shader hits your pipe, you only get 1/5 of the performance with the 5D VLIW architecture; 4D VLIW doesn't have that problem, because all 4 units handle complex shaders.

    The Catalyst driver has some dirty shader-compiler tricks: it cheats with per-app shader replacement, swapping complex shaders for simple ones. That's why Catalyst is so much faster.

    But even with Catalyst, 4D VLIW wins because of the average utilization of 3.5... which means 5D VLIW is worth nothing.
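
    The slot-loss arithmetic above can be sketched in a few lines. These are the thread's illustrative averages (~3.5 ops packed per instruction), not measured data:

```python
# Toy slot-utilization arithmetic for VLIW5 (HD5870-style) vs VLIW4
# (HD6970-style). The ~3.5 average comes from this thread's claim,
# not from a benchmark.

def wasted_slots(slots_per_unit: int, avg_ops_packed: float) -> float:
    """ALU slots left idle per issued VLIW instruction."""
    return slots_per_unit - avg_ops_packed

vliw5_waste = wasted_slots(5, 3.5)  # 4 simple + 1 complex slot
vliw4_waste = wasted_slots(4, 3.5)  # 4 identical slots

print(vliw5_waste)  # 1.5 slots idle on average
print(vliw4_waste)  # 0.5 slots idle on average
```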

  5. #5
    Join Date
    Apr 2010
    Posts
    1,946

    Default

    Hmm, thanks!

    Maybe this is the cause of the open-source radeon driver being slow? I remember someone (Marek?) saying that "we don't have an efficient shader compiler"...

    So the open-source driver simply uses 1 of the 5 units, and the simpler ones are utilized randomly?

    If Catalyst can break, or cheat, the complex shaders down into simpler ones, it will approach a 3.5-4 (out of 5) load, compared to a 1-2.5 (out of 5) load (performance) on open source. Wild claim here..

    To support (or refute) this, one should extensively test 4D (0/4) hardware with open source and Catalyst, and compare it to good 5D (4/1) VLIWs.

    Test any HD5xxx or HD64xx-68xx vs. an HD69xx (only three cards available, all high-end). If the performance of open source and Catalyst is much closer together on the HD69xx, then you are correct...
    Last edited by crazycheese; 09-15-2012 at 01:31 PM.

  6. #6
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    5,332

    Default

    Indeed until recently, I believe the open driver only scheduled one of the five units. The LLVM VLIW packetizer should be enabled in current git, so it should now use more units, but it's likely not close to Catalyst-level efficiency.

  7. #7
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,570

    Default

    Quote Originally Posted by curaga View Post
    Indeed until recently, I believe the open driver only scheduled one of the five units.
    IIRC the current (non-LLVM) shader compiler always used one ALU for each component the TGSI instruction was working on. For common vertex and fragment/pixel operations this usually meant 4 ALUs used in each instruction, but more complex shaders included relatively more single-component operations or functions which only the single T unit could handle (integer ops, transcendentals, etc.), and so the average ALU utilization went down.

    A more capable compiler could help with the first case by packing multiple single-component operations into a single instruction, but couldn't do much about the second case where the T unit was required. That said, the second case didn't happen too much for graphics -- it was mostly compute workloads that justified moving from 4 simple + 1 special to 4 identical ALUs.
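
    The "first case" packing idea can be sketched as a toy greedy bundler. All names here are invented for illustration; this is not the actual r600 or LLVM scheduler:

```python
# Toy greedy VLIW5 packer: each instruction bundle has 4 general ALU
# slots plus 1 trans (T) slot. Ops flagged as needing the T unit
# (transcendentals, integer ops on this hardware) only fit the T slot.
# Invented illustration, not the real r600/LLVM code.

from dataclasses import dataclass

@dataclass
class Op:
    name: str
    width: int                 # components (general slots) it needs
    needs_t_unit: bool = False

def pack_vliw5(ops):
    """Greedily bundle ops into (general_ops, t_op) instructions."""
    instructions = []
    general, t_op, free = [], None, 4
    for op in ops:
        if op.needs_t_unit:
            if t_op is not None:          # T slot taken: start new bundle
                instructions.append((general, t_op))
                general, t_op, free = [], None, 4
            t_op = op
        else:
            if op.width > free:           # no general slots left: new bundle
                instructions.append((general, t_op))
                general, t_op, free = [], None, 4
            general.append(op)
            free -= op.width
    if general or t_op:
        instructions.append((general, t_op))
    return instructions

# A vec4 op fills one bundle; the two scalar ops and the T-unit op
# then share the next bundle instead of each costing an instruction.
bundles = pack_vliw5([
    Op("mad.xyzw", 4),
    Op("add.x", 1),
    Op("mul.y", 1),
    Op("rsq.x", 1, needs_t_unit=True),
])
print(len(bundles))  # 2 bundles instead of 4 single-op instructions
```

    A naive one-op-per-instruction compiler would emit 4 instructions here; packing cuts it to 2, which is exactly the utilization gap being discussed.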

    Q's point about the Cayman shader core being able to execute 4 complex operations in a single instruction is correct in principle, however I don't believe the current compiler is able to pack multiple operations into a single instruction. Not sure if the llvm compiler paths are able to do that yet either.

    When we looked a couple of years ago the average open source compiler utilization was a bit under 3 while the proprietary shader compiler was a bit under 4. With the trend to more complex shaders I imagine both numbers have gone down a bit further since then.
    Last edited by bridgman; 09-15-2012 at 03:29 PM.

  8. #8
    Join Date
    Aug 2012
    Posts
    315

    Default

    Quote Originally Posted by crazycheese View Post
    Hmm, thanks!

    Maybe this is the cause of the open-source radeon driver being slow?
    That sentence is wrong. This is why the HD2000-HD5000 are slow; the HD6000's 4D VLIW is much faster.

    Quote Originally Posted by crazycheese View Post
    I remember someone (Marek?) saying that "we don't have an efficient shader compiler"...
    You need more than that; you need per-app shader replacement to fix it.
    Catalyst replaces complex shaders with simple shaders for the HD2000-HD5000 for every single app.


    Quote Originally Posted by crazycheese View Post
    So the open-source driver simply uses 1 of the 5 units, and the simpler ones are utilized randomly?
    Yes, in my experience they only use the 1 complex FULL shader unit, because they don't have per-app shader replacement technology right now. Only some apps ship "simple" shaders for AMD; then more than 1 unit can be used. But the lack of a shader compiler means that if the app uses 100 shaders and your graphics card has 700 shader units, the card only uses 100 of them, because there is no compiler to make use of all 700.

    Quote Originally Posted by crazycheese View Post
    If Catalyst can break, or cheat, the complex shaders down into simpler ones, it will approach a 3.5-4 (out of 5) load, compared to a 1-2.5 (out of 5) load (performance) on open source. Wild claim here..
    The average for Catalyst is 3.5: Catalyst uses 3.5 of 5 per group for 5D VLIW and 3.5 of 4 for 4D VLIW.
    This means 5D VLIW is useless even for Catalyst.

    Yes, in the worst case the open-source driver uses 1 of 5 for 5D VLIW, while with the future shader compiler it will use 3.5 of 4 for 4D VLIW.

    Now you see what you'd get: the HD2000-HD5000 will never see an improvement, because you need shader replacement to load the 4 simple shader units per group.

    This means for the open-source driver it's complete stupidity to buy an HD5000, because you will not get any improvement; they will not build a per-app shader replacement infrastructure.

    You will get a good result with an HD6970 once the future shader compiler lands.

    Quote Originally Posted by crazycheese View Post
    To support (or refute) this, one should extensively test 4D (0/4) hardware with open source and Catalyst, and compare it to good 5D (4/1) VLIWs.
    No need to test; these are technical facts.
    They switched from 5D to 4D, complex-only shader units, precisely because even Catalyst CANNOT HANDLE THE COMPLEXITY of shader replacement and compiling for 2 different kinds of shader units.


    Quote Originally Posted by crazycheese View Post
    Test any HD5xxx or HD64xx-68xx vs. an HD69xx (only three cards available, all high-end). If the performance of open source and Catalyst is much closer together on the HD69xx, then you are correct...
    No, there are low-end 4D VLIW parts... the second generation of APUs are all 4D VLIW: all FM2-based systems and notebooks.

    An example of low-end 4D VLIW: the AMD A6-3420M with an AMD Radeon HD 7470M.

  9. #9
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,570

    Default

    Quote Originally Posted by necro-lover View Post
    Catalyst replaces complex shaders with simple shaders for the HD2000-HD5000 for every single app.
    No

    Quote Originally Posted by necro-lover View Post
    The average for Catalyst is 3.5: Catalyst uses 3.5 of 5 per group for 5D VLIW and 3.5 of 4 for 4D VLIW.
    This means 5D VLIW is useless even for Catalyst.
    No - the average number of ALUs used per instruction is a bit lower for VLIW4 than for VLIW5, since there are a number of cases where the shader compiler could pack a single component operation into the same instruction as a 4-vector operation. The point was that the utilization as a percentage was slightly better with VLIW4.

    In general VLIW5 was better for pure graphics workloads, but as compute became a larger part of GPU workload (there's a lot of compute hidden in modern graphical apps as well) then VLIW4 became a better fit.
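
    The distinction between "ALUs packed per instruction" and "utilization as a percentage" is easy to miss; a quick sketch, with hypothetical numbers chosen only to match the shape of the claim above:

```python
# Fewer ALU ops packed per instruction can still be a higher fraction
# of the machine when the machine itself has fewer slots. The packed
# figures below are hypothetical, not measured compiler output.

vliw5_packed, vliw5_slots = 3.8, 5   # slightly more ops packed...
vliw4_packed, vliw4_slots = 3.2, 4   # ...fewer ops, but fewer slots

vliw5_util = vliw5_packed / vliw5_slots
vliw4_util = vliw4_packed / vliw4_slots

print(f"VLIW5: {vliw5_util:.0%}")  # VLIW5: 76%
print(f"VLIW4: {vliw4_util:.0%}")  # VLIW4: 80%
```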
    Last edited by bridgman; 09-15-2012 at 03:36 PM.

  10. #10
    Join Date
    Aug 2012
    Posts
    315

    Default

    Quote Originally Posted by bridgman View Post
    Q's point about the Cayman shader core being able to execute 4 complex operations in a single instruction is correct in principle, however I don't believe the current compiler is able to pack multiple operations into a single instruction. Not sure if the llvm compiler paths are able to do that yet either.

    When we looked a couple of years ago the average open source compiler utilization was a bit under 3 while the proprietary shader compiler was a bit under 4. With the trend to more complex shaders I imagine both numbers have gone down a bit further since then.
    My argument was also a "future" argument: *in the future, 4D VLIW is much better than the old 5D VLIW cards*.

    I just don't want him to buy an HD5000 card, because it's technically bullshit in a modern world of more and more complex shaders.

    If he buys an HD7970, he buys 4 shader slots and uses 3-3.5, so he loses only 0.5-1; instead of buying 5 slots, using only 2-3, and losing 2-3.
