Greater Radeon Gallium3D Shader Optimization Tests


  • AnonymousCoward
    replied
    Originally posted by smitty3268 View Post
    So, just to summarize:

    r600g default = 21.5fps.
    r600g + optional optimizations in mesa master using environment variable + recompiling kernel to remove PM limitations = 64.5 fps.
    windows default = 73.5 fps.

    That's not bad at all. We just need to get some of these optional/experimental things into shape and enabled by default now.
    Yeah, pretty much. I'd need to run a few more benchmarks on both Windows 8 and Linux and make sure the settings were exactly the same, but I don't expect there'd be much variation other than a couple of fps here and there.

    This is true for Source games, though. Serious Sam 3, for instance, ran awfully and looked graphically off, and Brutal Legend tended to lock up after a bit of play. This could be true on Windows as well, though; I haven't tested yet.

    I will say that performance was likely superior to FGLRX, based on what I remember of the performance I was getting when it was working. For example, loading up a save of Last Remnant running under Wine gives me 22 FPS when using R600 SB + high profile, whereas I was getting something like 11-14 FPS with FGLRX. Portal runs about the same as Lost Coast, mostly hitting slightly above 60 fps, whereas it was more like 30-40 with FGLRX, even when using the discrete card (which ran even worse than the integrated card, BTW).

    Overall, at least in a best-case scenario, the open source driver looks very good. It just needs better power management and support for some more extensions, I think.


  • smitty3268
    replied
    Originally posted by AnonymousCoward View Post
    EDIT: Just tried and it's a definite improvement:
    So, just to summarize:

    r600g default = 21.5fps.
    r600g + optional optimizations in mesa master using environment variable + recompiling kernel to remove PM limitations = 64.5 fps.
    windows default = 73.5 fps.

    That's not bad at all. We just need to get some of these optional/experimental things into shape and enabled by default now.
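
To put the three figures in the summary above side by side, here is a small Python sketch that works out the ratios. The constant and function names are made up for illustration; only the fps values come from the thread.

```python
# Frame rates quoted in the summary above (frames per second).
R600G_DEFAULT_FPS = 21.5   # r600g with default settings
R600G_TUNED_FPS = 64.5     # optional optimizations + kernel with PM limit removed
WINDOWS_FPS = 73.5         # Windows default


def speedup(new_fps: float, old_fps: float) -> float:
    """Return how many times faster new_fps is compared to old_fps."""
    return new_fps / old_fps


def percent_of(fps: float, reference_fps: float) -> float:
    """Return fps as a percentage of reference_fps, rounded to one decimal."""
    return round(100.0 * fps / reference_fps, 1)


if __name__ == "__main__":
    print("tuned vs default r600g:", speedup(R600G_TUNED_FPS, R600G_DEFAULT_FPS), "x")
    print("tuned r600g vs Windows:", percent_of(R600G_TUNED_FPS, WINDOWS_FPS), "%")
    print("default r600g vs Windows:", percent_of(R600G_DEFAULT_FPS, WINDOWS_FPS), "%")
```

By these numbers the tuned setup is about three times faster than the default one and reaches roughly 88% of the Windows result.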


  • brent
    replied
    Originally posted by AnonymousCoward View Post
    Cool, I'll try it out. What does it do? And thanks for the troubleshooting help in general, all of you.
    If a request to change the clock exceeds the so-called "default clock", it is simply set to the default clock instead, which is quite low on APUs. This limit is probably in place to make sure the thermal specification is never exceeded, even without active thermal management. Desktop GPUs usually have the highest clock as the default clock, so there is no such problem with them.

    In my experience, power consumption and heat only increase slightly with this hack, and if the device has suitable cooling, there won't be a problem. AFAIK, in theory the APU's design requires active thermal management though, i.e. the driver has to monitor the temperature and reduce the clock if it gets too hot. Since that hasn't been implemented yet, the driver stays on the safe side and doesn't allow high clocks.

    I'd *love* a driver option in the vanilla kernel that allows users to use the full clock range at their own risk.
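
The limit described in this post can be sketched roughly as follows. This is a Python illustration of the behaviour as brent explains it, not the kernel code itself; the function names are hypothetical, and the clock values are the ones AnonymousCoward reports elsewhere in the thread.

```python
def clamp_to_default(requested_khz: int, default_khz: int) -> int:
    """Behaviour described in the post: a request above the default clock
    is replaced by the default clock; lower requests pass through."""
    if requested_khz > default_khz:
        return default_khz
    return requested_khz


def without_limit(requested_khz: int, default_khz: int) -> int:
    """What the hack amounts to: with the check removed, the request is
    used as-is, so nothing guards against exceeding the default clock."""
    return requested_khz


if __name__ == "__main__":
    default_khz = 200000    # "default engine clock" reported in the thread
    requested_khz = 685710  # highest "current engine clock" reported in the thread
    print(clamp_to_default(requested_khz, default_khz))
    print(without_limit(requested_khz, default_khz))
```

With the limit in place, any request above 200000 kHz ends up at 200000 kHz, which matches the low performance seen before the if-block was removed.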


  • AnonymousCoward
    replied
    My mistake, "High" profile appears to work with that patch, but dynpm doesn't work at all.


  • AnonymousCoward
    replied
    Originally posted by brent View Post
    Mobile APUs are definitely clock-limited, no question about it. You have to modify the kernel to work around it. Delete this if-block.
    Cool, I'll try it out. What does it do? And thanks for the troubleshooting help in general, all of you.

    EDIT: Just tried and it's a definite improvement:

    R600 SB = 41.51 FPS

    However, the frequency is higher but doesn't appear to be the maximum:

    default engine clock: 200000 kHz
    current engine clock: 334880 kHz
    default memory clock: 800000 kHz

    Or can I not really trust the specific reading?

    EDIT: Using dynpm again appears to switch to the maximum frequency:

    default engine clock: 200000 kHz
    current engine clock: 685710 kHz
    default memory clock: 800000 kHz

    R600 SB = 64.59 FPS

    Now we are talking, comes close to Windows 8 now!

    Thanks a lot everyone, hopefully we'll get true power management eventually but this will do me for gaming.
    Last edited by AnonymousCoward; 18 May 2013, 10:07 AM.
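
The clock readings pasted in this post are in kHz. A minimal Python sketch for turning such a dump into MHz values is shown below. It assumes the text has exactly the "name: value kHz" shape quoted above; the function name and the sample constant are made up for illustration.

```python
import re
from typing import Dict

# Matches lines such as "current engine clock: 685710 kHz".
LINE_RE = re.compile(r"^\s*(?P<name>[a-z ]+?):\s*(?P<khz>\d+)\s*kHz\s*$")


def parse_clocks(text: str) -> Dict[str, float]:
    """Return a mapping from reading name to its value in MHz.

    Lines that do not match the assumed format are skipped.
    """
    clocks: Dict[str, float] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            clocks[match.group("name")] = int(match.group("khz")) / 1000.0
    return clocks


# Readings copied from the post above (after switching to dynpm).
SAMPLE = """\
default engine clock: 200000 kHz
current engine clock: 685710 kHz
default memory clock: 800000 kHz
"""

if __name__ == "__main__":
    for name, mhz in parse_clocks(SAMPLE).items():
        print(f"{name}: {mhz:.2f} MHz")
```

Read this way, the engine goes from its 200 MHz default to about 686 MHz with dynpm, compared with about 335 MHz in the earlier reading.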


  • agd5f
    replied
    Originally posted by AnonymousCoward View Post
    One more bit of info: Steam is detecting only 268.44 MB of VRAM, when in theory it should be seeing 512 MB. This could account for some of the performance issue, especially since I run games at 1920x1080. This thread discusses the issue: http://steamcommunity.com/app/221410...8532588748333/

    EDIT: Never mind, it's probably just an issue with the way Steam detects available RAM, according to that thread.
    OpenGL does not provide a standard way for apps to query the amount of memory available. There have been several proposals, but nothing has come of it so far. Apps end up having to do their own hacks to guess how much memory is available.


  • agd5f
    replied
    Originally posted by AnonymousCoward View Post
    However, "rovclock -i" shows lots of different frequencies when I run it multiple times, up to about 600+ MHz (close to its maximum frequency), so I'm not sure if radeon_pm_info is not detecting the changed frequency, or rovclock is wrong:
    Don't use rovclock. It only supported early Radeons (r1xx-r3xx), and even then it didn't properly handle all the PLL dividers. On newer Radeons it's just reading garbage.


  • brent
    replied
    Mobile APUs are definitely clock-limited, no question about it. You have to modify the kernel to work around it. Delete this if-block.


  • AnonymousCoward
    replied
    One more bit of info: Steam is detecting only 268.44 MB of VRAM, when in theory it should be seeing 512 MB. This could account for some of the performance issue, especially since I run games at 1920x1080. This thread discusses the issue: http://steamcommunity.com/app/221410...8532588748333/

    EDIT: Never mind, it's probably just an issue with the way Steam detects available RAM, according to that thread.
    Last edited by AnonymousCoward; 18 May 2013, 05:31 AM.


  • scxe
    replied
    AnonymousCoward here: I think that this is the issue; the performance difference certainly makes sense. I'm not sure why it didn't occur to me, actually, as I remember nouveau having the same issue. I tried to force the "Low" profile to test, and it doesn't seem to lower performance. So either the driver is severely bottlenecking the card, or power management is broken. Here's the radeontop output in case that is helpful. I suspect there would be high CPU load rather than GPU load if the driver were bottlenecking the card (CPU load is relatively low):

    Code:
    radeontop v0.6-4-g244c88e, running on ARUBA, 120 samples/sec
                                        |
                  Graphics pipe 100.00% |
    ------------------------------------+
                   Event Engine   0.00% |
                                        |
    Vertex Grouper + Tesselator  55.00% |
                                        |
              Texture Addresser  90.00% |
                                        |
                  Shader Export  96.67% |
    Sequencer Instruction Cache  94.17% |
            Shader Interpolator 100.00% |
                                        |
                 Scan Converter  99.17% |
             Primitive Assembly  56.67% |
                                        |
                    Depth Block  98.33% |
                    Color Block  94.17% |
