Greater Radeon Gallium3D Shader Optimization Tests

  • AnonymousCoward
    replied
    Originally posted by smitty3268 View Post
    The other thing to check is what the final clock rate is when in the high profile. I think some of the AMD integrated GPUs are still limited to slower speeds until they get better PM, although I'm not sure whether that includes your hardware or not.
    Yes, this might be it. /sys/kernel/debug/dri/0/radeon_pm_info shows this when 'High' profile is enabled:

    Code:
    default engine clock: 200000 kHz
    current engine clock: 200000 kHz
    default memory clock: 800000 kHz
    However, "rovclock -i" shows lots of different frequencies when I run it multiple times, up to about 600+ mhz (close to its maximum frequency) so I'm not sure if radeon_pm_info is not detecting the changed frequency, or rovclock is wrong:

    Code:
    Video BIOS signature not found.
    Invalid reference clock from BIOS: 387.96 MHz
    
    Video BIOS signature not found.
    Invalid reference clock from BIOS: 618.57 MHz
    Perhaps it's just random noise, as it doesn't even seem to be associated with GPU load. Is there a better utility/sysfs entry to look at?
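    In the meantime, the simplest thing I can do is probably just to poll radeon_pm_info while the benchmark runs (a minimal sketch, assuming debugfs is mounted at the usual path and card 0 is the APU; needs root):

    Code:
    # Re-read the radeon power-management info once a second while the game runs:
    watch -n 1 cat /sys/kernel/debug/dri/0/radeon_pm_info
    If the current engine clock there never moves off 200 MHz even under load, then presumably it really is the kernel PM capping things rather than rovclock misreading.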



  • smitty3268
    replied
    Originally posted by AnonymousCoward View Post
    As for GPU profiles, I used DynPM and previous benchmarks did not differ from the "High" profile, but good point; I'll retest to be certain.
    The other thing to check is what the final clock rate is when in the high profile. I think some of the AMD integrated GPUs are still limited to slower speeds until they get better PM, although I'm not sure whether that includes your hardware or not.



  • AnonymousCoward
    replied
    Originally posted by brosis View Post
    Sorry, but if you can still repeat the tests, could you try forcing the CPU governor to "performance" instead of "ondemand", because of this? I also hope you set the GPU to profile/high before testing. A lot of nuances are still not ironed out, as you can see... We seriously need power-management logic that does this by itself.
    I used performance, but to be fair, ondemand hasn't shown any performance drop on my particular system. I think that's because the HL2 games don't actually use that much CPU, or because ondemand actually works properly on my system for whatever reason.

    As for GPU profiles, I used DynPM and previous benchmarks did not differ from the "High" profile, but good point; I'll retest to be certain.

    I did run a quick Windows 8 benchmark with SteamPipe, and performance was around 73/74 FPS. So Linux compares a bit better when both systems use the new engine.

    I suspect it's as you say and software fallbacks are being used. I wouldn't be surprised if the game is using some extensions that are just not supported by the open-source ATI drivers, and perhaps not by FGLRX either (i.e. currently NVIDIA-specific extensions).

    I may run some more benchmarks, but I'm going to be quite busy from tomorrow until next Saturday, so I might not have any time after today. Anyone who owns the Orange Box can run this bench, BTW, if you wish to check for yourself. It's in the Half-Life complete collection as well, although that currently looks pretty expensive (40 USD).

    EDIT: Just ran with the performance governor and the "High" GPU profile, and there was no change (HDR and Motion Blur enabled):

    R600 SB = 26.19 FPS

    I checked the command-line output, and it reports a number of extensions that aren't supported, including a summary at the end of what are presumably the key extensions needed by the game:

    Code:
    GL_NV_bindless_texture: DISABLED
    GL_AMD_pinned_memory: DISABLED
    GL_EXT_texture_sRGB_decode: AVAILABLE
    GL_NVX_gpu_memory_info: UNAVAILABLE
    GL_ATI_meminfo: UNAVAILABLE
    GL_MAX_SAMPLES_EXT: 8

    I seem to remember 'pinned memory' being a performance win (but also a stability issue) with FGLRX when enabled, but I believe it does increase performance when working correctly.
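    As a sanity check (assuming glxinfo from mesa-utils is installed), I can also confirm outside the game which of those extensions the driver actually exposes:

    Code:
    # Show which of the extensions from the game's summary Mesa actually exposes;
    # anything absent here is unsupported by the driver, not just disabled by the game.
    glxinfo | grep -E "bindless_texture|pinned_memory|texture_sRGB_decode|gpu_memory_info|ATI_meminfo"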
    Last edited by AnonymousCoward; 17 May 2013, 10:56 PM.



  • brosis
    replied
    Originally posted by AnonymousCoward View Post
    Alright, last post for now. You can use the newer engine in Windows by opting into the SteamPipe Beta. A quick bench with Wine showed that performance was about the same as Linux, so that was probably the difference. Haven't tested on native Windows, but I suspect it will still be a lot faster.
    Sorry, but if you can still repeat the tests, could you try forcing the CPU governor to "performance" instead of "ondemand", because of this? I also hope you set the GPU to profile/high before testing. A lot of nuances are still not ironed out, as you can see... We seriously need power-management logic that does this by itself.
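    In case it helps, forcing both from a root shell is just a couple of sysfs writes (a sketch assuming the usual cpufreq and radeon sysfs paths; cpupower/cpufreq-set would do the CPU part too):

    Code:
    # Pin every core to the "performance" governor:
    for c in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$c"
    done
    # Pin the radeon GPU to the static high profile instead of dynpm:
    echo profile > /sys/class/drm/card0/device/power_method
    echo high > /sys/class/drm/card0/device/power_profile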



  • brosis
    replied
    Originally posted by AnonymousCoward View Post
    Alright, last post for now. You can use the newer engine in Windows by opting into the SteamPipe Beta. A quick bench with Wine showed that performance was about the same as Linux, so that was probably the difference. Haven't tested on native Windows, but I suspect it will still be a lot faster.
    You might be bumping into something that is using non-accelerated functionality.
    I'd start with tests such as Q3 or the first Half-Life (the original clients for Linux and Windows).
    I am also sure Valve could answer a few questions...



  • AnonymousCoward
    replied
    Alright, last post for now. You can use the newer engine in Windows by opting into the SteamPipe Beta. A quick bench with Wine showed that performance was about the same as Linux, so that was probably the difference. Haven't tested on native Windows, but I suspect it will still be a lot faster.
    Last edited by AnonymousCoward; 17 May 2013, 11:22 AM.



  • AnonymousCoward
    replied
    Originally posted by AnonymousCoward View Post
    No, I'm running the native port. I may well test the Wine version to compare, though.
    Just tested with Wine. The default GLSL backend (in Wine) was crashing and had broken water output, so I used the ARB backend (which also tends to be faster):

    Wine ARB + SB = 32.24 FPS = 49% improvement

    So it's faster again, with only very minor graphical glitches from what I can tell. Keep in mind there may be engine differences as well. Perhaps the Linux version is using a newer version of the HL2 engine, although graphically they don't look dramatically different.

    EDIT: Upon closer investigation, I do think that the Linux version is using a newer engine. For one, I believe it is actually using HDR, whereas the older engine just reports it but doesn't use it.

    So I disabled HDR, Bloom and motion blur, and ran it again:

    Native Linux SB = 30.66 FPS

    That might not be enough to make them comparable, however. I think we'll have to wait for the new engine to make it into the Windows version before the two are directly comparable. But yeah, the engine differences obviously limit these results. I probably should have checked more closely in the first place.
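    For anyone who wants to reproduce the ARB-backend run: switching Wine's Direct3D layer away from GLSL should just be one registry value (assuming a Wine build from around now, where the UseGLSL toggle is still honoured):

    Code:
    # Tell Wine to use the ARB shader backend instead of GLSL for Direct3D:
    wine reg add "HKCU\Software\Wine\Direct3D" /v UseGLSL /d disabled /f
    # Set it back to "enabled" (the default) to return to the GLSL backend.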
    Last edited by AnonymousCoward; 17 May 2013, 10:49 AM.



  • AnonymousCoward
    replied
    Originally posted by vljn View Post
    Do you use Wine to run HL2: Lost Coast? (HL2 is ported to Linux, but I'm not sure about Lost Coast.)
    No, I'm running the native port. I may well test the Wine version to compare, though.



  • vljn
    replied
    Do you use Wine to run HL2: Lost Coast? (HL2 is ported to Linux, but I'm not sure about Lost Coast.)



  • brosis
    replied
    Originally posted by AnonymousCoward View Post
    I've been running the video stress test with Half-Life 2: Lost Coast and am getting some improvement with the SB backend, using my A10-4600M APU. These are averages of 3 runs for each configuration to minimise variance:

    Default backend: 21.62
    ...
    SB backend: 26.31

    Windows 8: 88.52
    I suspect some of the functionality used is still falling back to the CPU. It would surely be nice if we could trace it :/
    Maybe it would be a good idea to test an array of applications (depending on GL level) on both platforms and report the deficiencies found.
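    One possible starting point for tracing (not something I've verified against this particular game, and the binary name below is only a placeholder) would be apitrace, which records the full GL call stream the game issues:

    Code:
    # Record every GL call into a .trace file (substitute the real binary/launch
    # command that Steam actually runs for the placeholder name):
    apitrace trace ./hl2_linux
    # Then inspect the recorded call stream:
    apitrace dump hl2_linux.trace | less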

